From: Spencer Baugh
Newsgroups: gmane.emacs.bugs
Subject: bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
Date: Thu, 20 Jul 2023 09:43:59 -0400
References: <837cqv41ob.fsf@gnu.org> <87mszqixhh.fsf@catern.com> <4d4f029f-f32f-13df-ffc3-3952d62d8bb3@gutov.dev>
To: Dmitry Gutov
Cc: sbaugh@catern.com, Eli Zaretskii, 64735@debbugs.gnu.org
In-Reply-To: <4d4f029f-f32f-13df-ffc3-3952d62d8bb3@gutov.dev> (Dmitry Gutov's message of "Thu, 20 Jul 2023 15:42:59 +0300")

Dmitry Gutov writes:

> On 20/07/2023 15:22, sbaugh@catern.com wrote:
>>> I'm not sure we should bother more than these two simple measures.
>> Unfortunately those two simple measures help rgrep but they don't help
>> project-find-regexp (and other project.el commands using
>> project--files-in-directory such as project-find-file), since those
>> project commands pull their ignores from the version control system
>> through vc (not grep-find-ignored-files), and then pass them to find.
>
> That's only a problem when the default file listing logic is used (and
> we usually delegate to something like 'git ls-files' instead, when the
> vc-aware backend is used).

Hm, yes, but things like C-u project-find-regexp will use the default
find-based file listing logic instead of git ls-files, as do a few other
things.

I wonder, could we just go ahead and make a vc function which is
list-files(GLOBS) and returns a list of files?  Both git and hg support
this.  Then we could have C-u project-find-regexp use that instead of
find, by taking the cross product of dirs-to-search and
file-name-patterns-to-search.  (And this would let me delete a big chunk
of my own project backend, so I'd be happy to implement it.)

Fundamentally it seems a little silly for project-ignores to ever be
used for a vc project; if the vcs gives us ignores, we can probably just
ask the vcs to list the files too, and it will have an efficient
implementation of that.

If we do that uniformly, then this find slowness would only affect
transient projects, and transient projects pull their ignores from
grep-find-ignored-files just like rgrep, so improvements will more
easily be applied to both.  (And maybe we could even get rid of
project-ignores entirely, then?)

> Anyway, some optimization could be useful there too. The extra
> difficulty, though, is that the entries in IGNORES already can come as
> wildcards. Can we merge several wildcards? Though I suppose if we use
> a regexp, we could construct an alternation anyway.
>
> Another question it would be helpful to check, is whether the
> different versions of 'find' out there work fine with -regex instead
> of -name, and don't get slowed down simply because of that
> feature. The old built-in 'find' on macOS, for example.
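
To make the list-files(GLOBS) idea above concrete, here is a minimal
sketch of the cross product of directories and file name patterns,
written in Python purely for illustration (a real implementation would
live in vc/project.el).  The function name build_ls_files_command is
hypothetical, and the sketch assumes git's ":(glob)" pathspec magic is
an acceptable way to express "this pattern, anywhere under this
directory".  It only builds the argument list and does not run git; an
hg equivalent is not shown.

```python
from itertools import product


def build_ls_files_command(dirs, globs):
    """Return an argv list for `git ls-files` limited to DIRS x GLOBS.

    Hypothetical helper: each (directory, pattern) pair becomes one
    pathspec, so the VCS does the filtering instead of `find`.
    """
    pathspecs = [
        # ":(glob)" makes "**/" match zero or more directory levels.
        f":(glob){d.rstrip('/')}/**/{g}"
        for d, g in product(dirs, globs)
    ]
    # "--" separates options from pathspecs; "-z" gives NUL-separated output.
    return ["git", "ls-files", "-z", "--", *pathspecs]


if __name__ == "__main__":
    print(build_ls_files_command(["lisp", "src"], ["*.el", "*.c"]))
```

The number of pathspecs grows as len(dirs) * len(globs), which seems
acceptable for the handful of directories and patterns a user would
type at a C-u project-find-regexp prompt.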
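
On the question of merging several wildcards into one alternation, the
following is a rough sketch of what that translation could look like.
It is Python rather than Emacs Lisp, the names glob_to_regex and
merge_ignores are made up for this example, and it handles only "*"
and "?" (bracket expressions are left out).  The result is checked
against Python's re module only; which regex dialect a given 'find'
accepts, and how fast it is, is exactly the open question in the
quoted text, so treat the output as an assumption to be verified per
'find' implementation.

```python
import re

# Characters that need a backslash to be literal in most regex dialects.
REGEX_SPECIAL = set(".^$*+?()[]{}|\\")


def glob_to_regex(glob):
    """Translate a simple shell wildcard into a regex fragment."""
    parts = []
    for ch in glob:
        if ch == "*":
            parts.append("[^/]*")  # stay within one path component
        elif ch == "?":
            parts.append("[^/]")
        elif ch in REGEX_SPECIAL:
            parts.append("\\" + ch)
        else:
            parts.append(ch)
    return "".join(parts)


def merge_ignores(globs):
    """Combine several basename wildcards into one whole-path regex."""
    alternation = "|".join(glob_to_regex(g) for g in globs)
    # -name tests the basename, but -regex tests the whole path, so the
    # alternation is anchored after the final "/".
    return ".*/(" + alternation + ")"


if __name__ == "__main__":
    print(merge_ignores(["*.o", "*~", ".#*"]))
```

One design point worth noting: because -regex matches against the
entire path while -name matches only the last component, a naive
alternation of translated wildcards without the leading ".*/" would
silently stop matching files in subdirectories.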