From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
Date: Sat, 22 Jul 2023 14:58:46 +0300
Message-ID: <83sf9g88eh.fsf@gnu.org>
References: <874jlxebz5.fsf@gmx.de> <87lef9mqio.fsf@localhost>
 <87edl1scbw.fsf@gmx.de> <87fs5hmp6i.fsf@localhost>
 <87cz0lmoxy.fsf@localhost> <83v8edzb31.fsf@gnu.org>
 <87r0p1cta3.fsf@gmx.de> <87pm4ll7ox.fsf@localhost>
 <87a5vpcmc7.fsf@gmx.de> <878rb9l1f5.fsf@localhost>
 <87zg3pb6yt.fsf@gmx.de> <83zg3p9s39.fsf@gnu.org>
 <878rb944wi.fsf@localhost> <83tttx9q4v.fsf@gnu.org>
 <87pm4lb4fr.fsf@gmx.de> <83pm4l9n0o.fsf@gnu.org>
 <87jzutb14l.fsf@gmx.de> <83mszp9kl2.fsf@gnu.org>
 <83h6pwa52z.fsf@gnu.org> <87ilaci637.fsf@catern.com>
Cc: sbaugh@janestreet.com, yantar92@posteo.net, rms@gnu.org,
 dmitry@gutov.dev, michael.albinus@gmx.de, 64735@debbugs.gnu.org
To: sbaugh@catern.com
In-Reply-To: <87ilaci637.fsf@catern.com> (sbaugh@catern.com)

> From: sbaugh@catern.com
> Date: Sat, 22 Jul 2023 10:38:37 +0000 (UTC)
> Cc: Spencer Baugh, dmitry@gutov.dev,
>  yantar92@posteo.net, michael.albinus@gmx.de, rms@gnu.org,
>  64735@debbugs.gnu.org
>
> Eli Zaretskii writes:
> > No, the first step is to use in Emacs what Find does today, because it
> > will already be a significant speedup.
>
> Why bother? directory-files-recursively is a rarely used API, as you
> have mentioned before in this thread.

Because we could then use it much more (assuming the result will be
performant enough -- this remains to be seen).

> And there is a way to speed it up which will have a performance boost
> which is unbeatable any other way: Use find instead of
> directory-files-recursively, and operate on files as they find prints
> them.

Not every command can operate on the output sequentially: some need to
see all of the output, others will need to be redesigned and
reimplemented to support such sequential mode.

Moreover, piping from Find incurs overhead: data is broken into blocks
by the pipe or PTY, reading the data can be slowed down if Emacs is
busy processing something, etc.
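
To make the point about blocks concrete, here is a minimal sketch in Python (used only for illustration, not Emacs code) of what a consumer of piped output has to do. The function name `iter_names`, the `sep` parameter, and the sample blocks are all assumptions invented for this example. The sketch reassembles complete file names from blocks whose boundaries fall in the middle of a name, then contrasts a sequential consumer with one that needs the whole output.

```python
from typing import Iterable, Iterator


def iter_names(chunks: Iterable[bytes], sep: bytes = b"\n") -> Iterator[bytes]:
    """Yield complete file names from arbitrarily split blocks of bytes."""
    pending = b""
    for chunk in chunks:
        pending += chunk
        # Everything before the last separator is complete; the remainder
        # may be the first half of a name that continues in the next block.
        *complete, pending = pending.split(sep)
        yield from complete
    if pending:
        # The output ended without a trailing separator.
        yield pending


# Blocks as a pipe might deliver them: the boundaries fall mid-name.
blocks = [b"./src/a.c\n./sr", b"c/b.c\n./lisp/fi", b"les.el\n"]

# A sequential consumer can act on each name as soon as it is complete ...
for name in iter_names(blocks):
    print(name.decode())

# ... but a consumer that needs the whole result (here, sorting) cannot
# finish until the last block has arrived.
all_names = sorted(iter_names(blocks))
```

Since file names may themselves contain newlines, a real consumer would more likely pass `sep=b"\0"` and have the producer emit NUL-terminated names (GNU find's `-print0`).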

So I think a primitive that traverses the tree and produces file names
with or without attributes, and can call some callback if needed,
still has its place.

> Since this runs the directory traversal in parallel with Emacs, it
> has a speed advantage that is impossible to match in
> directory-files-recursively.

See above: you have an optimistic view of what actually happens in the
relevant use cases.

> We can fall back to directory-files-recursively when find is not
> available.

Find is already available today on many platforms, and we are
evidently not happy enough with the results. That is the trigger for
this discussion, isn't it? We are talking about ways to improve the
performance, and I think having our own primitive that can do it is
one such way, or at least it is not clear that it cannot be such a
way.

> > Optimizing the case of a long
> > list of omissions should come later, as it is a minor optimization.
>
> This seems wrong. directory-files-recursively is rarely used, and rgrep
> is a very popular command, and this problem with find makes rgrep around
> ~10x slower by default. How in any world is that a minor optimization?
> Most Emacs users will never realize that they can speed up rgrep
> massively by setting grep-find-ignored-files to nil. Indeed, no-one
> realized that until I just pointed it out. In my experience, they just
> stop using rgrep in favor of other third-party packages like ripgrep,
> because "grep is slow".

Making grep-find-ignored-files smaller is independent of this
particular issue. If we can make it shorter, we should.
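
For readers who want a picture of the kind of primitive discussed in this message (one that traverses the tree, produces file names with or without attributes, and can call a callback), the following is a rough sketch only. It is written in Python for illustration. It is not Emacs code and not a proposed implementation. The names `walk_tree`, `with_attributes` and `ignored_dirs` are hypothetical, and handling omissions with a set lookup is an assumption of this sketch rather than something the thread settled on.

```python
import os
from typing import Callable, Iterable, Optional

# Hypothetical callback type: receives the path and, optionally, its attributes.
Callback = Callable[[str, Optional[os.stat_result]], None]


def walk_tree(root: str,
              callback: Callback,
              with_attributes: bool = False,
              ignored_dirs: Iterable[str] = ()) -> None:
    """Call CALLBACK for every non-directory entry below ROOT."""
    ignored = frozenset(ignored_dirs)  # constant-time membership test
    stack = [root]                     # explicit stack instead of recursion
    while stack:
        directory = stack.pop()
        with os.scandir(directory) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    # Skip ignored directories without descending into them.
                    if entry.name not in ignored:
                        stack.append(entry.path)
                else:
                    attrs = None
                    if with_attributes:
                        # Attributes are fetched only when the caller asks.
                        attrs = entry.stat(follow_symlinks=False)
                    callback(entry.path, attrs)
```

A caller that needs the whole result can collect it, for example with `found = []` followed by `walk_tree(root, lambda path, attrs: found.append(path), ignored_dirs=[".git"])`, while a caller that can work sequentially does its processing inside the callback.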