From: Dmitry Gutov
Newsgroups: gmane.emacs.bugs
Subject: bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
Date: Mon, 24 Jul 2023 15:55:13 +0300
Message-ID: <937c3b8e-7742-91b7-c2cf-4cadd0782f0c@gutov.dev>
In-Reply-To: <834jlttv1p.fsf@gnu.org>
References: <1fd5e3ed-e1c3-5d6e-897f-1d5d55e379fa@gutov.dev>
 <87wmyupvlw.fsf@localhost>
 <5c4d9bea-3eb9-b262-138a-4ea0cb203436@gutov.dev>
 <87tttypp2e.fsf@localhost> <87r0p030w0.fsf@yahoo.com>
 <83sf9f6wm0.fsf@gnu.org> <83sf9eub9d.fsf@gnu.org>
 <2d844a34-857d-3d59-b897-73372baac480@gutov.dev>
 <83bkg2tsu6.fsf@gnu.org>
 <83bd4246-ac41-90ec-1df3-02d0bd59ca44@gutov.dev>
 <834jlttv1p.fsf@gnu.org>
To: Eli Zaretskii
Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
 64735@debbugs.gnu.org

On 24/07/2023 14:20, Eli Zaretskii wrote:
>> Date: Sun, 23 Jul 2023 22:27:26 +0300
>> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>>  64735@debbugs.gnu.org
>> From: Dmitry Gutov
>>
>> On 23/07/2023 20:56, Eli Zaretskii wrote:
>>>> And, ideally, do all the relevant benchmarking when proposing the change.
>>> Of course. Although the benchmarks until now already show quite a
>>> variability.
>>
>> Speaking of your MS Windows results that are unflattering to 'find', it
>> might be worth it to do a more varied comparison, to determine the
>> OS-specific bottleneck.
>>
>> Off the top of my head, here are some possibilities:
>>
>> 1. 'find' itself is much slower there. There is room for improvement in
>> the port.
>
> I think it's the filesystem, not the port (which I did myself in this
> case).

But directory-files-recursively goes through the same filesystem,
doesn't it?

> But I'd welcome similar tests on other Windows systems with
> other ports of Find. Just remember to measure this particular
> benchmark, not just Find itself from the shell, as the times are very
> different (as I reported up-thread).

Concur.

>> 2. The process output handling is worse.
>
> Not sure what that means.

Emacs's ability to process the output of a process on the particular
platform.

You said:

  Btw, the Find command with pipe to some other program, like wc,
  finishes much faster, like 2 to 4 times faster than when it is run
  from find-directory-files-recursively. That's probably the slowdown
  due to communications with async subprocesses in action.

One thing to try is changing the -with-find implementation to use a
synchronous call, to compare (e.g. using 'process-file'). And repeat
these tests on GNU/Linux too.

That would help us gauge the viability of using an asynchronous process
to get the file listing. But also, if one was just looking into
reimplementing directory-files-recursively using 'find' (to create an
endpoint with swappable implementations, for example), 'process-file' is
a suitable substitute because the original is also currently synchronous.

>> 3. Something particular to the project being used for the test.
>
> I don't think I understand this one.

This described the possibility where the disparity between the
implementations' runtimes was due to something unusual in the project
structure, if you tested different projects between Windows and
GNU/Linux, making direct comparison less useful. It's the least likely
cause, but still sometimes a possibility.

>> To look into the possibility #1, you can try running the same command in
>> the terminal with the output to NUL and comparing the runtime to what's
>> reported in the benchmark.
>
> Output to the null device is a bad idea, as (AFAIR) Find is clever
> enough to detect that and do nothing. I run "find | wc" instead, and
> already reported that it is much faster.

Now I see it, thanks.

>> I actually remember, from my time on MS Windows about 10 years ago, that
>> some older ports of 'find' and/or 'grep' did have performance problems,
>> but IIRC ezwinports contained the improved versions.
>
> The ezwinports is the version I'm using here. But maybe someone came
> up with a better one: after all, I did my port many years ago (because
> the native ports available back then were abysmally slow).

We should also look at the exact numbers. If you say that the "| wc"
invocation is 2-4x faster than what's reported in the benchmark, then it
takes about 2-4 seconds, which is still oddly slower than your reported
numbers for directory-files-recursively.
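
The shell-side measurement discussed above (running 'find' with its
output piped to 'wc' rather than to the null device) could be sketched
roughly as follows. This is only an illustration under assumptions: a
POSIX-like shell, a 'date' that supports +%s, and a plain 'find DIR'
standing in for the actual command with ignores used in the benchmark,
which is not reproduced in this message. The function names
count_entries and time_find_wc are made up for the example.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Emacs or of the benchmark in this thread.

# Print how many entries 'find' reports under directory $1.
# The count includes $1 itself; file names containing newlines
# would be counted more than once.
count_entries() {
    # tr strips the padding that some 'wc' implementations add.
    find "$1" | wc -l | tr -d ' '
}

# Run the same pipeline and report the elapsed wall-clock time.
# Resolution is whole seconds only, which is coarse but enough to
# tell a 2-4 second run from a sub-second one.
time_find_wc() {
    start=$(date +%s)
    n=$(count_entries "$1")
    end=$(date +%s)
    echo "$n entries in $((end - start))s"
}
```

Calling 'time_find_wc' on the project directory used for the benchmark
gives a number to set against the timings reported from inside Emacs.
Since the first run may be dominated by a cold filesystem cache, it is
probably worth repeating the call a few times before comparing.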