From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#71094: [PATCH] Prefer to run find and grep in parallel in rgrep
Date: Wed, 22 May 2024 18:26:45 +0300
Message-ID: <86wmnl6f62.fsf@gnu.org>
References: <86ttiq6or8.fsf@gnu.org> <8aedd0ed-58fe-4ac7-98d6-950be2d4700b@gutov.dev> <868r026jlq.fsf@gnu.org> <861q5t7vrp.fsf@gnu.org> <10f62497-dfb1-4c46-b18a-6d1100de4b6a@gutov.dev>
In-Reply-To: <10f62497-dfb1-4c46-b18a-6d1100de4b6a@gutov.dev> (message from Dmitry Gutov on Wed, 22 May 2024 17:50:42 +0300)
Cc: sbaugh@janestreet.com, 71094@debbugs.gnu.org, rgm@gnu.org
To: Dmitry Gutov

> Date: Wed, 22 May 2024 17:50:42 +0300
> Cc: sbaugh@janestreet.com, 71094@debbugs.gnu.org, rgm@gnu.org
> From: Dmitry Gutov
>
> >> Whereas in the Emacs repository "find ... -print0 | wc" reports 202928
> >> characters. Meaning, it uses just 1.5 'grep' invocations. To see better
> >> parallelism there we'll need to either lower the limit or test it in a
> >> project at least twice as big.
> >
> > ...until xargs collects all those characters, it will not invoke grep,
> > right?  So, for directories whose file names total less than those
> > 200K, xargs will still wait until find ends its job, right?
>
> That's right. And it's why we're not seeing much of a difference in
> projects of Emacs's size or smaller. No apparent regression either, though.

But we added xargs to the soup.  On GNU/Linux, where GNU Findutils are
developed, it probably isn't a problem.  On other systems, not
necessarily...

> >> So here is another example: a Linux kernel checkout (76K files). Also
> >> about 30% improvement: 1.40s vs 2.00s.
> >
> > This is all highly system-dependent.
>
> Naturally. So it'd be great to see some additional data points from
> users on other systems.
>
> Especially those where the default limit is lower than it is on mine.

I'd be happy if someone could time these methods on MS-Windows and on
some *BSD system, at least.  Bonus points for macOS.
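
For anyone collecting such data points, the "1.5 invocations" figure quoted
above is just the total length of the file names divided by the per-command
limit.  Here is a minimal Python sketch of that arithmetic.  The 128 KiB
limit is an assumption, picked because it is consistent with the quoted
numbers; it is not stated in this thread, and the real value depends on the
system and on how xargs is invoked.  The function names are made up for
illustration.

```python
import math

# Assumption: a per-invocation command-line budget of 128 KiB.  This is a
# hypothetical value, consistent with the "1.5 invocations" quoted above;
# the real limit is system-dependent.
ASSUMED_LIMIT_BYTES = 128 * 1024


def estimate_grep_invocations(total_name_bytes, limit_bytes=ASSUMED_LIMIT_BYTES):
    """Return (ratio, batches) for file names totalling total_name_bytes.

    ratio is how many full command lines the names would fill; batches is
    that figure rounded up to whole grep processes.  This is only an
    estimate: it ignores the fixed part of the command line (grep and its
    options) and the fact that a file name is never split across batches.
    """
    if total_name_bytes <= 0 or limit_bytes <= 0:
        raise ValueError("both sizes must be positive")
    ratio = total_name_bytes / limit_bytes
    return ratio, math.ceil(ratio)


def find_finishes_before_first_grep(total_name_bytes,
                                    limit_bytes=ASSUMED_LIMIT_BYTES):
    """True if the names never fill one batch, so grep starts only after
    find has ended and the two do not overlap."""
    return total_name_bytes < limit_bytes


if __name__ == "__main__":
    # 202928 is the "find ... -print0 | wc" character count quoted above.
    ratio, batches = estimate_grep_invocations(202928)
    print(f"{ratio:.2f} command lines' worth of names, {batches} grep runs")
```

On a system with a lower limit, the same tree would be split into more
batches, which is why timings from such systems would be interesting.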