From: Dmitry Gutov
Newsgroups: gmane.emacs.bugs
Subject: bug#71094: [PATCH] Prefer to run find and grep in parallel in rgrep
Date: Wed, 22 May 2024 17:50:42 +0300
Message-ID: <10f62497-dfb1-4c46-b18a-6d1100de4b6a@gutov.dev>
References: <86ttiq6or8.fsf@gnu.org> <8aedd0ed-58fe-4ac7-98d6-950be2d4700b@gutov.dev> <868r026jlq.fsf@gnu.org> <861q5t7vrp.fsf@gnu.org>
In-Reply-To: <861q5t7vrp.fsf@gnu.org>
To: Eli Zaretskii
Cc: sbaugh@janestreet.com, 71094@debbugs.gnu.org, rgm@gnu.org

On 22/05/2024 17:42, Eli Zaretskii wrote:

>>> That's true, but what is your mental model of how the pipe with xargs
>>> works in practice?  How many invocations of grep will xargs do, and
>>> when will the first invocation happen?
>>
>> In my mental model xargs acts like an asynchronous queue with batch
>> processing. The first invocation will happen after the output reaches
>> the maximum line number or maximum number of arguments configured. They
>> are system-dependent by default.
>
> And can be rather small.  But if it is large, then...
>
>> For example, on my system 'xargs --show-limits' says
>>
>>     Size of command buffer we are actually using: 131072
>>
>> Whereas in the Emacs repository "find ... -print0 | wc" reports 202928
>> characters. Meaning, it uses just 1.5 'grep' invocations. To see better
>> parallelism there we'll need to either lower the limit or test it in a
>> project at least twice as big.
>
> ...until xargs collects all those characters, it will not invoke grep,
> right?  So, for directories whose file names total less than those
> 200K, xargs will still wait until find ends its job, right?

That's right. And it's why we're not seeing much of a difference in
projects of Emacs's size or smaller. No apparent regression either, though.

>> So here is another example: a Linux kernel checkout (76K files). Also
>> about 30% improvement: 1.40s vs 2.00s.
>
> This is all highly system-dependent.

Naturally. So it'd be great to see some additional data points from
users on other systems, especially those where the default limit is
lower than it is on mine.
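
To make the batching model above a bit more concrete, here is a rough Python sketch of it. This is not how xargs is actually implemented: the function name `batch_sizes`, the `overhead` parameter and the "length plus one terminator byte" cost per file name are all assumptions made for illustration. Real xargs also accounts for the environment and the command itself, so real batch counts will differ somewhat.

```python
def batch_sizes(names, limit, overhead=0):
    """Return the number of names in each simulated command invocation.

    Hypothetical simplification of the batching model: every name costs
    its encoded length plus one terminator byte, and the pending batch is
    flushed as soon as adding the next name would exceed `limit`.
    `overhead` stands in for the fixed cost of the command itself.
    """
    sizes = []
    used = overhead   # bytes consumed by the pending batch
    pending = 0       # number of names in the pending batch
    for name in names:
        cost = len(name.encode()) + 1
        if pending and used + cost > limit:
            # The buffer is full: this is the point where the consumer
            # (grep, in the discussion above) would be launched.
            sizes.append(pending)
            used = overhead
            pending = 0
        used += cost
        pending += 1
    if pending:
        # The final, partially filled batch is only launched once the
        # producer (find) has finished.
        sizes.append(pending)
    return sizes


if __name__ == "__main__":
    # Synthetic input: 6341 names of 31 characters each, i.e. roughly the
    # 202928 bytes of file names mentioned above (made-up names, not the
    # real Emacs tree).
    names = ["a" * 31] * 6341
    print(batch_sizes(names, limit=131072))  # large buffer, few batches
    print(len(batch_sizes(names, limit=16384)))  # smaller buffer, more batches
```

With the 131072-byte limit the synthetic input splits into just two batches, and the second one cannot start before the producer has finished. Lowering the limit gives many more batches, which is where any overlap between find and grep would come from.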