From: Dmitry Gutov
Newsgroups: gmane.emacs.bugs
Subject: bug#66020: (bug#64735 spin-off): regarding the default for read-process-output-max
Date: Thu, 21 Sep 2023 17:37:23 +0300
To: Eli Zaretskii
Cc: 66020@debbugs.gnu.org, stefankangas@gmail.com, monnier@iro.umontreal.ca
In-Reply-To: <83led09dlk.fsf@gnu.org>

On 21/09/2023 10:42, Eli Zaretskii wrote:
>> Date: Thu, 21 Sep 2023 03:57:43 +0300
>> Cc: 66020@debbugs.gnu.org
>> From: Dmitry Gutov
>>
>> That leaves the question of what new value to use. 409600 is optimal for
>> a large-output process but seems too much as default anyway (even if I
>> have very little experimental proof for that hesitance: any help with
>> that would be very welcome).
>
> How does the throughput depend on this value?  If the dependence curve
> plateaus at some lower value, we could use that lower value as a
> "good-enough" default.

Depends on what we're prepared to call a plateau. Strictly speaking,
not really. But we have a "sweet spot": for the process in my original
benchmark ('find' with lots of output) it seems to be around 1009600.
Here's a table (numbers are different from before because they're
results of (benchmark 5 ...) divided by 5, meaning GC is amortized):

| read-process-output-max | seconds per run |
|-------------------------+-----------------|
|                    4096 |            0.78 |
|                   16368 |            0.69 |
|                   40960 |            0.65 |
|                  409600 |            0.59 |
|                 1009600 |            0.56 |
|                 2009600 |            0.64 |
|                 4009600 |            0.65 |

The process's output length is 27244567 in this case. Still above the
largest of the buffers in this example.

Notably, only allocating the buffer once at the start of the process
(experiment mentioned in the email to Stefan M.)
doesn't change the dynamics: buffer lengths above ~1009600 make the
performance worse. So there must be some negative factor associated
with larger buffers. There is an obvious positive one: the longer the
buffer, the longer we don't switch between processes, so that overhead
is lower.

We could look into improving that part specifically: for example,
reading from the process multiple times into 'chars' right away while
there is still pending output present (either looping inside
read_process_output, or calling it in a loop in
wait_reading_process_output, at least until the process's buffered
output is exhausted). That could reduce reactivity, however (can we
find out how much is already buffered in advance, and only loop until
we exhaust that length?).

>> I did some more experimenting, though. At a superficial glance,
>> allocating the 'chars' buffer at the beginning of read_process_output is
>> problematic because we could instead reuse a buffer for the whole
>> duration of the process. I tried that (adding a new field to
>> Lisp_Process and setting it in make_process), although I had to use a
>> value produced by make_uninit_string: apparently simply storing a char*
>> field inside a managed structure creates problems for the GC and early
>> segfaults. Anyway, the result was slightly _slower_ than the status quo.
>>
>> So I read what 'alloca' does, and it looks hard to beat. But it's only
>> used (as you of course know) when the value is <= MAX_ALLOCA, which is
>> currently 16384. Perhaps an optimal default value shouldn't exceed this,
>> even if it's hard to create a benchmark that shows a difference. With
>> read-process-output-max set to 16384, my original benchmark gets about
>> halfway to the optimal number.
>
> Which I think means we should stop worrying about the overhead of
> malloc for this purpose, as it is fast enough, at least on GNU/Linux.

Perhaps.
If we're not too concerned about memory fragmentation (that's
the only explanation I have for the table "session gets older" -- last
one -- in a previous email with test-ls-output timings).

>> And I think we should make the process "remember" the value at its
>> creation either way (something touched on in bug#38561): in bug#55737 we
>> added an fcntl call to make the larger values take effect. But this call
>> is in create_process: so any subsequent increase to a large value of
>> this var won't have effect.
>
> Why would the variable change after create_process?  I'm afraid I
> don't understand what issue you are trying to deal with here.

Well, what could we lose by saving the value of read-process-output-max
in create_process? Currently I suppose one could vary its value while a
process is still running, to implement some adaptive behavior or
whatnot. But that's already semi-broken because fcntl is called in
create_process.
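
To make the "remember the value at creation" idea concrete, here is a
rough standalone sketch. It is not the Emacs code: read_max, struct
proc_sketch, proc_create and proc_read_size are hypothetical names that
stand in for read-process-output-max, Lisp_Process, create_process and
the size read_process_output would use. It assumes the pipe is resized
once with F_SETPIPE_SZ where that fcntl command exists (Linux), and it
falls back to leaving the pipe alone elsewhere.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* exposes F_SETPIPE_SZ / F_GETPIPE_SZ on Linux */
#endif
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-in for the Lisp variable read-process-output-max. */
int read_max = 4096;

/* Hypothetical per-process record.  saved_read_max is the proposed
   extra field: the variable's value at creation time.  */
struct proc_sketch
{
  int infd;            /* read end of the pipe */
  int outfd;           /* write end of the pipe */
  int saved_read_max;  /* snapshot of read_max taken in proc_create */
  int pipe_capacity;   /* capacity reported by the kernel, or -1 if unknown */
};

/* Create the pipe, snapshot read_max, and try once to grow the pipe to
   that size.  Returns 0 on success, -1 if the pipe cannot be created.  */
int
proc_create (struct proc_sketch *p)
{
  int fds[2];

  assert (p != 0);
  if (pipe (fds) != 0)
    return -1;

  p->infd = fds[0];
  p->outfd = fds[1];
  p->saved_read_max = read_max;
  p->pipe_capacity = -1;

#if defined F_SETPIPE_SZ && defined F_GETPIPE_SZ
  int current = fcntl (p->infd, F_GETPIPE_SZ);
  /* Only ever grow the pipe; a failure here is ignored on purpose.  */
  if (current >= 0 && p->saved_read_max > current)
    fcntl (p->infd, F_SETPIPE_SZ, p->saved_read_max);
  p->pipe_capacity = fcntl (p->infd, F_GETPIPE_SZ);
#endif

  return 0;
}

/* The read size used for this process: the snapshot, not the current
   value of read_max, so it stays consistent with the pipe size.  */
int
proc_read_size (const struct proc_sketch *p)
{
  return p->saved_read_max;
}
```

With this shape, changing read_max after proc_create has no effect on
an existing process: both the pipe capacity and the read size were
fixed together at creation, which is the consistency the saved value is
meant to give.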