From: Karthik Chikmagalur
Newsgroups: gmane.emacs.devel
Subject: Re: Understanding filter function calls
Date: Fri, 28 Jul 2023 14:42:47 -0700
Message-ID: <87cz0b1zmw.fsf@gmail.com>
References: <87y1j5vp3y.fsf@gmail.com> <96ad3a89-a0bd-6f8a-6251-d3f2f201e4f7@vodafonemail.de> <87pm4d12rc.fsf@gmail.com> <835y64lhd8.fsf@gnu.org>
Cc: jschmidt4gnu@vodafonemail.de, emacs-devel@gnu.org
To: Eli Zaretskii
In-Reply-To: <835y64lhd8.fsf@gnu.org>

Thank you for the suggestions -- this problem is resolved!  (See
below.)  But I still have a few questions about Emacs' async
subprocess API, included at the end.

> I think it isn't that dvisvgm is waiting for Emacs, it's that the
> whole process of reading output by Emacs and processing that output
> takes longer.

This is what I assumed at first too, but I made the filter function a
noop and still observed this behavior.

>> Using a larger stream buffer (if possible) should fix this issue.
>
> You already tried that, AFAIU, and it didn't help.

OK, I didn't realize this is what `read-process-output-max' sets.  It
makes sense now.

>> 2. Enlarge the buffer or "pipe" connecting dvisvgm to Emacs. This
>> stream buffer appears to be set to 4KB. Since dvisvgm produces far more
>> output (to stdout) than this between two successive instances of Emacs
>> accepting process output, widening the pipe should relieve this
>> pressure. I tried tweaking `read-process-output-max' but this doesn't
>> help.
>
> Which probably means that each time we get to check for subprocess
> output, there's less than 4KB of stuff in the pipe ready to be read?
> Did you look at the amount of bytes we read each time? How many bytes
> do we read, and does this number change if you change the value of
> read-process-output-max?

In the following description:

- `process-adaptive-read-buffering' is set to t
- `read-process-output-max' is set to 65,536

1. I logged the length of the string passed to the filter function on
   each invocation.  On the first 3-4 calls, the length varies from 1
   to 2000.  In the remaining ~30 calls, the length is almost always
   4095.

2. There are rare exceptions to this, when the length jumps up to
   20,000.  This happens for about one filter call out of ~35.

3. None of this behavior changes when I change
   `read-process-output-max', although I didn't set it below 4096.

> Also, did you try setting process-adaptive-read-buffering to nil?

Setting it to nil essentially fixes the problem!  The filter function
is now called 80+ times instead of ~35 times, and the string it
receives each time has variable length, generally under 500
characters, but overall the process is much, much faster.
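For reference, the chunk lengths above were collected with a filter
along these lines.  This is a minimal sketch, not the actual Org
preview code; the process name, the variable and function names, and
the bare dvisvgm invocation are placeholders:

    ;; Sketch: record the size of every chunk handed to a near-noop
    ;; process filter.  Names and the command line are placeholders.
    (defvar my/filter-chunk-sizes nil
      "Lengths of the strings passed to the process filter, newest first.")

    (defun my/logging-filter (_proc string)
      "Do nothing with STRING except record its length."
      (push (length string) my/filter-chunk-sizes))

    (make-process :name "dvisvgm-test"
                  :command '("dvisvgm" "input.dvi")
                  :filter #'my/logging-filter)

Inspecting `my/filter-chunk-sizes' after the run gives the per-call
lengths reported in the list above.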
Total preview time:

| process-adaptive-read-buffering | t         | nil       |
|---------------------------------+-----------+-----------|
| TeXLive 2022                    | 2.65 secs | 1.88 secs |
| TeXLive 2023                    | 4.03 secs | 1.80 secs |

We get a huge speed-up, up to ~120% in the TeXLive 2023 case.

>> 3. Get dvisvgm to generate less verbose output. Unfortunately this is
>> not configurable at the level of granularity that we need. We can't
>> turn it off completely either since we rely on certain strings in the
>> process stdout to update the LaTeX preview state in Org buffers.
>>
>> Any ideas on how to avoid this throttling would be appreciated.
>
> So once again a problem generated by an external program, which cannot
> be configured by that program, is blamed on Emacs? Is this fair?

Eli, I apologize for giving the impression that I'm blaming this on
Emacs.  Your phrasing ("once again") suggests that there's more to
this than I'm aware of.  My only goal here is to provide the most
responsive user experience possible when writing LaTeX math in Org
mode, under the constraint that this needs to be done by gluing
together two (actually three) processes.  I care about where the
bottleneck is -- or was -- only so that I may address it.

> My take of this is not that Emacs is "throttling" dvisvgm, but that
> dvisvgm is "flooding" Emacs with useless data, and cannot be told to
> shut up.

This is a fair take, but as it turns out from the new benchmarks, the
total preview time is the same for the two dvisvgm versions once
`process-adaptive-read-buffering' is set to nil.

> AFAICT, the data it outputs is completely useless in
> production use cases, and (with the possible exception of the last
> line, which shows the output file name) is basically debug-level info.

The filter function needs to read the output file name to update the
LaTeX previews in the buffer.  The rest is superfluous to us -- but
it's possible that the sizing information is useful to other
applications.

> Did you try to take this problem up with the dvisvgm developers?

I was planning to, but it looks like TeXLive 2023 is actually slightly
faster now, so the extra stdout makes no difference.

> And in any case, we are talking about 0.4 sec delay wrt the older
> version of TeXLive, right? Is this really large enough to worry
> about?

The delay was about 40%, which could be 1+ second with lots of LaTeX
fragments, as in my original benchmark.  Moreover, even 200ms makes a
noticeable difference when

1. opening Org files that have the latexpreview startup option set,

2. live-previewing equations with a quick feedback loop (see
   https://tinyurl.com/ms2ksthc), and

3. tweaking equation numbering etc., which can cause all subsequent
   LaTeX fragments in the buffer to have their previews regenerated.

>> 1. Reduce the duration between successive calls of the filter function.
>> Is this configurable in Emacs? I don't see anything relevant in the
>> manual sections on accepting output from processes or filter functions.
>
> This is not configurable for the simple reason that Emacs checks for
> subprocess output every time its main loop gets to that, so basically
> Emacs does that as fast as possible, assuming it is idle, i.e. no
> other command or timer runs.

Because of `process-adaptive-read-buffering', I'm not sure this is all
there is to it: otherwise I wouldn't expect the total preview time to
change at all.  Am I correct?  The documentation for this variable
states: "If non-nil, improve receive buffering by delaying after short
reads."
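For concreteness, the two settings used in the "nil" column of the
benchmark amount to something like the following sketch (plain setq
here; whether these should be set globally or only scoped around
preview generation is a separate question):

    ;; Sketch of the two knobs discussed above; the values are the
    ;; ones used in the benchmark, not general recommendations.
    (setq process-adaptive-read-buffering nil) ; don't delay delivery after short reads
    (setq read-process-output-max (* 64 1024)) ; read up to 64KB per chunk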
----

This brings me to the question of what is going on.  My initial
assessment -- that dvisvgm is waiting on Emacs to clear the output
stream buffer -- is wrong, since changing `read-process-output-max'
changes neither how long the run takes nor how much data the filter
function is called with each time.

1. So why is the run faster with `process-adaptive-read-buffering' set
   to nil?  I understand that this reduces read latency, but not why
   the process is so much faster overall.

2. What is the performance implication of setting
   `process-adaptive-read-buffering' to nil?  The documentation for
   this variable mentions:

> On some systems, when Emacs reads the output from a subprocess, the
> output data is read in very small blocks, potentially resulting in very
> poor performance.

But I was able to use Emacs normally (typing, calling M-x etc.) while
the previews were being updated, which is great.  So I'm not sure what
the performance implications are here.

Karthik