From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail
From: Eli Zaretskii <eliz@gnu.org>
Newsgroups: gmane.emacs.devel
Subject: Re: Understanding filter function calls
Date: Sat, 29 Jul 2023 09:02:59 +0300
Message-ID: <83sf97i7ak.fsf@gnu.org>
References: <87y1j5vp3y.fsf@gmail.com>
 <96ad3a89-a0bd-6f8a-6251-d3f2f201e4f7@vodafonemail.de>
 <87pm4d12rc.fsf@gmail.com> <835y64lhd8.fsf@gnu.org> <87cz0b1zmw.fsf@gmail.com>
Cc: jschmidt4gnu@vodafonemail.de, emacs-devel@gnu.org
To: Karthik Chikmagalur <karthikchikmagalur@gmail.com>
In-Reply-To: <87cz0b1zmw.fsf@gmail.com> (message from Karthik Chikmagalur on
 Fri, 28 Jul 2023 14:42:47 -0700)
Xref: news.gmane.io gmane.emacs.devel:308197
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/308197>

> From: Karthik Chikmagalur <karthikchikmagalur@gmail.com>
> Cc: jschmidt4gnu@vodafonemail.de, emacs-devel@gnu.org
> Date: Fri, 28 Jul 2023 14:42:47 -0700
> 
> > I think it isn't that dvisvgm is waiting for Emacs, it's that the
> > whole process of reading output by Emacs and processing that output
> > takes longer.
> 
> This is what I assumed at first too, but I made the filter function
> a noop and still observed this behavior.

Setting the filter function to a no-op doesn't prevent Emacs from
processing the subprocess output.  Emacs still has to read all of the
output, and with the newer TeXLive that output is larger, so it needs
more time to read it.

> >> 2.  Enlarge the buffer or "pipe" connecting dvisvgm to Emacs.  This
> >> stream buffer appears to be set to 4KB.  Since dvisvgm produces far more
> >> output (to stdout) than this between two successive instances of Emacs
> >> accepting process output, widening the pipe should relieve this
> >> pressure.  I tried tweaking `read-process-output-max' but this doesn't
> >> help.
> >
> > Which probably means that each time we get to check for subprocess
> > output, there's less than 4KB of stuff in the pipe ready to be read?
> > Did you look at the amount of bytes we read each time?  How many bytes
> > do we read, and does this number change if you change the value of
> > read-process-output-max?
> 
> In the following description,
> - `process-adaptive-read-buffering' is set to t
> - `read-process-output-max' is set to 65,536
> 
> 1.  I logged the length of the string that is passed to the filter
> function on each invocation.  On the first 3-4 calls, the length is
> variable, from 1 to 2000.  In the remaining 30 calls, the length is
> almost always 4095. 
> 
> 2. There are rare exceptions to this, when the length jumps up to
> 20,000.  This is usually for a single filter call out of ~35.
> 
> 3. There is no change to the above behavior when I change
> `read-process-output-max', although I didn't set it below 4096.

Is this with process-connection-type set to nil or non-nil?  If you
didn't try setting it to nil, please do, and see if the behavior
changes.
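
To make that comparison concrete, here is a minimal, untested sketch
of such a test.  The names my-chunk-sizes, my-chunk-size-filter and
my-run-with-pipe are made up for illustration, and the command you
pass is whatever you already use to run dvisvgm:

```elisp
;; Sketch only: every `my-' name below is hypothetical.
(defvar my-chunk-sizes nil
  "Lengths of the strings passed to the filter, most recent first.")

(defun my-chunk-size-filter (_process string)
  "Record the length of STRING; the output itself is discarded."
  (push (length string) my-chunk-sizes))

(defun my-run-with-pipe (command)
  "Run COMMAND (a list of strings) over a pipe, logging chunk sizes."
  (setq my-chunk-sizes nil)
  ;; nil here means "use a pipe, not a PTY" for the new process.
  (let ((process-connection-type nil))
    (make-process :name "chunk-test"
                  :command command
                  :filter #'my-chunk-size-filter)))
```

After the process exits, (reverse my-chunk-sizes) gives the sizes in
the order they arrived.  Running the same command once with the
let-binding and once without it should show whether the 4095-byte
pattern is specific to PTYs.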

> > Also, did you try setting process-adaptive-read-buffering to nil?
> 
> Setting it to nil essentially fixes the problem!  The filter function is
> now called 80+ times instead of 35 times, the string it's called with
> each time is of variable length, generally under 500 characters, but
> overall the process is much, much faster.
> 
> Total preview time:
> | process-adaptive-read-buffering | t         | nil       |
> |---------------------------------+-----------+-----------|
> | TeXLive 2022                    | 2.65 secs | 1.88 secs |
> | TeXLive 2023                    | 4.03 secs | 1.80 secs |
> 
> We get a huge speed-up, up to 120%.

Then I guess you need to bind process-adaptive-read-buffering to nil
when you perform this processing.  (But please also see what happens
if process-connection-type is set to nil, thus using a pipe instead of
a PTY to read subprocess output: it could change the picture.)
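
A rough, untested sketch of what I mean by binding the variable.
The function name my-dvisvgm-start and the bare dvisvgm command line
are assumptions for illustration; Org passes more arguments than
this.  As far as I know, the variable is consulted when the process
is created, so the binding only needs to surround the call that
starts it:

```elisp
;; Sketch only: `my-dvisvgm-start' is a hypothetical name, and the
;; command line is simplified.
(defun my-dvisvgm-start (dvi-file filter)
  "Start dvisvgm on DVI-FILE, passing its output to FILTER."
  ;; Read output as it arrives instead of delaying after short reads.
  (let ((process-adaptive-read-buffering nil))
    (make-process :name "dvisvgm"
                  :command (list "dvisvgm" dvi-file)
                  :filter filter)))
```

This leaves the global default alone, so other subprocesses are not
affected.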

> > So once again a problem generated by an external program, which cannot
> > be configured by that program, is blamed on Emacs?  Is this fair?
> 
> Eli, I apologize for giving the impression that I'm blaming this on
> Emacs.  Your phrasing ("once again") suggests that there's more to this
> than I'm aware of.

Emacs is very configurable, so people tend to expect us to solve
problems we didn't cause.

> My only goal here is to provide the most responsive user experience
> possible when writing LaTeX math in Org mode, under the constraint
> that this needs to be done by gluing together two (actually three)
> processes.  I care about where the bottleneck is -- or was -- only
> so that I may address it.

I understand, but clearly the change responsible for this was in
dvisvgm, not in Emacs, so expecting Emacs to rectify it is not really
fair.

> > My take of this is not that Emacs is "throttling" dvisvgm, but that
> > dvisvgm is "flooding" Emacs with useless data, and cannot be told to
> > shut up.
> 
> This is a fair take, but as it turns out from the new benchmarks, the
> total preview time is the same for the two dvisvgm versions once
> `process-adaptive-read-buffering' is set to nil.

So you were lucky to find a solution by tweaking Emacs this time.
That's good for you, but not every problem with some external program
can be fixed like that.

> > AFAICT, the data it outputs is completely useless in
> > production use cases, and (with the possible exception of the last
> > line, which shows the output file name) is basically debug-level info.
> 
> The filter function needs to read the output file name for the purposes
> of updating LaTeX previews in the buffer.  The rest is superfluous to us
> -- but it's possible that the sizing information is useful to other
> applications.

Then maybe ask the dvisvgm developers to provide a verbosity level
that shows only the file name.  Emacs in general and Org in
particular are important applications, and so it would be reasonable
for the dvisvgm developers to cater to our needs, not only to the
needs of other programs.
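
Until such an option exists, the filter can at least discard
everything except the line it needs.  A minimal, untested sketch,
with some assumptions: the "output written to" prefix is my
recollection of what dvisvgm prints, so please check it against the
actual output, and the my-dvisvgm-* names are made up.  Since the
filter can be called with arbitrary chunks, a line may be split
between two calls, so the incomplete tail is kept for the next call:

```elisp
;; Sketch only: the names and the matched prefix are assumptions.
;; One global pending string is a simplification; with several
;; dvisvgm processes at once, keep this state per process instead.
(defvar my-dvisvgm-pending ""
  "Incomplete trailing line left over from the previous chunk.")

(defvar my-dvisvgm-files nil
  "Output file names seen so far, most recent first.")

(defun my-dvisvgm-filter (_process string)
  "Collect output file names from STRING, ignoring all other lines."
  (let ((lines (split-string (concat my-dvisvgm-pending string) "\n")))
    ;; The last element is an unfinished line, or "" if the chunk
    ;; ended with a newline; save it for the next call.
    (setq my-dvisvgm-pending (car (last lines)))
    (dolist (line (butlast lines))
      (when (string-match "output written to \\(.+\\)$" line)
        (push (match-string 1 line) my-dvisvgm-files)))))
```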

> > Did you try to take this problem up with the dvisvgm developers?
> 
> I was planning to, but it looks like TeXLive 2023 is actually slightly
> faster now, so the extra stdout makes no difference.

Well, it definitely makes a difference to Emacs!  Other Emacs
features which may need to use dvisvgm might not be able to use the
solution you found, for whatever reason, so being able to make
dvisvgm be as silent as Emacs needs is a Good Thing.

> > And in any case, we are talking about 0.4 sec delay wrt the older
> > version of TeXLive, right?  Is this really large enough to worry
> > about?
> 
> The delay was about 40%, which could be 1+ second with lots of LaTeX
> fragments, as in my original benchmark.  Moreover, even 200ms makes a
> noticeable difference when
> 
> 1. opening Org files that have the latexpreview startup option set,
> 
> 2. live-previewing equations with a quick feedback loop: See
> https://tinyurl.com/ms2ksthc
> 
> 3. and when tweaking equation numbering etc, which can cause all
> subsequent LaTeX fragments in the buffer to have their previews
> regenerated.

I very much doubt that 0.2 sec makes a difference when the overall
time is about 4 sec.

> >> 1.  Reduce the duration between successive calls of the filter function.
> >> Is this configurable in Emacs?  I don't see anything relevant in the
> >> manual sections on accepting output from processes or filter functions.
> >
> > This is not configurable for the simple reason that Emacs checks for
> > subprocess output every time its main loop gets to that, so basically
> > Emacs does that as fast as possible, assuming it is idle, i.e. no
> > other command or timer runs.
> 
> Because of `process-adaptive-read-buffering', I'm not sure this is all
> there is to it, because I wouldn't expect a difference in how long the
> preview generation run takes otherwise.  Am I correct?  The
> documentation for this variable states:
> 
> "If non-nil, improve receive buffering by delaying after short reads."

I didn't assume dvisvgm will start by sending very small chunks of
text, so I didn't think the delays introduced by
process-adaptive-read-buffering would matter in your case.  In any
case, Emacs _checks_ for subprocess output as frequently as possible,
it just sometimes refrains from reading it (and thus refrains from
calling the filter), if the adaptive-reading feature requires that.

> 1. So why is the run faster with `process-adaptive-read-buffering' set
> to nil?  I understand that this reduces read latency, but not why the
> process is so much faster overall.

Because with the variable set to nil we don't delay reading
subprocess output, but instead process it as it comes in, regardless
of the size.

> 2. What is the performance implication of setting
> `process-adaptive-read-buffering' to nil?

Our long-time experience is that the default non-nil value is better
overall.  So my suggestion is to change its value if you have a use
case where that is beneficial, but otherwise leave the default alone.

> The documentation for this variable mentions:
> 
> > On some systems, when Emacs reads the output from a subprocess, the
> > output data is read in very small blocks, potentially resulting in very
> > poor performance.
> 
> But I was able to use Emacs normally (typing, calling M-x etc) when the
> previews were being updated, which is great.  So I'm not sure what the
> performance implications are here.

You don't want to rewind the long history of this and re-live all the
issues we experienced and investigated before we settled on the
current default.  If you can solve this particular problem by binding
the variable to nil around the code which invokes dvisvgm, that is all
I suggest doing.