unofficial mirror of emacs-devel@gnu.org 
From: Karthik Chikmagalur <karthikchikmagalur@gmail.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: jschmidt4gnu@vodafonemail.de, emacs-devel@gnu.org
Subject: Re: Understanding filter function calls
Date: Fri, 28 Jul 2023 14:42:47 -0700
Message-ID: <87cz0b1zmw.fsf@gmail.com>
In-Reply-To: <835y64lhd8.fsf@gnu.org>

Thank you for the suggestions -- this problem is resolved! (See below)

But I still have a few questions about Emacs' async subprocess API.
(Included at the end)

> I think it isn't that dvisvgm is waiting for Emacs, it's that the
> whole process of reading output by Emacs and processing that output
> takes longer.

This is what I assumed at first too, but I made the filter function
a no-op and still observed this behavior.

>> Using a larger stream buffer (if possible) should fix this issue.
> 
> You already tried that, AFAIU, and it didn't help.

OK, I didn't realize this is what `read-process-output-max' sets.  It
makes sense now.

>> 2.  Enlarge the buffer or "pipe" connecting dvisvgm to Emacs.  This
>> stream buffer appears to be set to 4KB.  Since dvisvgm produces far more
>> output (to stdout) than this between two successive instances of Emacs
>> accepting process output, widening the pipe should relieve this
>> pressure.  I tried tweaking `read-process-output-max' but this doesn't
>> help.
>
> Which probably means that each time we get to check for subprocess
> output, there's less than 4KB of stuff in the pipe ready to be read?
> Did you look at the amount of bytes we read each time?  How many bytes
> do we read, and does this number change if you change the value of
> read-process-output-max?

In the following description,
- `process-adaptive-read-buffering' is set to t
- `read-process-output-max' is set to 65,536

1.  I logged the length of the string that is passed to the filter
function on each invocation.  On the first 3-4 calls, the length is
variable, from 1 to 2000.  In the remaining 30 calls, the length is
almost always 4095. 

2. There are rare exceptions to this, when the length jumps up to
20,000.  This is usually for a single filter call out of ~35.

3. There is no change to the above behavior when I change
`read-process-output-max', although I didn't set it below 4096.
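
For reference, the logging in (1) can be done with something like the
sketch below.  This is only an illustration, not the actual Org preview
code: the `my/' names are made up, and COMMAND stands in for whatever
command line is being measured.  `read-process-output-max' is assumed
to have been set globally beforehand.

```elisp
;; Hypothetical helper names; a sketch for measuring filter calls only.
(defvar my/chunk-lengths nil
  "Lengths of the strings passed to the filter, most recent first.")

(defun my/logging-filter (_process string)
  "Record the length of STRING and otherwise discard the output."
  (push (length string) my/chunk-lengths))

(defun my/run-logged (command)
  "Start COMMAND, a list of strings, with `my/logging-filter' attached.
When the process changes state, report how often the filter ran."
  (setq my/chunk-lengths nil)
  (make-process
   :name "my-logged"
   :command command
   :connection-type 'pipe
   :filter #'my/logging-filter
   :sentinel (lambda (_proc _event)
               (message "%d filter calls, lengths: %S"
                        (length my/chunk-lengths)
                        (reverse my/chunk-lengths)))))
```

Calling (my/run-logged '("some-program" "some-arg")) and reading the
message in the echo area gives the per-call lengths in the order the
filter received them.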

> Also, did you try setting process-adaptive-read-buffering to nil?

Setting it to nil essentially fixes the problem!  The filter function is
now called 80+ times instead of 35 times, and the string it's called
with each time is of variable length, generally under 500 characters.
Despite the extra calls, the process is much, much faster overall.

Total preview time:
| process-adaptive-read-buffering | t         | nil       |
|---------------------------------+-----------+-----------|
| TeXLive 2022                    | 2.65 secs | 1.88 secs |
| TeXLive 2023                    | 4.03 secs | 1.80 secs |

We get a huge speed-up: roughly 1.4x with TeXLive 2022 (2.65 -> 1.88
secs) and roughly 2.2x with TeXLive 2023 (4.03 -> 1.80 secs).
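
One way to apply this to a single subprocess, rather than globally, is
to bind the variable around the call that creates the process.  The
sketch below assumes that the variable is consulted when the process is
created (that is my understanding, but I have not verified it for every
Emacs version); the function name is hypothetical.

```elisp
;; Hypothetical wrapper; assumes `process-adaptive-read-buffering' is
;; read at process-creation time, so a `let' around `make-process'
;; affects only the process started here.
(defun my/start-without-adaptive-buffering (name command filter)
  "Start COMMAND as process NAME with FILTER and adaptive buffering off.
COMMAND is a list of strings, FILTER a process filter function."
  (let ((process-adaptive-read-buffering nil))
    (make-process :name name
                  :command command
                  :connection-type 'pipe
                  :filter filter)))
```

If that assumption does not hold, setting the variable globally with
`setq' before starting the process has the effect measured above.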

>> 3.  Get dvisvgm to generate less verbose output.  Unfortunately this is
>> not configurable at the level of granularity that we need.  We can't
>> turn it off completely either since we rely on certain strings in the
>> process stdout to update the LaTeX preview state in Org buffers.
>> 
>> Any ideas on how to avoid this throttling would be appreciated.
>
> So once again a problem generated by an external program, which cannot
> be configured by that program, is blamed on Emacs?  Is this fair?

Eli, I apologize for giving the impression that I'm blaming this on
Emacs.  Your phrasing ("once again") suggests that there's more to this
than I'm aware of.  My only goal here is to provide the most responsive
user experience possible when writing LaTeX math in Org mode, under the
constraint that this needs to be done by gluing together two (actually
three) processes.  I care about where the bottleneck is -- or was --
only so that I may address it.

> My take of this is not that Emacs is "throttling" dvisvgm, but that
> dvisvgm is "flooding" Emacs with useless data, and cannot be told to
> shut up.

This is a fair take, but as it turns out from the new benchmarks, the
total preview time is the same for the two dvisvgm versions once
`process-adaptive-read-buffering' is set to nil.

> AFAICT, the data it outputs is completely useless in
> production use cases, and (with the possible exception of the last
> line, which shows the output file name) is basically debug-level info.

The filter function needs to read the output file name for the purposes
of updating LaTeX previews in the buffer.  The rest is superfluous to us
-- but it's possible that the sizing information is useful to other
applications.

> Did you try to take this problem up with the dvisvgm developers?

I was planning to, but with `process-adaptive-read-buffering' set to
nil, TeXLive 2023 is actually slightly faster than TeXLive 2022, so the
extra stdout makes no difference.

> And in any case, we are talking about 0.4 sec delay wrt the older
> version of TeXLive, right?  Is this really large enough to worry
> about?

The delay was about 40%, which could be 1+ second with lots of LaTeX
fragments, as in my original benchmark.  Moreover, even 200ms makes a
noticeable difference when

1. opening Org files that have the latexpreview startup option set,

2. live-previewing equations with a quick feedback loop: See
https://tinyurl.com/ms2ksthc

3. and when tweaking equation numbering etc, which can cause all
subsequent LaTeX fragments in the buffer to have their previews
regenerated.

>> 1.  Reduce the duration between successive calls of the filter function.
>> Is this configurable in Emacs?  I don't see anything relevant in the
>> manual sections on accepting output from processes or filter functions.
>
> This is not configurable for the simple reason that Emacs checks for
> subprocess output every time its main loop gets to that, so basically
> Emacs does that as fast as possible, assuming it is idle, i.e. no
> other command or timer runs.

Given the effect of `process-adaptive-read-buffering', I'm not sure
this is all there is to it: if Emacs always read subprocess output as
fast as possible, I wouldn't expect this variable to change how long
the preview generation run takes.  Am I correct?  The documentation for
this variable states:

"If non-nil, improve receive buffering by delaying after short reads."

----

This brings me to the question of what is going on.  My initial
assessment -- that dvisvgm is waiting on Emacs to clear the output
stream buffer -- is wrong, since changing `read-process-output-max'
doesn't change how long the run takes, or how much data the filter
function is called with each time.

1. So why is the run faster with `process-adaptive-read-buffering' set
to nil?  I understand that this reduces read latency, but not why the
process is so much faster overall.

2. What is the performance implication of setting
`process-adaptive-read-buffering' to nil?

The documentation for this variable mentions:

> On some systems, when Emacs reads the output from a subprocess, the
> output data is read in very small blocks, potentially resulting in very
> poor performance.

But I was able to use Emacs normally (typing, calling M-x etc) when the
previews were being updated, which is great.  So I'm not sure what the
performance implications are here.

Karthik


