Understanding filter function calls

all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed

* Understanding filter function calls
@ 2023-07-24  5:46 Karthik Chikmagalur
  2023-07-24 20:57 ` Jens Schmidt
  0 siblings, 1 reply; 14+ messages in thread
From: Karthik Chikmagalur @ 2023-07-24  5:46 UTC (permalink / raw)
  To: emacs-devel

I'm testing some code written for the upcoming Org 9.7 for previewing
LaTeX fragments.  Given a LaTeX environment (or many), the way the code
works is:

1. Gather fragments in the buffer

2. Create a TeX file containing this LaTeX

3. Run latex (TeX -> DVI)

4. Run dvisvgm in the LaTeX process sentinel
  (DVI -> SVG or series of SVGs)

5. Update in-buffer previews as SVGs are generated through the dvisvgm
process' filter function.

We use a filter function for the dvisvgm process to incrementally update
previews as this is much faster on larger runs than waiting for the
process sentinel to run.

This is different from how LaTeX preview generation has worked in Org
mode so far.  It's as asynchronous as can be, and somewhat similar to
how preview-latex (part of AucTeX works), if you're familiar with that.

The problem is that how long preview generation takes is significantly
different for different TeXLive versions (i.e. different LaTeX/dvisvgm
executables).  For example, LaTeX preview generation in an Org file with
~600 fragments takes:

|              | Preview generation time |
|--------------+-------------------------|
| TeXLive 2022 | 2.65 secs               |
| TeXLive 2023 | 4.03 secs               |

This is with identical code on the Emacs side of things.

This difference is NOT explainable as the newer versions of LaTeX or
dvisvgm taking longer.  When benchmarked individually on the same TeX
file -- and without Emacs in the picture -- latex 2022 and latex 2023
(as I'll call them here) take near identical times, as do dvisvgm 2022
and dvisvgm 2023.

|              | latex run       | dvisvgm run  |
|--------------+-----------------+--------------|
| TeXLive 2022 | 253.9 ± 10.6 ms | 1266 ± 41 ms |
| TeXLive 2023 | 258.9 ± 15.0 ms | 1298 ± 15 ms |

The stdout from latex and dvisvgm, which the sentinel/filter functions
parse, are near identical, and the SVG images are the same sizes.  I've
controlled every variable I could think to control.

- Same Org file.
- Same org-latex-preview customizations/settings.
- Same Emacs buffers open, etc.
- Run `garbage-collect' immediately before benchmarking.
- Same background system processes.

So why is the TeXLive 2023 run so much slower in Emacs?

After profiling with elp and generating a flamegraph, this is the result
(png image): https://abode.karthinks.com/share/olp-timing-chart.png

You can ignore the disproportionately long duration function calls,
this is related to GC, and one additional GC phase during the slower
(TeXLive 2023) case cannot explain the discrepancy.  Further, sometimes
there are more GC phases in the TeXLive 2022 run, but it's still
significantly faster.

The overall synchronous run times of the code in Emacs for TeXLive 2022
and 2023 are similar (0.77 vs 0.84 s). However the dvisvgm filter
function is called quite differently in the two cases.

|              | call count | total time | average time |
|--------------+------------+------------+--------------|
| TeXLive 2022 |         25 | 0.77 secs  | 31 ms        |
| TeXLive 2023 |         39 | 0.84 secs  | 22 ms        |

Even though the overall time spent in the filter function is about the
same, the TeXLive 2023 dvisvgm run

- calls the filter function 39 times instead of 25, where
- each run of the filter function processes less stdout,
- each run of the filter function takes less time to run,
- and crucially, these filter function calls are spaced slightly further
  apart in time

The net result of which is that the overall preview generation process
takes much longer to complete.

I've tested this multiple times over runs (Org/TeX files) of different
sizes, and the pattern is the same.

My questions are thus:

1. If the latex/dvisvgm executables from TeXLive 2022/2023 take about
the same time, and the stdout (that the filter function sees) is
identical, why is Emacs' filter function call behavior different?

2. Is there anything I can do to obtain time signature/behavior like
with TeXLive 2022 (see above link)?  I would really like to speed up
preview generation.

-Karthik

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Understanding filter function calls
  2023-07-24  5:46 Understanding filter function calls Karthik Chikmagalur
@ 2023-07-24 20:57 ` Jens Schmidt
  2023-07-27 21:08   ` Karthik Chikmagalur
  0 siblings, 1 reply; 14+ messages in thread
From: Jens Schmidt @ 2023-07-24 20:57 UTC (permalink / raw)
  To: Karthik Chikmagalur, emacs-devel

On 2023-07-24  07:46, Karthik Chikmagalur wrote:

> 1. Gather fragments in the buffer
> 
> 2. Create a TeX file containing this LaTeX
> 
> 3. Run latex (TeX -> DVI)
> 
> 4. Run dvisvgm in the LaTeX process sentinel (DVI -> SVG or series
> of SVGs)
> 
> 5. Update in-buffer previews as SVGs are generated through the 
> dvisvgm process' filter function.

Most of the above are unknown to me but I'll risk a guess anyway ...

> 1. If the latex/dvisvgm executables from TeXLive 2022/2023 take about
> the same time, and the stdout (that the filter function sees) is
> identical, why is Emacs' filter function call behavior different?

... seems that TeXLive 2023 creates differently sized chunks of output?
Could you try to buffer these using, for example, "dd", to see whether
that makes a difference?

More concretely, this snippet

   ( while :; do echo 1; sleep 0.5; done ) | dd iflag=fullblock bs=16

spits out 8 lines of output every 4 secs.



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Understanding filter function calls
  2023-07-24 20:57 ` Jens Schmidt
@ 2023-07-27 21:08   ` Karthik Chikmagalur
  2023-07-27 21:44     ` Karthik Chikmagalur
                       ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Karthik Chikmagalur @ 2023-07-27 21:08 UTC (permalink / raw)
  To: Jens Schmidt, emacs-devel

> ... seems that TeXLive 2023 creates differently sized chunks of output?
> Could you try to buffer these using, for example, "dd", to see whether
> that makes a difference?

Thank you for the suggestion.  The problem appears to be simpler than
varying chunk sizes.  I had a closer look at the stdout from the two
versions of dvisvgm and it turns out they are not, in fact, identical.

On a large Org file, TeXLive 2023's dvisvgm stdout is 2521 lines to
TeXLive 2022's 1520.  This difference is explainable by some extra
information that dvisvgm prints per generated SVG image in the former
case:

TeXLive 2023:
-------8<--------
processing page 2
  computing extents based on data set by preview package (version 13.1)
  width=5.520849pt, height=6.944447pt, depth=0pt
  graphic size: 5.520849pt x 6.944447pt (1.940357mm x 2.440694mm)
  output written to test-000000002.svg
-------8<--------

TeXLive 2022:
-------8<--------
processing page 2
  graphic size: 18.524351pt x 10pt (6.510565mm x 3.514598mm)
  output written to test-000000002.svg
-------8<--------

Multiply the above blocks by 600, and that explains the large difference
in the sizes of the stdout.

It does not take any longer for the filter function to run on each
invocation.  The problem is that The filter in Emacs is called with
about 4KB of new text every time, and these calls are spaced out evenly
in time, so Emacs takes longer to get around to parsing the stdout with
TeXLive 2023.

The dvisvgm process is thus bottlenecked by Emacs.  I can imagine a few
different ways to fix this, but I don't know how to do them:

1.  Reduce the duration between successive calls of the filter function.
Is this configurable in Emacs?  I don't see anything relevant in the
manual sections on accepting output from processes or filter functions.

2.  Enlarge the buffer or "pipe" connecting dvisvgm to Emacs.  This
stream buffer appears to be set to 4KB.  Since dvisvgm produces far more
output (to stdout) than this between two successive instances of Emacs
accepting process output, widening the pipe should relieve this
pressure.  I tried tweaking `read-process-output-max' but this doesn't
help.

3.  Get dvisvgm to generate less verbose output.  Unfortunately this is
not configurable at the level of granularity that we need.  We can't
turn it off completely either since we rely on certain strings in the
process stdout to update the LaTeX preview state in Org buffers.

Any ideas on how to avoid this throttling would be appreciated.

Karthik

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Understanding filter function calls
  2023-07-27 21:08   ` Karthik Chikmagalur
@ 2023-07-27 21:44     ` Karthik Chikmagalur
  2023-07-28  5:47       ` Eli Zaretskii
  2023-07-28  5:44     ` Eli Zaretskii
  2023-07-28  7:54     ` Ihor Radchenko
  2 siblings, 1 reply; 14+ messages in thread
From: Karthik Chikmagalur @ 2023-07-27 21:44 UTC (permalink / raw)
  To: Jens Schmidt, emacs-devel

> It does not take any longer for the filter function to run on each
> invocation.  The problem is that The filter in Emacs is called with
> about 4KB of new text every time, and these calls are spaced out evenly
> in time, so Emacs takes longer to get around to parsing the stdout with
> TeXLive 2023.
>
> The dvisvgm process is thus bottlenecked by Emacs.  I can imagine a few
> different ways to fix this, but I don't know how to do them:

I can confirm this after some benchmarking:

| dvisvgm version | Run in shell (s) | Run via Emacs (s) | Difference |
|-----------------+------------------+-------------------+------------|
| TeXLive 2022    |            1.427 |             1.786 |        25% |
| TeXLive 2023    |            1.481 |             2.202 |        48% |

The variances are small and have been omitted.  So TeXLive 2023's
dvisvgm is about 48% slower when called from Emacs because it's waiting
on Emacs to pick up the process output.  Using a larger stream buffer
(if possible) should fix this issue.

Karthik





^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Understanding filter function calls
  2023-07-27 21:08   ` Karthik Chikmagalur
  2023-07-27 21:44     ` Karthik Chikmagalur
@ 2023-07-28  5:44     ` Eli Zaretskii
  2023-07-28 21:42       ` Karthik Chikmagalur
  2023-07-28  7:54     ` Ihor Radchenko
  2 siblings, 1 reply; 14+ messages in thread
From: Eli Zaretskii @ 2023-07-28  5:44 UTC (permalink / raw)
  To: Karthik Chikmagalur; +Cc: jschmidt4gnu, emacs-devel

> From: Karthik Chikmagalur <karthikchikmagalur@gmail.com>
> Date: Thu, 27 Jul 2023 14:08:23 -0700
> 
> Multiply the above blocks by 600, and that explains the large difference
> in the sizes of the stdout.
> 
> It does not take any longer for the filter function to run on each
> invocation.  The problem is that The filter in Emacs is called with
> about 4KB of new text every time, and these calls are spaced out evenly
> in time, so Emacs takes longer to get around to parsing the stdout with
> TeXLive 2023.
> 
> The dvisvgm process is thus bottlenecked by Emacs.  I can imagine a few
> different ways to fix this, but I don't know how to do them:
> 
> 1.  Reduce the duration between successive calls of the filter function.
> Is this configurable in Emacs?  I don't see anything relevant in the
> manual sections on accepting output from processes or filter functions.

This is not configurable for the simple reason that Emacs checks for
subprocess output every time its main loop gets to that, so basically
Emacs does that as fast as possible, assuming it is idle, i.e. no
other command or timer runs.

> 2.  Enlarge the buffer or "pipe" connecting dvisvgm to Emacs.  This
> stream buffer appears to be set to 4KB.  Since dvisvgm produces far more
> output (to stdout) than this between two successive instances of Emacs
> accepting process output, widening the pipe should relieve this
> pressure.  I tried tweaking `read-process-output-max' but this doesn't
> help.

Which probably means that each time we get to check for subprocess
output, there's less than 4KB of stuff in the pipe ready to be read?
Did you look at the amount of bytes we read each time?  How many bytes
do we read, and does this number change if you change the value of
read-process-output-max?

Also, did you try setting process-adaptive-read-buffering to nil?

> 3.  Get dvisvgm to generate less verbose output.  Unfortunately this is
> not configurable at the level of granularity that we need.  We can't
> turn it off completely either since we rely on certain strings in the
> process stdout to update the LaTeX preview state in Org buffers.
> 
> Any ideas on how to avoid this throttling would be appreciated.

So once again a problem generated by an external program, which cannot
be configured by that program, is blamed on Emacs?  Is this fair?  My
take of this is not that Emacs is "throttling" dvisvgm, but that
dvisvgm is "flooding" Emacs with useless data, and cannot be told to
shut up.  AFAICT, the data it outputs is completely useless in
production use cases, and (with the possible exception of the last
line, which shows the output file name) is basically debug-level info.
Did you try to take this problem up with the dvisvgm developers?

And in any case, we are talking about 0.4 sec delay wrt the older
version of TeXLive, right?  Is this really large enough to worry
about?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Understanding filter function calls
  2023-07-27 21:44     ` Karthik Chikmagalur
@ 2023-07-28  5:47       ` Eli Zaretskii
  0 siblings, 0 replies; 14+ messages in thread
From: Eli Zaretskii @ 2023-07-28  5:47 UTC (permalink / raw)
  To: Karthik Chikmagalur; +Cc: jschmidt4gnu, emacs-devel

> From: Karthik Chikmagalur <karthikchikmagalur@gmail.com>
> Date: Thu, 27 Jul 2023 14:44:21 -0700
> 
> | dvisvgm version | Run in shell (s) | Run via Emacs (s) | Difference |
> |-----------------+------------------+-------------------+------------|
> | TeXLive 2022    |            1.427 |             1.786 |        25% |
> | TeXLive 2023    |            1.481 |             2.202 |        48% |
> 
> The variances are small and have been omitted.  So TeXLive 2023's
> dvisvgm is about 48% slower when called from Emacs because it's waiting
> on Emacs to pick up the process output.

I think it isn't that dvisvgm is waiting for Emacs, it's that the
whole process of reading output by Emacs and processing that output
takes longer.

> Using a larger stream buffer (if possible) should fix this issue.

You already tried that, AFAIU, and it didn't help.



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Understanding filter function calls
  2023-07-27 21:08   ` Karthik Chikmagalur
  2023-07-27 21:44     ` Karthik Chikmagalur
  2023-07-28  5:44     ` Eli Zaretskii
@ 2023-07-28  7:54     ` Ihor Radchenko
  2023-07-28 21:51       ` Karthik Chikmagalur
  2 siblings, 1 reply; 14+ messages in thread
From: Ihor Radchenko @ 2023-07-28  7:54 UTC (permalink / raw)
  To: Karthik Chikmagalur; +Cc: Jens Schmidt, emacs-devel

Karthik Chikmagalur <karthikchikmagalur@gmail.com> writes:

> 3.  Get dvisvgm to generate less verbose output.  Unfortunately this is
> not configurable at the level of granularity that we need.  We can't
> turn it off completely either since we rely on certain strings in the
> process stdout to update the LaTeX preview state in Org buffers.

Do you mean that

 -v, --verbosity=level
           Controls the type of messages printed during a dvisvgm run:
           0   no message output at all
           1   error messages only
           2   warning messages only
           4   informational messages only

               Note
               By adding these values you can combine the categories. The default level is 7, i.e. all messages are printed.

is not sufficient?

> Any ideas on how to avoid this throttling would be appreciated.

One possibility can be redirecting the strerr to file, waiting for the
dvisvgm process to finish, and then examining the file.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Understanding filter function calls
  2023-07-28  5:44     ` Eli Zaretskii
@ 2023-07-28 21:42       ` Karthik Chikmagalur
  2023-07-29  6:02         ` Eli Zaretskii
  0 siblings, 1 reply; 14+ messages in thread
From: Karthik Chikmagalur @ 2023-07-28 21:42 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: jschmidt4gnu, emacs-devel

Thank you for the suggestions -- this problem is resolved! (See below)

But I still have a few questions about Emacs' async subprocess API.
(Included at the end)

> I think it isn't that dvisvgm is waiting for Emacs, it's that the
> whole process of reading output by Emacs and processing that output
> takes longer.

This is what I assumed at first too, but I made the filter function
a noop and still observed this behavior.

>> Using a larger stream buffer (if possible) should fix this issue.
> 
> You already tried that, AFAIU, and it didn't help.

OK, I didn't realize this is what `read-process-output-max' sets.  It
makes sense now.

>> 2.  Enlarge the buffer or "pipe" connecting dvisvgm to Emacs.  This
>> stream buffer appears to be set to 4KB.  Since dvisvgm produces far more
>> output (to stdout) than this between two successive instances of Emacs
>> accepting process output, widening the pipe should relieve this
>> pressure.  I tried tweaking `read-process-output-max' but this doesn't
>> help.
>
> Which probably means that each time we get to check for subprocess
> output, there's less than 4KB of stuff in the pipe ready to be read?
> Did you look at the amount of bytes we read each time?  How many bytes
> do we read, and does this number change if you change the value of
> read-process-output-max?

In the following description,
- `process-adaptive-read-buffering' is set to t
- `read-process-output-max' is set to 65,536

1.  I logged the length of the string that is passed to the filter
function on each invocation.  On the first 3-4 calls, the length is
variable, from 1 to 2000.  In the remaining 30 calls, the length is
almost always 4095. 

2. There are rare exceptions to this, when the length jumps up to
20,000.  This is usually for a single filter call out of ~35.

3. There is no change to the above behavior when I change
`read-process-output-max', although I didn't set it below 4096.

> Also, did you try setting process-adaptive-read-buffering to nil?

Setting it to nil essentially fixes the problem!  The filter function is
now called 80+ times instead of 35 times, the string it's called with
each time is of variable length, generally under 500 characters, but
overall the process is much, much faster.

Total preview time:
| process-adaptive-read-buffering | t         | nil       |
|---------------------------------+-----------+-----------|
| TeXLive 2022                    | 2.65 secs | 1.88 secs |
| TeXLive 2023                    | 4.03 secs | 1.80 secs |

We get a huge speed-up, up to 120%.

>> 3.  Get dvisvgm to generate less verbose output.  Unfortunately this is
>> not configurable at the level of granularity that we need.  We can't
>> turn it off completely either since we rely on certain strings in the
>> process stdout to update the LaTeX preview state in Org buffers.
>> 
>> Any ideas on how to avoid this throttling would be appreciated.
>
> So once again a problem generated by an external program, which cannot
> be configured by that program, is blamed on Emacs?  Is this fair?

Eli, I apologize for giving the impression that I'm blaming this on
Emacs.  Your phrasing ("once again") suggests that there's more to this
than I'm aware of.  My only goal here is to provide the most responsive
user experience possible when writing LaTeX math in Org mode, under the
constraint that this needs to be done by gluing together two (actually
three) processes.  I care about where the bottleneck is -- or was --
only so that I may address it.

> My take of this is not that Emacs is "throttling" dvisvgm, but that
> dvisvgm is "flooding" Emacs with useless data, and cannot be told to
> shut up.

This is a fair take, but as it turns out from the new benchmarks, the
total preview time is the same for the two dvisvgm versions once
`process-adaptive-read-buffering' is set to nil.

> AFAICT, the data it outputs is completely useless in
> production use cases, and (with the possible exception of the last
> line, which shows the output file name) is basically debug-level info.

The filter function needs to read the output file name for the purposes
of updating LaTeX previews in the buffer.  The rest is superfluous to us
-- but it's possible that the sizing information is useful to other
applications.

> Did you try to take this problem up with the dvisvgm developers?

I was planning to, but it looks like TeXLive 2023 is actually slightly
faster now, so the extra stdout makes no difference.

> And in any case, we are talking about 0.4 sec delay wrt the older
> version of TeXLive, right?  Is this really large enough to worry
> about?

The delay was about 40%, which could be 1+ second with lots of LaTeX
fragments, as in my original benchmark.  Moreover, even 200ms makes a
noticeable difference when

1. opening Org files that have the latexpreview startup option set,

2. live-previewing equations with a quick feedback loop: See
https://tinyurl.com/ms2ksthc

3. and when tweaking equation numbering etc, which can cause all
subsequent LaTeX fragments in the buffer to have their previews
regenerated.

>> 1.  Reduce the duration between successive calls of the filter function.
>> Is this configurable in Emacs?  I don't see anything relevant in the
>> manual sections on accepting output from processes or filter functions.
>
> This is not configurable for the simple reason that Emacs checks for
> subprocess output every time its main loop gets to that, so basically
> Emacs does that as fast as possible, assuming it is idle, i.e. no
> other command or timer runs.

Because of `process-adaptive-read-buffering', I'm not sure this is all
there is to it, because I wouldn't expect a difference in how long the
preview generation run takes otherwise.  Am I correct?  The
documentation for this variable states:

"If non-nil, improve receive buffering by delaying after short reads."

----

This brings me to the question of what is going on.  My initial
assessment -- that dvisvgm is waiting on Emacs to clear the output
stream buffer -- is wrong, since changing `read-process-output-max'
doesn't change how long the run takes, or how much data the filter
function is called with each time.

1. So why is the run faster with `process-adaptive-read-buffering' set
to nil?  I understand that this reduces read latency, but not why the
process is so much faster overall.

2. What is the performance implication of setting
`process-adaptive-read-buffering' to nil?

The documentation for this variable mentions:

> On some systems, when Emacs reads the output from a subprocess, the
> output data is read in very small blocks, potentially resulting in very
> poor performance.

But I was able to use Emacs normally (typing, calling M-x etc) when the
previews were being updated, which is great.  So I'm not sure what the
performance implications are here.

Karthik

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Understanding filter function calls
  2023-07-28  7:54     ` Ihor Radchenko
@ 2023-07-28 21:51       ` Karthik Chikmagalur
  2023-07-29  6:04         ` Eli Zaretskii
  0 siblings, 1 reply; 14+ messages in thread
From: Karthik Chikmagalur @ 2023-07-28 21:51 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Jens Schmidt, emacs-devel

> Do you mean that
>
>  -v, --verbosity=level
>            Controls the type of messages printed during a dvisvgm run:
>            0   no message output at all
>            1   error messages only
>            2   warning messages only
>            4   informational messages only
>
>                Note By adding these values you can combine the
>                categories. The default level is 7, i.e. all messages
>                are printed.
>
> is not sufficient?

Yes.  The text in the example stdout(s) I included are all generated by
4, "informational messages only".

>> Any ideas on how to avoid this throttling would be appreciated.
>
> One possibility can be redirecting the strerr to file, waiting for the
> dvisvgm process to finish, and then examining the file.

This would be the same as shifting the Emacs-side processing to the
dvisvgm process sentinel instead of the process filter.  We want to
populate the in-buffer previews as they are generated because this is a
much more responsive experience.  Fragments visible in the window are
previewed first, so the preview run appears close to instantaneous even
if there are thousands of LaTeX fragments in the queue.

Karthik

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Understanding filter function calls
  2023-07-28 21:42       ` Karthik Chikmagalur
@ 2023-07-29  6:02         ` Eli Zaretskii
  2023-07-29 22:16           ` Karthik Chikmagalur
  0 siblings, 1 reply; 14+ messages in thread
From: Eli Zaretskii @ 2023-07-29  6:02 UTC (permalink / raw)
  To: Karthik Chikmagalur; +Cc: jschmidt4gnu, emacs-devel

> From: Karthik Chikmagalur <karthikchikmagalur@gmail.com>
> Cc: jschmidt4gnu@vodafonemail.de, emacs-devel@gnu.org
> Date: Fri, 28 Jul 2023 14:42:47 -0700
> 
> > I think it isn't that dvisvgm is waiting for Emacs, it's that the
> > whole process of reading output by Emacs and processing that output
> > takes longer.
> 
> This is what I assumed at first too, but I made the filter function
> a noop and still observed this behavior.

Setting the filter function to a no-op doesn't prevent Emacs from
processing the subprocess output.  Emacs still has to read all of the
output, and with the later TeXLive that output is larger, so it needs
more time to read it.

> >> 2.  Enlarge the buffer or "pipe" connecting dvisvgm to Emacs.  This
> >> stream buffer appears to be set to 4KB.  Since dvisvgm produces far more
> >> output (to stdout) than this between two successive instances of Emacs
> >> accepting process output, widening the pipe should relieve this
> >> pressure.  I tried tweaking `read-process-output-max' but this doesn't
> >> help.
> >
> > Which probably means that each time we get to check for subprocess
> > output, there's less than 4KB of stuff in the pipe ready to be read?
> > Did you look at the amount of bytes we read each time?  How many bytes
> > do we read, and does this number change if you change the value of
> > read-process-output-max?
> 
> In the following description,
> - `process-adaptive-read-buffering' is set to t
> - `read-process-output-max' is set to 65,536
> 
> 1.  I logged the length of the string that is passed to the filter
> function on each invocation.  On the first 3-4 calls, the length is
> variable, from 1 to 2000.  In the remaining 30 calls, the length is
> almost always 4095. 
> 
> 2. There are rare exceptions to this, when the length jumps up to
> 20,000.  This is usually for a single filter call out of ~35.
> 
> 3. There is no change to the above behavior when I change
> `read-process-output-max', although I didn't set it below 4096.

Is this with process-connection-type set to nil or non-nil?  If you
didn't try setting it to nil, please do, and see if the behavior
changes.

> > Also, did you try setting process-adaptive-read-buffering to nil?
> 
> Setting it to nil essentially fixes the problem!  The filter function is
> now called 80+ times instead of 35 times, the string it's called with
> each time is of variable length, generally under 500 characters, but
> overall the process is much, much faster.
> 
> Total preview time:
> | process-adaptive-read-buffering | t         | nil       |
> |---------------------------------+-----------+-----------|
> | TeXLive 2022                    | 2.65 secs | 1.88 secs |
> | TeXLive 2023                    | 4.03 secs | 1.80 secs |
> 
> We get a huge speed-up, up to 120%.

Then I guess you need to bind process-adaptive-read-buffering to nil
when you perform this processing.  (But please also see what happens
if process-connection-type is set to nil, thus using pipes instead of
PTY to read subprocess output: it could change the picture.)

> > So once again a problem generated by an external program, which cannot
> > be configured by that program, is blamed on Emacs?  Is this fair?
> 
> Eli, I apologize for giving the impression that I'm blaming this on
> Emacs.  Your phrasing ("once again") suggests that there's more to this
> than I'm aware of.

Emacs is very configurable, so people tend to expect us to solve
problems we didn't cause.

> My only goal here is to provide the most responsive user experience
> possible when writing LaTeX math in Org mode, under the constraint
> that this needs to be done by gluing together two (actually three)
> processes.  I care about where the bottleneck is -- or was -- only
> so that I may address it.

I understand, but clearly the change responsible for this was in
dvisvgm, not in Emacs, so expecting Emacs to rectify it is not really
fair.

> > My take of this is not that Emacs is "throttling" dvisvgm, but that
> > dvisvgm is "flooding" Emacs with useless data, and cannot be told to
> > shut up.
> 
> This is a fair take, but as it turns out from the new benchmarks, the
> total preview time is the same for the two dvisvgm versions once
> `process-adaptive-read-buffering' is set to nil.

So you were lucky to find a solution by tweaking Emacs this time.
That's good for you, but not every problem with some external program
can be fixed like that.

> > AFAICT, the data it outputs is completely useless in
> > production use cases, and (with the possible exception of the last
> > line, which shows the output file name) is basically debug-level info.
> 
> The filter function needs to read the output file name for the purposes
> of updating LaTeX previews in the buffer.  The rest is superfluous to us
> -- but it's possible that the sizing information is useful to other
> applications.

Then maybe ask the dvisvgm developers to provide the level of
verbosity that only shows the file name.  Emacs in general and Org in
particular are important applications, and so it would be reasonable
for the dvisvgm developers to cater to our needs, not only to the
needs of other programs.

> > Did you try to take this problem up with the dvisvgm developers?
> 
> I was planning to, but it looks like TeXLive 2023 is actually slightly
> faster now, so the extra stdout makes no difference.

Well, it definitely makes the difference to Emacs!  Other Emacs
features which may need to use dvisvgm might not be able to use the
solution you found, for whatever reasons, so being able to make
dvisvgm be as silent as Emacs needs is a Good Thing.

> > And in any case, we are talking about 0.4 sec delay wrt the older
> > version of TeXLive, right?  Is this really large enough to worry
> > about?
> 
> The delay was about 40%, which could be 1+ second with lots of LaTeX
> fragments, as in my original benchmark.  Moreover, even 200ms makes a
> noticeable difference when
> 
> 1. opening Org files that have the latexpreview startup option set,
> 
> 2. live-previewing equations with a quick feedback loop: See
> https://tinyurl.com/ms2ksthc
> 
> 3. and when tweaking equation numbering etc, which can cause all
> subsequent LaTeX fragments in the buffer to have their previews
> regenerated.

I very much doubt that 0.2 sec makes a difference when the overall
time is about 4 sec.

> >> 1.  Reduce the duration between successive calls of the filter function.
> >> Is this configurable in Emacs?  I don't see anything relevant in the
> >> manual sections on accepting output from processes or filter functions.
> >
> > This is not configurable for the simple reason that Emacs checks for
> > subprocess output every time its main loop gets to that, so basically
> > Emacs does that as fast as possible, assuming it is idle, i.e. no
> > other command or timer runs.
> 
> Because of `process-adaptive-read-buffering', I'm not sure this is all
> there is to it, because I wouldn't expect a difference in how long the
> preview generation run takes otherwise.  Am I correct?  The
> documentation for this variable states:
> 
> "If non-nil, improve receive buffering by delaying after short reads."

I didn't assume dvisvgm will start by sending very small chunks of
text, so I didn't think the delays introduced by
process-adaptive-read-buffering would matter in your case.  In any
case, Emacs _checks_ for subprocess output as frequently as possible,
it just sometimes refrains from reading it (and thus refrains from
calling the filter), if the adaptive-reading feature requires that.

> 1. So why is the run faster with `process-adaptive-read-buffering' set
> to nil?  I understand that this reduces read latency, but not why the
> process is so much faster overall.

Because we don't delay reading subprocess output, but instead process
it as it comes in, regardless of the size.

> 2. What is the performance implication of setting
> `process-adaptive-read-buffering' to nil?

Our long-time experience is that the default non-nil value is better
overall.  So my suggestion is to change its value if you have a use
case where that is beneficial, but otherwise leave the default alone.

> The documentation for this variable mentions:
> 
> > On some systems, when Emacs reads the output from a subprocess, the
> > output data is read in very small blocks, potentially resulting in very
> > poor performance.
> 
> But I was able to use Emacs normally (typing, calling M-x etc) when the
> previews were being updated, which is great.  So I'm not sure what the
> performance implications are here.

You don't want to rewind the long history of this and re-live all the
issues we experienced and investigated before we settled on the
current default.  If you can solve this particular problem by binding
the variable to nil around the code which invokes dvisvgm, that is all
I suggest doing.



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Understanding filter function calls
  2023-07-28 21:51       ` Karthik Chikmagalur
@ 2023-07-29  6:04         ` Eli Zaretskii
  0 siblings, 0 replies; 14+ messages in thread
From: Eli Zaretskii @ 2023-07-29  6:04 UTC (permalink / raw)
  To: Karthik Chikmagalur; +Cc: yantar92, jschmidt4gnu, emacs-devel

> From: Karthik Chikmagalur <karthikchikmagalur@gmail.com>
> Cc: Jens Schmidt <jschmidt4gnu@vodafonemail.de>, emacs-devel@gnu.org
> Date: Fri, 28 Jul 2023 14:51:00 -0700
> 
> > Do you mean that
> >
> >  -v, --verbosity=level
> >            Controls the type of messages printed during a dvisvgm run:
> >            0   no message output at all
> >            1   error messages only
> >            2   warning messages only
> >            4   informational messages only
> >
> >                Note By adding these values you can combine the
> >                categories. The default level is 7, i.e. all messages
> >                are printed.
> >
> > is not sufficient?
> 
> Yes.  The text in the example stdout(s) I included are all generated by
> 4, "informational messages only".

Then I guess another level, which will only output the file name, is
in order.



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Understanding filter function calls
  2023-07-29  6:02         ` Eli Zaretskii
@ 2023-07-29 22:16           ` Karthik Chikmagalur
  2023-07-30  5:14             ` Eli Zaretskii
  0 siblings, 1 reply; 14+ messages in thread
From: Karthik Chikmagalur @ 2023-07-29 22:16 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: jschmidt4gnu, emacs-devel

>> Setting it to nil essentially fixes the problem!  The filter function is
>> now called 80+ times instead of 35 times, the string it's called with
>> each time is of variable length, generally under 500 characters, but
>> overall the process is much, much faster.
>> 
>> Total preview time:
>> | process-adaptive-read-buffering | t         | nil       |
>> |---------------------------------+-----------+-----------|
>> | TeXLive 2022                    | 2.65 secs | 1.88 secs |
>> | TeXLive 2023                    | 4.03 secs | 1.80 secs |
>> 
>> We get a huge speed-up, up to 120%.
>
> Then I guess you need to bind process-adaptive-read-buffering to nil
> when you perform this processing.  (But please also see what happens
> if process-connection-type is set to nil, thus using pipes instead of
> PTY to read subprocess output: it could change the picture.)

I've tested out all combinations of pty/pipe and adaptive-read/no
adaptive read now, here is a full picture of what the filter calls look
like: https://i.imgur.com/MkszXbT.png

(mirror for future reference if the above link is dead:
https://tinyurl.com/yvwwkbc9)

Each block is a call to the filter function.  
- The horizontal position is when in the timeline it is called.
- The height of each block is the amount of data it is called with.
- The width of each block is how long the filter takes to run.

The abnormally wide blocks might involve garbage collection, I don't
know how to account for that.  The largest time gaps between filter
function calls are also probably because of GC, because I turned off as
many timers as I could and there was no user input during the tests.

Here's what I've discerned from this picture:

1. A pty connection + adaptive-read is the slowest (by a fair bit).  All
other combinations are equally fast.

2. Despite the value of read-process-output-max, a pty connection calls
the filter function with at most 4 KB of data most of the time.

3. With a pipe connection, we indeed see reads of up to
read-process-output-max (64 KB in my tests).

4. Which combination is best depends on what we value more: Emacs'
degree of responsiveness during the generation or the overall preview
generation time.
   - pty + adaptive read is the slowest but Emacs remains fully
     responsive.
   - A pipe connection + adaptive-read is fast but the least responsive
     because of the large reads + associated processing.
   - Setting process-adaptive-read-buffering to nil is a reasonable
     balance of speed and responsiveness.

>> My only goal here is to provide the most responsive user experience
>> possible when writing LaTeX math in Org mode, under the constraint
>> that this needs to be done by gluing together two (actually three)
>> processes.  I care about where the bottleneck is -- or was -- only
>> so that I may address it.
>
> I understand, but clearly the change responsible for this was in
> dvisvgm, not in Emacs, so expecting Emacs to rectify it is not really
> fair.

On reflection what I said isn't correct -- this was not my only goal.  I
also wanted to understand some of the intricacies of subprocess
communication in Emacs, which your comments have helped me with.  My
prior knowledge of it was all theoretical and limited to reading the
Elisp manual sections on the Command Loop and "Receiving Output From
Processes".

>> The filter function needs to read the output file name for the purposes
>> of updating LaTeX previews in the buffer.  The rest is superfluous to us
>> -- but it's possible that the sizing information is useful to other
>> applications.
>
> Then maybe ask the dvisvgm developers to provide the level of
> verbosity that only shows the file name.  Emacs in general and Org in
> particular are important applications, and so it would be reasonable
> for the dvisvgm developers to cater to our needs, not only to the
> needs of other programs.

Yes, this is on the docket for me.

> I very much doubt that 0.2 sec makes a difference when the overall
> time is about 4 sec.

The difference in preview generation time was about 1.4 seconds (2.6 ->
4.0), so I'm not sure why you mentioned 0.4.

>> Because of `process-adaptive-read-buffering', I'm not sure this is all
>> there is to it, because I wouldn't expect a difference in how long the
>> preview generation run takes otherwise.  Am I correct?  The
>> documentation for this variable states:
>> 
>> "If non-nil, improve receive buffering by delaying after short reads."
>
> I didn't assume dvisvgm will start by sending very small chunks of
> text, so I didn't think the delays introduced by
> process-adaptive-read-buffering would matter in your case.

Indeed, dvisvgm's behavior misled me.  We can see the small initial
chunks as the early tiny spikes in the charts at the above link.

> In any case, Emacs _checks_ for subprocess output as frequently as
> possible, it just sometimes refrains from reading it (and thus
> refrains from calling the filter), if the adaptive-reading feature
> requires that.

This is good to know for the future.

>> 1. So why is the run faster with `process-adaptive-read-buffering' set
>> to nil?  I understand that this reduces read latency, but not why the
>> process is so much faster overall.
>
> Because we don't delay reading subprocess output, but instead process
> it as it comes in, regardless of the size.

This makes sense, thank you for clarifying.

>> But I was able to use Emacs normally (typing, calling M-x etc) when the
>> previews were being updated, which is great.  So I'm not sure what the
>> performance implications are here.
>
> You don't want to rewind the long history of this and re-live all the
> issues we experienced and investigated before we settled on the
> current default.  If you can solve this particular problem by binding
> the variable to nil around the code which invokes dvisvgm, that is all
> I suggest doing.

I agree that it's not important, I was just curious since I didn't
notice any immediate issues.  I'll search the archives for prior
dicussions about this when I have more time.  For now I'm just going to
remember to check process-adaptive-read-buffering when I care about
latency.

Karthik



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Understanding filter function calls
  2023-07-29 22:16           ` Karthik Chikmagalur
@ 2023-07-30  5:14             ` Eli Zaretskii
  0 siblings, 0 replies; 14+ messages in thread
From: Eli Zaretskii @ 2023-07-30  5:14 UTC (permalink / raw)
  To: Karthik Chikmagalur; +Cc: jschmidt4gnu, emacs-devel

> From: Karthik Chikmagalur <karthikchikmagalur@gmail.com>
> Cc: jschmidt4gnu@vodafonemail.de, emacs-devel@gnu.org
> Date: Sat, 29 Jul 2023 15:16:49 -0700
> 
> 1. A pty connection + adaptive-read is the slowest (by a fair bit).  All
> other combinations are equally fast.

It is a general recommendation to use pipes when the subprocess
produces large amounts of output, and Emacs doesn't need to emulate
interactive behavior vis-a-vis that subprocess' program.  PTYs are
less pertinent for high-throughput applications.

> 2. Despite the value of read-process-output-max, a pty connection calls
> the filter function with at most 4 KB of data most of the time.

I think this is a limitation of the Linux kernel's implementation of
PTYs.

> 3. With a pipe connection, we indeed see reads of up to
> read-process-output-max (64 KB in my tests).

FTR: if someone has to do something similar on MS-Windows, they will
need to enlarge the value of w32-pipe-buffer-size accordingly.

> >> But I was able to use Emacs normally (typing, calling M-x etc) when the
> >> previews were being updated, which is great.  So I'm not sure what the
> >> performance implications are here.
> >
> > You don't want to rewind the long history of this and re-live all the
> > issues we experienced and investigated before we settled on the
> > current default.  If you can solve this particular problem by binding
> > the variable to nil around the code which invokes dvisvgm, that is all
> > I suggest doing.
> 
> I agree that it's not important, I was just curious since I didn't
> notice any immediate issues.

the issues are not always immediate, and don't became apparent until
you see enough subprocesses of different kinds, throughputs, and
behavioral aspects.  In the long history of this variable, its default
changed at least twice, AFAIR.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Understanding filter function calls
@ 2024-04-18  3:52 Karthik Chikmagalur
  0 siblings, 0 replies; 14+ messages in thread
From: Karthik Chikmagalur @ 2024-04-18  3:52 UTC (permalink / raw)
  To: eliz, emacs-devel

> > > AFAICT, the data it outputs is completely useless in
> > > production use cases, and (with the possible exception of the last
> > > line, which shows the output file name) is basically debug-level info.
> > 
> > The filter function needs to read the output file name for the purposes
> > of updating LaTeX previews in the buffer.  The rest is superfluous to us
> > -- but it's possible that the sizing information is useful to other
> > applications.
>  
> Then maybe ask the dvisvgm developers to provide the level of
> verbosity that only shows the file name.  Emacs in general and Org in
> particular are important applications, and so it would be reasonable
> for the dvisvgm developers to cater to our needs, not only to the
> needs of other programs.

An update from the future: we asked the dvisvgm maintainer to provide a
customizable stdout message option and it was added in dvisvgm 3.2.
This has made the Emacs side of things great.  The filter function
is no longer the bottleneck, dvisvgm itself is.

Karthik



^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2024-04-18  3:52 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-24  5:46 Understanding filter function calls Karthik Chikmagalur
2023-07-24 20:57 ` Jens Schmidt
2023-07-27 21:08   ` Karthik Chikmagalur
2023-07-27 21:44     ` Karthik Chikmagalur
2023-07-28  5:47       ` Eli Zaretskii
2023-07-28  5:44     ` Eli Zaretskii
2023-07-28 21:42       ` Karthik Chikmagalur
2023-07-29  6:02         ` Eli Zaretskii
2023-07-29 22:16           ` Karthik Chikmagalur
2023-07-30  5:14             ` Eli Zaretskii
2023-07-28  7:54     ` Ihor Radchenko
2023-07-28 21:51       ` Karthik Chikmagalur
2023-07-29  6:04         ` Eli Zaretskii
  -- strict thread matches above, loose matches on Subject: below --
2024-04-18  3:52 Karthik Chikmagalur

Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.