all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: JD Smith <jdtsmith@gmail.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: emacs-devel@gnu.org, casouri@gmail.com
Subject: Re: Mitigating the long-lines penalty in vundo
Date: Sun, 17 Dec 2023 12:49:27 -0500	[thread overview]
Message-ID: <186F0A42-25E1-4A12-B6A1-F5B0F1465447@gmail.com> (raw)
In-Reply-To: <838r5tl23k.fsf@gnu.org>

[-- Attachment #1: Type: text/plain, Size: 6257 bytes --]



> On Dec 17, 2023, at 2:28 AM, Eli Zaretskii <eliz@gnu.org> wrote:
>> 
>> 
>> Looking closer, the only relevant thing in draw-tree is (1- (current-column)).  Adding that to my elp list
>> and... we have found the principal culprit (here with 10K edits):
>> 
>> vundo--draw-tree  1           9.65838       9.65838
>> current-column    10001       9.124152      0.0009123239
>> 
>> `current-column’ is apparently pretty expensive here (~1ms).
> 
> Yes, I think current-column is the culprit here.

Indeed it was, thanks for pointing us in the right direction.  Luckily there was no need to call current-column on every node, as vundo had been doing, but only in the rare case when a branch needed to bend around a preexisting one — an easy fix that improved launch speed by 25x or more. 

That said, with long lines (>10-20K columns, say) visible, the problem remains that overall Emacs is mildly but noticeably less responsive, in the mini-buffer, in other buffers, etc., until the vundo buffer is hidden.  Perhaps that’s just the minimum penalty paid for such long lines.

>> The current-column docs do say:
>> 
>> In a buffer with very long lines, the value will be an approximation,
>> because calculating the exact number is very expensive.
> 
> The "very long lines" case happens when the line's length exceeds the
> long-line-threshold, which in your case it most probably doesn't
> (unless you lower long-line-threshold considerably).  So this part of
> the doc string is not relevant to your problem.

I was only referring to the “very expensive” part. 

>> I just tested it in a temp buffer with width 10K, and it’s much less expensive (60-100µs).  Perhaps the
>> text properties on each node character (containing the defstruct node object) slows current-column
>> down even further.
> 
> Yes. The problem is that vundo.el uses 'display' properties _a_lot_,
> and those slow down current-column, because each time it sees a
> 'display' property whose value is a string, it needs to extract the
> property value and calculate the width of that string, to account for
> its width on display.

That would have indeed been a good explanation for the additional slowness, but in fact vundo does not use 'display properties:

	(next-single-property-change (point-min) ‘display) -> nil

It uses simple characters with faces, and on each “node” character it also attaches a special 'vundo-node text property pointing to the node data at point (a cl-defstruct type vundo-m).

> This is basically an algorithmic problem in vundo.el: since the width
> of each string that vundo--translate is fixed and can be computed in
> advance, vundo.el could count the columns itself (by adding those
> known widths), instead of asking current-column to measure the width
> of those strings over and over and over again.

Indeed this was a vundo problem in that there was no need to call current-column over and over for each node.  But the basic slowness of current-column remains, so unless there are low-hanging improvements to be made there, algorithms that need frequent column calculations would need to try other approaches.

> Alternatively, instead of using 'display' properties, vundo.el could
> insert the characters produced by vundo--translate directly into the
> buffer, which would allow current-column to work at its usual speed.

This is in fact what vundo already does.  The problem is rather that current-column slows down for some types of buffer content (see below).

>> So it is a long-lines problem of a sort, just not display related.
> 
> I disagree.  it's a problem with too many 'display' properties in the
> buffer, which is not recommended for more than one reason.  I suggest
> to rethink this design.

Since no 'display properties were used (other than in a single overlay for indicating the current node), something else is causing the current-column quadratic slowdown for long lines.  To investigate, I made a generic test of current-column speed on 18K long lines; see below.

What I conclude from this test is:

Current column is reasonably fast counting long lines of plain text — 175ms total to compute each column at 6000 nodes spread over 18K columns.
current-column is quite slow counting unicode characters: about 21x slower than the ascii equivalents.  Perhaps this is because unicode chars can have non-standard widths, so emacs needs to do some additional checks?
But this isn’t the full story, because current-column is also (separately) quite slow counting plain characters with attached faces: 33x slower than the fast case. This one seems less intuitive to me. Can faces also (in principle) alter the character width (like italic for example)?
Combining these two (unicode chars with alternating faces) is the worst case — 43x slower than the fast case of unadorned plain chars.  This was vundo’s case.
Conclusion: unless some improvements can be made, since current-column can get quite slow for long lines of many chars with faces (especially unicode chars), it is best to use it judiciously.

Thanks again for looking into this.

++++
(require 'elp)
(elp-instrument-function #'current-column)

(elp-reset-all)
(with-temp-buffer
  (dotimes (_ 6000)
    (insert "*")
    (insert "--")
    (current-column))
  (elp-results))
;; 1. Plain chars, no face -- average time: 2.906...e-05

(elp-reset-all)
(with-temp-buffer
  (dotimes (_ 6000)
    (insert "○")
    (insert "──")
    (current-column))
  (elp-results))
;; 2. Unicode chars, no face -- average time: 0.0006181615, 21.3x slower than 1

(elp-reset-all)
(with-temp-buffer
  (dotimes (_ 6000)
    (insert (propertize "*" 'face 'bold))
    (insert (propertize "--" 'face 'font-lock-regexp-face))
    (current-column))
  (elp-results))
;; 3. Plain chars, alternating faces -- average time: 0.0009658896, 33.2x slower than 1

(elp-reset-all)
(with-temp-buffer
  (dotimes (_ 6000)
    (insert (propertize "○" 'face 'bold))
    (insert (propertize "──" 'face 'font-lock-regexp-face))
    (current-column))
  (elp-results))
;; 4. Unicode chars, alternative faces -- average time: 0.0012507139, 43x slower than 1


[-- Attachment #2: Type: text/html, Size: 9440 bytes --]

  reply	other threads:[~2023-12-17 17:49 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-16 16:19 Mitigating the long-lines penalty in vundo JD Smith
2023-12-16 16:45 ` Eli Zaretskii
2023-12-16 18:54   ` JD Smith
2023-12-16 19:13     ` Eli Zaretskii
2023-12-16 20:32       ` Stephen Berman
2023-12-16 22:00       ` JD Smith
2023-12-17  7:28         ` Eli Zaretskii
2023-12-17 17:49           ` JD Smith [this message]
2023-12-17 18:07             ` Eli Zaretskii
2023-12-17 19:26               ` JD Smith
2023-12-17 19:43                 ` Eli Zaretskii

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=186F0A42-25E1-4A12-B6A1-F5B0F1465447@gmail.com \
    --to=jdtsmith@gmail.com \
    --cc=casouri@gmail.com \
    --cc=eliz@gnu.org \
    --cc=emacs-devel@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.