unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
From: JD Smith <jdtsmith@gmail.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: emacs-devel@gnu.org, casouri@gmail.com
Subject: Re: Mitigating the long-lines penalty in vundo
Date: Sun, 17 Dec 2023 14:26:23 -0500	[thread overview]
Message-ID: <722361DD-7F6C-42EC-921A-348285895242@gmail.com> (raw)
In-Reply-To: <83a5q8k8i4.fsf@gnu.org>



> On Dec 17, 2023, at 1:07 PM, Eli Zaretskii <eliz@gnu.org> wrote:
> 
> I do see 'display' properties being used:
> 
>  (defun vundo--highlight-node (node)
>       <snip>
> (overlay-put vundo--highlight-overlay
>   'display (vundo--translate "●”)) ...

That’s just a single display property on the one character wide overlay used to indicate the current node.  The test showed faces and unicode are what lead to the significant current-column slowdown (though I wouldn’t be surprised if ‘display would make it yet worse!).

> Lisp programs are not supposed to need frequent column calculations,
> certainly not several times on the same line.

Maybe then it’s worth a word of warning in the current-column docs mentioning this in the context of faces, unicode, and long-line performance?

>> * current-column is quite slow counting unicode characters: about 21x slower than the ascii
>> equivalents.  Perhaps this is because unicode chars can have non-standard widths, so emacs
>> needs to do some additional checks?
> 
> No.  I can understand why non-ASCII characters are somewhat slower,
> since each one of them is several bytes, not just one, and we convert
> the multibyte representation into a codepoint on the fly.  But 21-fold
> slowdown sounds excessive.

Results may vary by system/font I suppose; perhaps others could run the simple test in my last message (no non-default packages required) and report back what slowdown factor they get.

>> * But this isn’t the full story, because current-column is also (separately) quite slow counting plain
>> characters with attached faces: 33x slower than the fast case. This one seems less intuitive to me.
>> Can faces also (in principle) alter the character width (like italic for example)?
> 
> When buffer text has faces (or any other text properties, really)
> current-column uses a slower algorithm, because the faster one will
> yield incorrect results.

I see.  I had noticed the slowdown does not depend on whether faces alternate or just a single face is used, so this makes sense.

>> * Combining these two (unicode chars with alternating faces) is the worst case — 43x slower than
>> the fast case of unadorned plain chars.  This was vundo’s case.
> 
> Then vundo is better redesigned to avoid such massive reliance on
> current-column.

Yes indeed, and it has been already!  The fix was quite trivial [1].  I was speaking more generally to current-column performance.  You could imagine algorithms that legitimately do need to keep a dynamically updating measure of the column.

> There's one more factor to consider here: current-column always starts
> from the beginning of the line.  If vundo needs to call current-column
> many times on the same line, it loses because it each time starts from
> BOL, in effect throwing away the results of the previous calculation.
> This wastes CPU cycles.  It might be better to use sum char-width
> values of individual characters instead.

Definitely; it’s clear how repetitive/wasteful this is.  I wonder, similar to line-numbers, could we cache current-column at various points along a line and apply a delta?  I.e. for computing line numbers differentially we have

(count-lines start end) 

which linum and others use to good effect.  Is there a similar

(count-columns start end)?  

This could be useful for algorithms that truly need frequent, up-to-date column info (not vundo, luckily).

Thanks for your help and analysis.

[1] https://github.com/casouri/vundo/pull/85




  reply	other threads:[~2023-12-17 19:26 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-16 16:19 Mitigating the long-lines penalty in vundo JD Smith
2023-12-16 16:45 ` Eli Zaretskii
2023-12-16 18:54   ` JD Smith
2023-12-16 19:13     ` Eli Zaretskii
2023-12-16 20:32       ` Stephen Berman
2023-12-16 22:00       ` JD Smith
2023-12-17  7:28         ` Eli Zaretskii
2023-12-17 17:49           ` JD Smith
2023-12-17 18:07             ` Eli Zaretskii
2023-12-17 19:26               ` JD Smith [this message]
2023-12-17 19:43                 ` Eli Zaretskii

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=722361DD-7F6C-42EC-921A-348285895242@gmail.com \
    --to=jdtsmith@gmail.com \
    --cc=casouri@gmail.com \
    --cc=eliz@gnu.org \
    --cc=emacs-devel@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).