bug#63040: 30.0.50; Performance of buf_bytepos_to_charpos when a buffer has large number of markers


From: Ihor Radchenko <yantar92@posteo.net>
To: Eli Zaretskii <eliz@gnu.org>
Cc: 63040@debbugs.gnu.org, Stefan Monnier <monnier@iro.umontreal.ca>
Subject: bug#63040: 30.0.50; Performance of buf_bytepos_to_charpos when a buffer has large number of markers
Date: Mon, 24 Apr 2023 11:17:59 +0000
Message-ID: <87ildlttp4.fsf@localhost>
In-Reply-To: <83r0s9y22y.fsf@gnu.org>

Eli Zaretskii <eliz@gnu.org> writes:

>> Now, it is clearly not efficient enough for my large file.
>
> Why do you say that?  Did you try something and the results were
> unsatisfactory?  And what is not efficient enough -- the cutoff based
> on the number of markers tested or based on the distance?

Sorry for not being clear.

I was referring to the existing code.
"BYTECHAR_DISTANCE_INCREMENT" alone is clearly not efficient enough in my
use case, because introducing an additional cut-off of 50 improved the
performance significantly. Hence, there is some room for improvement in
this area.

>> Further, the later code creates markers to cache recent results and
>> cutting too early may waste this cache.
>
> And the technique that you tried doesn't waste the cache?

I was talking about my technique. It is wasting the cache, and it is the
reason why I think that we should find a better approach; not the one I
used in the patch.

>> Another idea could be moving the cache markers into a separate
>> array, so that we can examine them without mixing with all other
>> buffer markers.
>
> Why would that separation be useful?

Because the markers created by buf_bytepos_to_charpos are at least 5000
bytes apart. There is no such guarantee for other buffer markers.
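
To illustrate the idea of a separate array, here is a minimal sketch in
C, not the actual Emacs code. The names "pos_cache_entry", "pos_cache"
and "pos_cache_find_below" are hypothetical, and the sketch assumes the
cached positions are kept sorted by byte position. With such an array,
the nearest cached position below a target can be found by binary
search, without visiting unrelated buffer markers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cache entry: a known byte position and the matching
   character position.  Not an Emacs data structure.  */
struct pos_cache_entry
{
  ptrdiff_t bytepos;
  ptrdiff_t charpos;
};

/* Hypothetical cache: ENTRIES is assumed to be sorted by bytepos.  */
struct pos_cache
{
  const struct pos_cache_entry *entries;
  size_t count;
};

/* Return the index of the last entry whose bytepos is <= BYTEPOS,
   or -1 if there is no such entry.  Binary search, so the cost
   depends only on the number of cache entries.  */
ptrdiff_t
pos_cache_find_below (const struct pos_cache *cache, ptrdiff_t bytepos)
{
  assert (cache != NULL);
  size_t lo = 0, hi = cache->count;   /* search window is [lo, hi) */
  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      if (cache->entries[mid].bytepos <= bytepos)
        lo = mid + 1;
      else
        hi = mid;
    }
  return (ptrdiff_t) lo - 1;
}
```

If the entries really are at least 5000 bytes apart, the entry found
this way bounds the remaining scan distance, which is the guarantee
ordinary buffer markers cannot give.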

The while loops "while (best_below_byte < bytepos)" used as a fallback
(when no nearby marker is found) traverse the buffer char-by-char
("best_below_byte += buf_next_char_len (b, best_below_byte);") and
should be strictly inferior to a well-spaced marker list.

However, when the marker list is not well-spaced, looping over all the
buffer markers can be a waste. And it looks like I hit exactly such a
scenario in my setup.
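
For reference, the char-by-char fallback can be sketched roughly as
follows. This is a simplified stand-alone illustration, not the code
from marker.c: it assumes plain valid UTF-8 with at most 4 bytes per
character and 0-based positions, and "utf8_char_len" and
"scan_forward_to_charpos" are hypothetical names standing in for
buf_next_char_len and the while loop.

```c
#include <assert.h>
#include <stddef.h>

/* Length in bytes of the UTF-8 sequence whose lead byte is C.
   Simplified: assumes valid UTF-8, at most 4 bytes per character.  */
int
utf8_char_len (unsigned char c)
{
  if (c < 0x80)
    return 1;
  if (c < 0xE0)
    return 2;
  if (c < 0xF0)
    return 3;
  return 4;
}

/* Starting from a known pair (KNOWN_BYTE, KNOWN_CHAR) at or below
   BYTEPOS, step forward one character at a time until BYTEPOS is
   reached, and return the character position found there.
   BYTEPOS must lie on a character boundary.  */
ptrdiff_t
scan_forward_to_charpos (const unsigned char *text,
                         ptrdiff_t known_byte, ptrdiff_t known_char,
                         ptrdiff_t bytepos)
{
  assert (known_byte <= bytepos);
  while (known_byte < bytepos)
    {
      known_byte += utf8_char_len (text[known_byte]);
      known_char++;
    }
  return known_char;
}
```

The loop runs once per character between the starting point and the
target, so its cost grows with the distance to the nearest known
position. That is why the choice of starting point matters so much.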

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
