From: Ihor Radchenko <yantar92@posteo.net>
To: Eli Zaretskii <eliz@gnu.org>
Cc: 63040@debbugs.gnu.org, Stefan Monnier <monnier@iro.umontreal.ca>
Subject: bug#63040: 30.0.50; Performance of buf_bytepos_to_charpos when a buffer has large number of markers
Date: Mon, 24 Apr 2023 11:17:59 +0000 [thread overview]
Message-ID: <87ildlttp4.fsf@localhost> (raw)
In-Reply-To: <83r0s9y22y.fsf@gnu.org>
Eli Zaretskii <eliz@gnu.org> writes:
>> Now, it is clearly not efficient enough for my large file.
>
> Why do you say that? Did you try something and the results were
> unsatisfactory? And what is not efficient enough -- the cutoff based
> on the number of markers tested or based on the distance?
Sorry for not being clear.
I was referring to the existing code.
"BYTECHAR_DISTANCE_INCREMENT" alone is clearly not efficient in my use
case because introducing an addition 50 cut-off improved the performance
significantly. Hence, there is some room for improvement in this area.
>> Further, the later code creates markers to cache recent results and
>> cutting too early may waste this cache.
>
> And the technique that you tried doesn't waste the cache?
I was talking about my technique. It is wasting the cache, and it is the
reason why I think that we should find a better approach; not the one I
used in the patch.
>> Another idea could be moving the cache markers into a separate
>> array, so that we can examine them without mixing with all other
>> buffer markers.
>
> Why would that separation be useful?
Because the markers created by buf_bytepos_to_charpos are at least 5000
bytes apart. There is no such guarantee for other buffer markers.
The while loops "while (best_below_byte < bytepos)" used as fallback
(when no nearby marker is found) traverse the buffer char-by-char
"("best_below_byte += buf_next_char_len (b, best_below_byte);" and
should be strictly inferior compared to well-spaced marker list.
However, when the marker list is not well-spaced, looping over all the
buffer markers can be a waste. And it looks like I hit exactly such
scenario in my setup.
--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
next prev parent reply other threads:[~2023-04-24 11:17 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-04-23 19:41 bug#63040: 30.0.50; Performance of buf_bytepos_to_charpos when a buffer has large number of markers Ihor Radchenko
2023-04-24 2:24 ` Eli Zaretskii
2023-04-24 6:36 ` Ihor Radchenko
2023-04-24 11:02 ` Eli Zaretskii
2023-04-24 11:03 ` Eli Zaretskii
2023-04-24 11:17 ` Ihor Radchenko [this message]
2024-06-25 21:04 ` Stefan Monnier
2024-06-26 12:47 ` Ihor Radchenko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://www.gnu.org/software/emacs/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87ildlttp4.fsf@localhost \
--to=yantar92@posteo.net \
--cc=63040@debbugs.gnu.org \
--cc=eliz@gnu.org \
--cc=monnier@iro.umontreal.ca \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/emacs.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).