From: Eli Zaretskii <eliz@gnu.org>
To: Imran Khan <contact@imrankhan.live>
Cc: 48734@debbugs.gnu.org
Subject: bug#48734: 28.0.50; Performance regression in `string-width`?
Date: Sun, 30 May 2021 09:42:29 +0300
Message-ID: <83o8cs4t9m.fsf@gnu.org>
In-Reply-To: <87a6odmfp6.fsf@teknik.io> (message from Imran Khan on Sun, 30 May 2021 02:45:57 +0600)
> From: Imran Khan <contact@imrankhan.live>
> Date: Sun, 30 May 2021 02:45:57 +0600
>
> A package I use (deft-mode) has been hanging for minutes with high cpu
> use recently. Profiler says most time is spent in `string-width`, and
> upon looking it seems to happen in files that have multibyte characters
> in them.
>
> I reproduced the problem by creating a file that has both single and
> multi byte characters:
>
> with open("/tmp/test", "w") as f:
>     for i in range(50_000):
>         print("1", file=f, end="")
>         print("α", file=f, end="")
>
> And now:
>
> (benchmark-run 1
>   (let ((str))
>     (with-temp-buffer
>       (insert-file-contents-literally "/tmp/test")
>       (setq str (buffer-string)))
>     (string-width str)))
>
> This takes 20 seconds on my machine (oddly, if the string consists
> exclusively of single-byte or of multibyte characters, it seems to
> finish instantly).

Since you use insert-file-contents-literally, why don't you also make
the temporary buffer unibyte?  That is:

  (benchmark-run 1
    (let ((str))
      (with-temp-buffer
        (set-buffer-multibyte nil) ; <<<<<<<<<<<<<<<<<<<<<<<<<<<<<
        (insert-file-contents-literally "/tmp/test")
        (setq str (buffer-string)))
      (string-width str)))

Or maybe I don't understand your real-life use case? Because if you
treat the file as a raw bytestream, why do you need to compute the
width of its text?