unofficial mirror of bug-gnu-emacs@gnu.org 
From: Eli Zaretskii <eliz@gnu.org>
To: Imran Khan <contact@imrankhan.live>
Cc: 48734@debbugs.gnu.org
Subject: bug#48734: 28.0.50; Performance regression in `string-width`?
Date: Sun, 30 May 2021 09:42:29 +0300
Message-ID: <83o8cs4t9m.fsf@gnu.org>
In-Reply-To: <87a6odmfp6.fsf@teknik.io> (message from Imran Khan on Sun, 30 May 2021 02:45:57 +0600)

> From: Imran Khan <contact@imrankhan.live>
> Date: Sun, 30 May 2021 02:45:57 +0600
> 
> A package I use (deft-mode) has been hanging for minutes with high cpu
> use recently. Profiler says most time is spent in `string-width`, and
> upon looking it seems to happen in files that have multibyte characters
> in them.
> 
> I reproduced the problem by creating a file that has both single and
> multi byte characters:
> 
> with open("/tmp/test", "w") as f:
>     for i in range(50_000):
>         print("1", file=f, end="")
>     print("α", file=f, end="")
> 
> And now:
> 
> (benchmark-run 1
>   (let ((str))
>     (with-temp-buffer
>       (insert-file-contents-literally "/tmp/test")
>       (setq str (buffer-string)))
>     (string-width str)))
> 
> This takes 20 seconds on my machine (weirdly, if the string consists
> exclusively of either single-byte or multibyte characters, it seems to
> finish instantly).

Since you use insert-file-contents-literally, why don't you also make
the temporary buffer unibyte?  That is:

  (benchmark-run 1
    (let ((str))
      (with-temp-buffer
        (set-buffer-multibyte nil)  ; <<<<<<<<<<<<<<<<<<<<<<<<<<<<<
        (insert-file-contents-literally "/tmp/test")
        (setq str (buffer-string)))
      (string-width str)))

Or maybe I don't understand your real-life use case?  Because if you
treat the file as a raw bytestream, why do you need to compute the
width of its text?
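
To make the distinction between the raw bytestream and the decoded text
concrete, here is a minimal sketch in Python, the language of the
reproduction script.  It assumes the script wrote the file as UTF-8 (the
actual encoding depends on the locale), and it builds the same content in
memory rather than reading /tmp/test.  The names data, byte_count and
char_count are only illustrative.

```python
# Sketch only: compare the byte view and the text view of the test data.
# Assumption: the reproduction script's output was encoded as UTF-8.
data = ("1" * 50_000 + "α").encode("utf-8")

# Raw bytestream view: "α" (U+03B1) occupies two bytes in UTF-8.
byte_count = len(data)

# Decoded text view: "α" is a single character.
text = data.decode("utf-8")
char_count = len(text)

print(byte_count, char_count)
```

The two counts differ by one, because the final character is a single
character stored as two bytes.  A width computed over the bytes therefore
describes something different from a width computed over the decoded text.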




