From: Imran Khan
Newsgroups: gmane.emacs.bugs
Subject: bug#48734: 28.0.50; Performance regression in `string-width`?
Date: Sun, 30 May 2021 17:23:15 +0600
Message-ID: <87v970jwik.fsf@teknik.io>
References: <87a6odmfp6.fsf@teknik.io> <83o8cs4t9m.fsf@gnu.org> <87y2bwk1nj.fsf@teknik.io> <83eedo4k3j.fsf@gnu.org>
In-Reply-To: <83eedo4k3j.fsf@gnu.org>
To: Eli Zaretskii
Cc: 48734@debbugs.gnu.org

Eli Zaretskii writes:

>> From: Imran Khan
>> Date: Sun, 30 May 2021 15:32:16 +0600
>>
>> > Since you use insert-file-contents-literally, why don't you also make
>> > the temporary buffer unibyte? That is:
>> >
>> >   (benchmark-run 1
>> >     (let ((str))
>> >       (with-temp-buffer
>> >         (set-buffer-multibyte nil) ; <<<<<<<<<<<<<<<<<<<<<<<<<<<<<
>> >         (insert-file-contents-literally "/tmp/test")
>> >         (setq str (buffer-string)))
>> >       (string-width str)))
>> >
>> > Or maybe I don't understand your real-life use case? Because if you
>> > treat the file as a raw bytestream, why do you need to compute the
>> > width of its text?
>>
>> I would agree, my example was pointlessly contrived. For what it's
>> worth, `insert-file-contents` exhibits same poor performance, and that's
>> used by code in the wild (e.g. deft-mode, though I am sceptical if they
>> should be needing to call `string-width` on entire buffer text either).
>>
>> Personally I am now going to use your `(set-buffer-multibyte nil)`
>> suggestion to patch their code for myself (so thanks for this).
>> Since I have no idea about the internal complexity of `string-width` or what
>> should be justified performance expectation, I would let you decide if
>> this is a bug or not.
>
> I'm not yet sure whether this is a real problem, because I don't
> really understand the relation between your example code and what you
> really need to do in deft-mode. Specifically, generating random
> characters isn't something that usually happens in real life.
>
> So could you perhaps explain what you are using string-width for in
> deft-mode, and what kind of text are you measuring there in your
> real-life situations?
>
> Thanks.
>
> P.S. Please use Reply All to keep the bug address on the CC list.

Basically, deft-mode takes a folder full of normal org-mode files and
constructs a pretty "dashboard" view of the folder where you can browse,
search, and filter the files in said folder (among many other features).
The dashboard UI uses `string-width` to calculate how much space to
allocate, relative to the window width, for displaying metadata such as
the file title, file content summary, mtime, etc., with one file per
line. This is dynamic: the component sizes adjust when the window width
changes. Perhaps the screenshot on their project page would be more
descriptive: https://github.com/jrblevin/deft

Org-mode files typically have Unicode characters in them. So when
deft-mode uses `string-width` to construct the view of the file content
part, it hangs. I think the performance problem here is exacerbated
because deft-mode strips all vertical whitespace to squash the content
into a single-line summary view before calling `string-width`.

P.S. I am not involved with deft-mode; I am merely a user with
moderately sized UTF-8 org-mode files. But if you think their UI
implementation ought to be reworked, I can forward this to them.
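
To illustrate the kind of rework mentioned above, here is a minimal sketch of a summary builder that measures only a bounded prefix of the file text, not the whole buffer string. It is not deft-mode's actual code: the name `my/summary-for-width` and the "four characters per column" prefix bound are assumptions made up for this example, and the sketch says nothing about why `string-width` itself became slower.

```elisp
(require 'subr-x)                       ; for `string-trim' on older Emacs

;; Hypothetical helper (not part of deft-mode): build a one-line summary
;; of TEXT that fits in WIDTH columns while only ever measuring a short
;; prefix of TEXT.
(defun my/summary-for-width (text width)
  "Return a one-line summary of TEXT that is at most WIDTH columns wide."
  (let* (;; Assumption: 4 * WIDTH characters is more than enough to fill
         ;; WIDTH columns even after runs of whitespace are squashed.
         (limit (min (length text) (* 4 width)))
         (prefix (substring text 0 limit))
         ;; Squash newlines and other whitespace runs into single spaces.
         (one-line (replace-regexp-in-string "[ \t\n\r]+" " " prefix)))
    ;; Cut the short, squashed string down to the available columns.
    (truncate-string-to-width (string-trim one-line) width)))

;; Usage:
;; (my/summary-for-width "Title\n\n\nSome   body text" 10)
;; => "Title Some"
```

With this shape, the cost of the width calculation depends on the window width and not on the file size, at the price of a possibly short summary when the prefix is mostly whitespace.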