From: "Štěpán Němec" <stepnem@gmail.com>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: emacs-devel@gnu.org
Subject: Re: master 188bd80: gnus-shorten-url: Improve and avoid args-out-of-range error
Date: Tue, 14 Apr 2020 11:26:22 +0200
Message-ID: <87k12ia05d.fsf@gmail.com>
In-Reply-To: <jwvv9m3nxfa.fsf-monnier+emacs@gnu.org> (Stefan Monnier's message of "Mon, 13 Apr 2020 12:51:58 -0400")

[-- Attachment #1: Type: text/plain, Size: 2122 bytes --]

On Mon, 13 Apr 2020 12:51:58 -0400
Stefan Monnier wrote:

>> +;;;###autoload
>> +(defun string-truncate-left (string length)
>> +  "Truncate STRING to LENGTH, replacing initial surplus with \"...\"."
>> +  (let ((strlen (length string)))
>> +    (if (<= strlen length)
>> +	string
>> +      (setq length (max 0 (- length 3)))
>> +      (concat "..." (substring string (max 0 (- strlen 1 length)))))))
>
> This should of course rely on string-width rather than string-length,
> but more importantly, it should obey `truncate-string-ellipsis` and
> it should be "closer" to `truncate-string-to-width` (they should likely
> be in the same file, and with similar sounding names).
> Maybe it should even be merged with `truncate-string-to-width`.

As the commit message says, that's really just a renamed helper function
originally used by ediff for file names (and now also in
`gnus-shorten-url').

Rewriting it to use `string-width' will require adjusting the callers,
too, but that's probably a good thing, as it should give more correct
results for strings containing wide characters. The results still won't
necessarily be exactly right, though, as AFAICT the "columns" that
`string-width' counts are just an approximation, depending on the
fonts used.

E.g. with the default Chinese font emacs -Q uses on my system, I get
roughly 8.5 "columns" per 5 Chinese characters, not 10 as claimed by
`string-width'.

I also don't understand why the result of `string-width' should depend
on `current-language-environment': with "Chinese-GBK",
(string-width "…") returns 2 (why?!), while with "English" or "UTF-8" it
returns 1, even though the display (font, "columns") stays the same for
all of them.
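
To make the length/width distinction concrete, here is a minimal sketch
comparing the two on one of the strings used in the tests of the
attached patch. It assumes an environment where each CJK ideograph
counts as two columns, which is what those tests rely on; the exact
width of other characters (such as "…") may differ with the language
environment, as described above.

```elisp
;; Sketch: character count vs. display columns for the same string.
(let ((s "我的老田野"))
  (list (length s)          ; number of characters
        (string-width s)))  ; number of columns according to `char-width-table'
;; => (5 10)
```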

As for possible merging with `truncate-string-to-width', I don't think
I'm up to it; I was struggling to understand its doc string, let alone
the implementation.
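
For what it's worth, the START-COLUMN argument of
`truncate-string-to-width' seems to be enough to cut a string from the
left. The following is only a sketch of the arithmetic for one ASCII
input, with the ellipsis bound locally to "..."; it is not meant as the
final implementation, and the local variable names are made up for
illustration.

```elisp
(require 'mule-util)  ; provides `truncate-string-to-width'

;; Sketch: keep the rightmost columns of STRING and prepend an ellipsis
;; so that the result is WIDTH columns wide.
(let* ((ellipsis "...")                         ; stand-in for `truncate-string-ellipsis'
       (string "ahojky jojky mojky")
       (width 10)
       (strwidth (string-width string))         ; 18
       (ellipsis-width (string-width ellipsis)) ; 3
       ;; Skip the surplus columns plus room for the ellipsis: 18 - 10 + 3 = 11.
       (start (+ (- strwidth width) ellipsis-width)))
  (concat ellipsis
          (truncate-string-to-width string strwidth start)))
;; => "...y mojky"
```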

Here's what I was able to come up with (BTW, I have little experience
with RTL scripts, but doesn't the ellipsis in that case end up on the
logically wrong side, i.e. with the beginning/end of the string reversed?):


[-- Attachment #2: 0001-string-truncate-left-Use-string-width-and-truncate-s.patch --]
[-- Type: text/x-patch, Size: 9177 bytes --]

From ad95727d0858767f14b27f412b12281a1a279870 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=C5=A0t=C4=9Bp=C3=A1n=20N=C4=9Bmec?= <stepnem@gmail.com>
Date: Tue, 14 Apr 2020 11:08:50 +0200
Subject: [PATCH] string-truncate-left: Use string-width and
 truncate-string-ellipsis

https://lists.gnu.org/archive/html/emacs-devel/2020-04/msg00734.html
<jwvv9m3nxfa.fsf-monnier+emacs@gnu.org>

* lisp/emacs-lisp/subr-x.el (string-truncate-left): Rename and move...
* lisp/international/mule-util.el (truncate-string-left): ...here.
Use 'string-width' instead of 'string-length', respect
'truncate-string-ellipsis'.  All callers changed.
* lisp/gnus/gnus-sum.el (gnus-shorten-url): Use 'string-width'.
* test/lisp/international/mule-util-tests.el (truncate-string-left):
New test.
---
 lisp/emacs-lisp/subr-x.el                  |  9 ----
 lisp/gnus/gnus-sum.el                      |  7 ++--
 lisp/international/mule-util.el            | 13 ++++++
 lisp/vc/ediff-mult.el                      | 14 +++----
 test/lisp/international/mule-util-tests.el | 49 +++++++++++++++++++++-
 5 files changed, 71 insertions(+), 21 deletions(-)

diff --git a/lisp/emacs-lisp/subr-x.el b/lisp/emacs-lisp/subr-x.el
index 9f96ac50d1..044c9aada0 100644
--- a/lisp/emacs-lisp/subr-x.el
+++ b/lisp/emacs-lisp/subr-x.el
@@ -236,15 +236,6 @@ string-trim
 TRIM-LEFT and TRIM-RIGHT default to \"[ \\t\\n\\r]+\"."
   (string-trim-left (string-trim-right string trim-right) trim-left))
 
-;;;###autoload
-(defun string-truncate-left (string length)
-  "Truncate STRING to LENGTH, replacing initial surplus with \"...\"."
-  (let ((strlen (length string)))
-    (if (<= strlen length)
-	string
-      (setq length (max 0 (- length 3)))
-      (concat "..." (substring string (max 0 (- strlen 1 length)))))))
-
 (defsubst string-blank-p (string)
   "Check whether STRING is either empty or only whitespace.
 The following characters count as whitespace here: space, tab, newline and
diff --git a/lisp/gnus/gnus-sum.el b/lisp/gnus/gnus-sum.el
index 6f367692dd..2aa4e483c0 100644
--- a/lisp/gnus/gnus-sum.el
+++ b/lisp/gnus/gnus-sum.el
@@ -9494,15 +9494,16 @@ gnus-collect-urls
     (delete-dups urls)))
 
 (defun gnus-shorten-url (url max)
-  "Return an excerpt from URL not exceeding MAX characters."
-  (if (<= (length url) max)
+  "Return an excerpt from URL not exceeding MAX \"columns\".
+For the meaning of \"column\" see `truncate-string-to-width'."
+  (if (<= (string-width url) max)
       url
     (let* ((parsed (url-generic-parse-url url))
            (host (url-host parsed))
            (rest (concat (url-filename parsed)
                          (when-let ((target (url-target parsed)))
                            (concat "#" target)))))
-      (concat host (string-truncate-left rest (- max (length host)))))))
+      (concat host (truncate-string-left rest (- max (string-width host)))))))
 
 (defun gnus-summary-browse-url (&optional external)
   "Scan the current article body for links, and offer to browse them.
diff --git a/lisp/international/mule-util.el b/lisp/international/mule-util.el
index caa5747817..693601ea45 100644
--- a/lisp/international/mule-util.el
+++ b/lisp/international/mule-util.el
@@ -129,6 +129,19 @@ truncate-string-to-width
         (concat head-padding (substring str from-idx idx)
 	        tail-padding ellipsis)))))
 
+;;;###autoload
+(defun truncate-string-left (string width)
+  "Truncate STRING to WIDTH, replacing initial surplus with an ellipsis.
+The ellipsis used is the value of `truncate-string-ellipsis'."
+  (let ((strwidth (string-width string)))
+    (if (<= strwidth width)
+	string
+      (let ((ellipsis-width (string-width truncate-string-ellipsis)))
+        (if (>= ellipsis-width width)
+            (truncate-string-to-width string strwidth (- strwidth width))
+          (concat truncate-string-ellipsis
+                  (truncate-string-to-width
+                   string strwidth (+ (- strwidth width) ellipsis-width))))))))
 \f
 ;;; Nested alist handler.
 ;; Nested alist is alist whose elements are also nested alist.
diff --git a/lisp/vc/ediff-mult.el b/lisp/vc/ediff-mult.el
index 2b1b07927f..6a6a2da7b9 100644
--- a/lisp/vc/ediff-mult.el
+++ b/lisp/vc/ediff-mult.el
@@ -1171,7 +1171,7 @@ ediff-meta-insert-file-info1
 	  ;; abbreviate the file name, if file exists
 	  (if (and (not (stringp fname)) (< file-size -1))
 	      "-------"		; file doesn't exist
-	    (string-truncate-left
+	    (truncate-string-left
 	     (ediff-abbreviate-file-name fname)
 	     max-filename-width)))))))
 
@@ -1265,12 +1265,12 @@ ediff-draw-dir-diffs
 	(if (= (mod membership-code ediff-membership-code1) 0) ; dir1
 	    (let ((beg (point)))
 	      (insert (format "%-27s"
-			      (string-truncate-left
+			      (truncate-string-left
 			       (ediff-abbreviate-file-name
 				(if (file-directory-p (concat dir1 file))
 				    (file-name-as-directory file)
 				  file))
-			       24)))
+			       27)))
 	      ;; format of meta info in the dir-diff-buffer:
 	      ;;    (filename-tail filename-full otherdir1 otherdir2 otherdir3)
 	      (ediff-set-meta-overlay
@@ -1280,12 +1280,12 @@ ediff-draw-dir-diffs
 	(if (= (mod membership-code ediff-membership-code2) 0) ; dir2
 	    (let ((beg (point)))
 	      (insert (format "%-26s"
-			      (string-truncate-left
+			      (truncate-string-left
 			       (ediff-abbreviate-file-name
 				(if (file-directory-p (concat dir2 file))
 				    (file-name-as-directory file)
 				  file))
-			       24)))
+			       26)))
 	      (ediff-set-meta-overlay
 	       beg (point)
 	       (list meta-buf file (concat dir2 file) dir1 dir2 dir3)))
@@ -1294,12 +1294,12 @@ ediff-draw-dir-diffs
 	    (if (= (mod membership-code ediff-membership-code3) 0) ; dir3
 		(let ((beg (point)))
 		  (insert (format " %-25s"
-				  (string-truncate-left
+				  (truncate-string-left
 				   (ediff-abbreviate-file-name
 				    (if (file-directory-p (concat dir3 file))
 					(file-name-as-directory file)
 				      file))
-				   24)))
+				   25)))
 		  (ediff-set-meta-overlay
 		   beg (point)
 		   (list meta-buf file (concat dir3 file) dir1 dir2 dir3)))
diff --git a/test/lisp/international/mule-util-tests.el b/test/lisp/international/mule-util-tests.el
index c571782d63..403b355bb6 100644
--- a/test/lisp/international/mule-util-tests.el
+++ b/test/lisp/international/mule-util-tests.el
@@ -1,4 +1,4 @@
-;;; mule-util --- tests for international/mule-util.el
+;;; mule-util-tests --- tests for international/mule-util.el
 
 ;; Copyright (C) 2002-2020 Free Software Foundation, Inc.
 
@@ -81,4 +81,49 @@ mule-util-test-truncate-create
 (dotimes (i (length mule-util-test-truncate-data))
   (mule-util-test-truncate-create i))
 
-;;; mule-util.el ends here
+(ert-deftest truncate-string-left ()
+  (let ((truncate-string-ellipsis "..."))
+    (should (equal (truncate-string-left "ahojky jojky mojky" 10)
+                   "...y mojky"))
+    (should (equal (truncate-string-left "jojky mojky" 10)
+                   "...y mojky"))
+    (should (equal (truncate-string-left "jojky" 10)
+                   "jojky"))
+    (should (equal (truncate-string-left "jojky" 3)
+                   "jky"))
+    (should (equal (truncate-string-left "我的老田野" 10)
+                   "我的老田野"))
+    (should (equal (truncate-string-left "碩鼠碩鼠,甭食我叔" 10)
+                   "...食我叔"))
+    (should (equal (truncate-string-left "碩鼠碩鼠,甭食我叔" 3)
+                   "叔"))
+    (should (equal (truncate-string-left "碩鼠碩鼠,jojky" 10)
+                   "...,jojky")))
+  (let ((truncate-string-ellipsis "......"))
+    (should (equal (truncate-string-left "ahojky jojky mojky" 10)
+                   "......ojky"))
+    (should (equal (truncate-string-left "jojky" 3)
+                   "jky"))
+    (should (equal (truncate-string-left "我的老田野" 10)
+                   "我的老田野"))
+    (should (equal (truncate-string-left "碩鼠碩鼠,甭食我叔" 10)
+                   "......我叔"))
+    (should (equal (truncate-string-left "碩鼠碩鼠,甭食我叔" 3)
+                   "叔"))
+    (should (equal (truncate-string-left "碩鼠碩鼠,jojky" 10)
+                   "......ojky")))
+  (let ((truncate-string-ellipsis "…"))
+    (should (equal (truncate-string-left "ahojky jojky mojky" 10)
+                   "…jky mojky"))
+    (should (equal (truncate-string-left "jojky" 3)
+                   "…ky"))
+    (should (equal (truncate-string-left "我的老田野" 10)
+                   "我的老田野"))
+    (should (equal (truncate-string-left "碩鼠碩鼠,甭食我叔" 10)
+                   "…甭食我叔"))
+    (should (equal (truncate-string-left "碩鼠碩鼠,甭食我叔" 3)
+                   "…叔"))
+    (should (equal (truncate-string-left "碩鼠碩鼠,jojky" 10)
+                   "…鼠,jojky"))))
+
+;;; mule-util-tests.el ends here
-- 
2.26.0

