From: Ruijie Yu via "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Newsgroups: gmane.emacs.bugs
Subject: bug#63029: [BUG?] format inconsistency in deciding string widths on different locales
Date: Sun, 23 Apr 2023 18:23:02 +0800
To: 63029@debbugs.gnu.org
User-Agent: mu4e 1.9.22; emacs 30.0.50

Hello,

I don't quite know yet whether this is a bug in Emacs. Here are the
observed results, and note the Unicode character:

--8<---------------cut here---------------start------------->8---
$ for locale in {en_US,fr_FR,de_DE,zh_CN,ja_JA}.UTF-8; do
    printf "$locale\t"
    LANG="$locale" src/emacs -Q -batch \
        -eval '(message "%S" (format "%-5.5s" "1234…"))'
done
--8<---------------cut here---------------end--------------->8---

This results in the following output:

--8<---------------cut here---------------start------------->8---
en_US.UTF-8	"1234…"
fr_FR.UTF-8	"1234…"
de_DE.UTF-8	"1234…"
zh_CN.UTF-8	"1234 "
ja_JA.UTF-8	"1234 "
--8<---------------cut here---------------end--------------->8---

Notice that in zh_CN and ja_JA, we have a space instead of the expected
ellipsis character.

If this is expected behavior, how do we know how "wide" the `format'
function thinks any given character is? In other words, why _does_ it
think "…" should be two characters wide? And how do we, the elisp
users, get this information?

I tried to dive into the C code for `styled_format', but got lost.

Thanks.
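
A possible starting point for the width question, which this report does
not confirm: in the Unicode data, U+2026 has the East Asian Width class
"Ambiguous", and characters in that class are commonly shown two columns
wide in CJK contexts. Whether `format' consults anything like this is an
assumption. The sketch below uses Python's standard `unicodedata' module
only to look up that class; `display_width' and `pad_truncate' are
hypothetical helpers written for illustration, not Emacs APIs, and they
ignore zero-width and combining characters.

```python
import unicodedata


def display_width(text, ambiguous_is_wide=False):
    """Rough column estimate for TEXT; a heuristic, not Emacs's algorithm."""
    width = 0
    for ch in text:
        eaw = unicodedata.east_asian_width(ch)
        if eaw in ("W", "F"):
            # Wide and Fullwidth characters take two columns.
            width += 2
        elif eaw == "A" and ambiguous_is_wide:
            # Assumption: a CJK setup treats Ambiguous characters as wide.
            width += 2
        else:
            width += 1
    return width


def pad_truncate(text, columns, ambiguous_is_wide=False):
    """Imitate "%-5.5s": keep what fits in COLUMNS, then pad on the right."""
    out, used = "", 0
    for ch in text:
        w = display_width(ch, ambiguous_is_wide)
        if used + w > columns:
            break
        out += ch
        used += w
    return out + " " * (columns - used)


sample = "1234\u2026"
print(unicodedata.east_asian_width("\u2026"))                # A
print(repr(pad_truncate(sample, 5)))                         # '1234…'
print(repr(pad_truncate(sample, 5, ambiguous_is_wide=True))) # '1234 '
```

Under that assumption the two modes reproduce the two outputs seen
above. Inside Emacs itself, `char-width' and `string-width' report the
number of columns a character or string is taken to occupy, so comparing
their results under the different locales would be one way to check
this.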
----------

Reproduced on this in-source build:

In GNU Emacs 30.0.50 (build 2, x86_64-pc-linux-gnu, GTK+ Version
3.24.37, cairo version 1.17.8) of 2023-04-23 built on fw.net.yu
Repository revision: 3badd2358d5f0af71887ee1cc9d39c2f312b6888
Repository branch: master
System Description: Arch Linux

Configured using:
 'configure --sysconfdir=/etc --prefix=/usr --localstatedir=/var
 --with-cairo --with-harfbuzz --with-libsystemd --with-modules
 --with-pgtk --with-native-compilation CFLAGS=-Og'

Configured features:
ACL CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GPM GSETTINGS HARFBUZZ JPEG
JSON LCMS2 LIBOTF LIBSYSTEMD LIBXML2 MODULES NATIVE_COMP NOTIFY INOTIFY
PDUMPER PGTK PNG RSVG SECCOMP SOUND SQLITE3 THREADS TIFF
TOOLKIT_SCROLL_BARS TREE_SITTER WEBP XIM GTK3 ZLIB

-- 
Best,

RY

[Please note that this mail might go to spam due to some
misconfiguration in my mail server -- still investigating.]