From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#70000: 29.2; Grapheme handling incorrect
Date: Wed, 27 Mar 2024 19:17:39 +0200
Message-ID: <865xx7iogc.fsf@gnu.org>
References: <878r26duar.fsf@vps.thesusis.net> <86cyrije9v.fsf@gnu.org> <875xx7epd9.fsf@vps.thesusis.net>
Cc: 70000@debbugs.gnu.org
To: Phillip Susi
X-GNU-PR-Message: followup 70000
X-GNU-PR-Package: emacs
X-GNU-PR-Keywords: notabug
In-Reply-To: <875xx7epd9.fsf@vps.thesusis.net> (message from Phillip Susi on Wed, 27 Mar 2024 10:11:30 -0400)
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Xref: news.gmane.io gmane.emacs.bugs:282148

> From: Phillip Susi
> Cc: 70000@debbugs.gnu.org
> Date: Wed, 27 Mar 2024 10:11:30 -0400
>
> Eli Zaretskii writes:
>
> > Querying the cursor position won't help in this case because it is
> > Emacs that moves the cursor when you type C-f, not the terminal.
>
> I'm not talking about C-f, but simply displaying the characters on the
> screen. Emacs assumes the width is 4 when it prints this character, and
> so it thinks that the cursor moved over 4 places. When the terminal
> actually only moves the cursor over 2 spaces, emacs gets out of sync
> with the terminal, and massive breakage occurs.

I understand what you are saying, but this is not how Emacs display
code works. It needs to know the width of every character displayed
on the screen, and it needs to be able to determine that even without
actually displaying the character. When Emacs is about to redraw some
portion of the screen, it moves the cursor to that place. To be able
to move the cursor there, it needs to be able to compute the
coordinates on the screen of every character that is currently shown,
so it can construct the command for the terminal driver to move cursor
to that place. If Emacs were to rely on displaying characters for
that, it would have needed to constantly redraw large portions of the
screen, and that would both be much slower and cause unpleasant
flickering of the display, due to redrawing of screen portions that
don't actually change. So this technique is out of the question for
Emacs.

> By reading back the cursor position from the terminal after displaying a
> grapheme cluster, it would learn how the terminal displayed it and
> update its idea of where the cursor is correctly.

I understand. But Emacs needs this information also long after the
characters were already drawn.
For example, imagine that Emacs displays these characters on the
screen, and then leaves most of the screen intact and periodically
redraws some small portion of the screen, like updating current time
in the lower-right corner of the screen when Emacs is otherwise idle.
To do that, Emacs needs to move the cursor from its current position
somewhere on the screen to the lower-right corner, redraw the time
there, then move the cursor back to where it was. These cursor moves
are based on the ability to calculate the geometry of each character
on display without actually writing the characters to the screen.

In addition, if Emacs had to query the cursor position after each
written character, its redisplay would be much slower than it is now.

> I originally ran into this problem not with a ZWJ, but with an emoji
> followed by alternate selector 16 that someone used in a subject line of
> an email, and when browsing my inbox with notmuch, the terminal went
> FUBAR.

Yes, that's a known issue with some of the terminal emulators that
compose Emoji and other similar character sequences into grapheme
clusters, while ignoring the width that is expected from the result.
I'm not aware of any good solution, unfortunately. Sometimes,
disabling auto-composition-mode helps, but even that cannot solve all
the problems, especially when each of the characters composed by the
terminal into a single grapheme cluster has non-zero width according
to the Unicode tables. (If only the first character in the composed
sequence has non-zero width and the rest are zero-width, disabling
auto-composition-mode might produce a correct display.)

The bottom line is what I said at the beginning: we need some protocol
by which a terminal emulator could be queried about whether it
supports character composition, and if so, what is the screen width of
a given sequence of codepoints that will be composed, without actually
displaying them.
Better yet, some standard table of such widths could be accepted by
complying terminal emulators, and then Emacs could use such a table to
know the width in advance (similarly to how it knows that from the
Unicode data files). Until such protocols or tables exist, Emacs will
be unable to produce correct display on these terminal emulators.
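
To illustrate the kind of width mismatch discussed in this thread,
here is a minimal Python sketch. It is not Emacs code and does not
reproduce how Emacs computes widths; it only sums per-codepoint column
widths from the Unicode data bundled with Python, which is roughly
what a program must do when it cannot ask the terminal. The helper
names `codepoint_width` and `uncomposed_width` are hypothetical, and
the zero-width rule (categories Mn, Me, Cf) is a simplifying
assumption.

```python
import unicodedata


def codepoint_width(ch: str) -> int:
    """Columns for one code point, from static Unicode data only.

    Assumption for this sketch: combining marks (Mn, Me) and format
    characters (Cf, which includes ZWJ) occupy no column of their own.
    """
    if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0
    # East Asian Wide and Fullwidth characters occupy two columns.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    return 1


def uncomposed_width(text: str) -> int:
    """Width of TEXT if every code point is laid out independently."""
    return sum(codepoint_width(ch) for ch in text)


# Two wide emoji joined by ZWJ (U+200D): 2 + 0 + 2 = 4 columns when no
# composition is assumed.
zwj_sequence = "\U0001F468\u200D\U0001F469"
print(uncomposed_width(zwj_sequence))
```

A terminal emulator that composes such a sequence into a single
grapheme cluster may advance its cursor by only 2 columns, so a
program relying on the sum above ends up 2 columns out of sync. A
static table can only close that gap if the terminal follows the same
table, which is the point made above.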