From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#41005: problem with rendering Persian text in Emacs 27
Date: Sat, 06 Jun 2020 12:04:04 +0300
Message-ID: <831rmsa7ln.fsf@gnu.org>
In-Reply-To: <87d06czj00.fsf@gmail.com> (message from Pip Cet on Sat, 06 Jun 2020 08:38:39 +0000)
To: Pip Cet
Cc: valizadeh.ho@gmail.com, 41005@debbugs.gnu.org, nicholasdrozd@gmail.com

> From: Pip Cet
> Cc: valizadeh.ho@gmail.com, 41005@debbugs.gnu.org, nicholasdrozd@gmail.com
> Date: Sat, 06 Jun 2020 08:38:39 +0000
>
> >> Given these two bugs, I wonder whether it wouldn't be more reasonable
> >> always to let HarfBuzz guess the direction, at least for Emacs-27:
> >> scripts which change direction, if they are supported by HarfBuzz, won't
> >> work anyway.
> >
> > Please explain "scripts that change direction" and "won't work
> > anyway", I don't think I understand that part.
>
> I think your example (RLO..PDF in RTL text) is better: that won't work
> anyway, right now, because if, for example, you type
>
>   f i
>
> and have set the char table to treat "fi" as a ligature, the result will
> (at least sometimes) be an "fi" ligature, but it should look like the
> word "if".

That's not how shaping engines work, or at least not how HarfBuzz
works, AFAIU.  It gets the characters in the logical order, so it
always wants to see "fi", even if the directionality of the characters
was overridden, and it also wants to know the local text
directionality.  What is produced from that depends on the font: if it
has different ligatures for "fi" in different directions, then
HarfBuzz should give us back the ligature appropriate for the
direction it was passed.

(Personally, I think that people who put a directional override on
some text don't intend to see ligatures, because the override is
mostly for treating characters as independent of the surrounding
context.  But this is eventually up to the font to specify.  AFAIU,
Arabic shaping works differently in different directional contexts,
for example.)

> > The reason we don't let HarfBuzz guess in all cases is because the
> > resolved bidi level, when we have it, is a more accurate indication of
> > the required direction.
>
> Yes, but we'll still cache the wrong direction.

Why "wrong"?
We will cache the same direction as we passed to HarfBuzz, and thus
the produced glyphs will be consistent with the cached direction.  And
if we ever need to display the same sequence of characters with a
different direction, the cached sequence will fail to match, and we
will call HarfBuzz again to produce glyphs for this other direction.
That sounds like TRT to me.

> If we let HarfBuzz guess in all cases, output will be consistent and
> usually correct

We want the direction to be _always_ correct, not just "usually".  The
shapers we used before HarfBuzz didn't allow us to pass the direction;
they always guessed it.  HarfBuzz lets us specify the direction, which
is progress, since Emacs now has better control over the glyphs that
are produced, and the HarfBuzz developers tell us the difference
sometimes matters.

> > For example, if you have RTL characters
> > inside the LRO..PDF embedding, it would be wrong to let the shaper
> > guess, because it could (and usually will) guess wrongly that the
> > direction is R2L.  It is true that these are rare and unusual use
> > cases, but they do exist, and Emacs does want to support them,
> > including with scripts that must use the shaping engine.
>
> As I described, I don't think RLO..PDF works with shaping right now,
> because other code might have already cached the non-overridden glyph
> string.

I was saying that under the assumption that the direction will be
cached.  You are right that currently this doesn't work correctly, but
that's exactly why we agreed to cache the direction with the other
composition information.  Once the caching of direction is
implemented, my point is that passing the direction to HarfBuzz and
caching it will produce better results for text in a directional
override than if we let HarfBuzz guess the direction.
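
To illustrate the caching scheme described above, here is a minimal
sketch in Python.  It is not Emacs's composition code, which is in C.
The names `shape_stub`, `CompositionCache`, `glyphs_for`, and the
`LTR`/`RTL` tags are all hypothetical, and the stub shaper only
imitates a shaping engine that is told the direction.  The sketch
shows only the lookup rule: the direction is part of the cache key, so
the same characters requested in another direction miss the cache and
are shaped again.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical direction tags; Emacs and HarfBuzz use their own constants.
LTR = "ltr"
RTL = "rtl"

Glyphs = List[str]
Shaper = Callable[[str, str], Glyphs]


def shape_stub(text: str, direction: str) -> Glyphs:
    """Stand-in for a shaping engine: text arrives in logical order,
    and each fake glyph records the direction the caller passed."""
    return [f"{ch}@{direction}" for ch in text]


class CompositionCache:
    """Cache of shaped glyph strings keyed by (characters, direction)."""

    def __init__(self, shaper: Shaper) -> None:
        self._shaper = shaper
        self._cache: Dict[Tuple[str, str], Glyphs] = {}
        self.shaper_calls = 0  # counts cache misses, for illustration

    def glyphs_for(self, text: str, direction: str) -> Glyphs:
        key = (text, direction)
        if key not in self._cache:
            # Same characters with a different direction do not match,
            # so the shaper is called again for that direction.
            self.shaper_calls += 1
            self._cache[key] = self._shaper(text, direction)
        return self._cache[key]


cache = CompositionCache(shape_stub)
first = cache.glyphs_for("fi", LTR)   # miss: shaped with LTR
again = cache.glyphs_for("fi", LTR)   # hit: same characters, same direction
other = cache.glyphs_for("fi", RTL)   # miss: same characters, other direction
```

With a key of this shape, the cached glyphs always agree with the
direction that was passed to the shaper, and text inside a directional
override gets its own entry instead of reusing the non-overridden one.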