From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
Date: Sun, 27 Jan 2019 19:12:04 +0200
Message-ID: <83h8du3tzv.fsf@gnu.org>
References: <20181222154945.GE2244@macbook.localdomain> <83bm5d9wsc.fsf@gnu.org>
 <20181222205948.GF2244@macbook.localdomain> <838t0gapcj.fsf@gnu.org>
 <20181223135109.GA6568@macbook.localdomain> <83va3k8c79.fsf@gnu.org>
 <20181224020847.GC6568@macbook.localdomain> <83lg4e9a7q.fsf@gnu.org>
 <20181224173723.GH6568@macbook.localdomain> <83imzi94tz.fsf@gnu.org>
 <20190105211514.GB28761@macbook.localdomain> <83wonhzsb8.fsf@gnu.org>
In-reply-to: <83wonhzsb8.fsf@gnu.org> (message from Eli Zaretskii on Sun, 06 Jan 2019 18:03:55 +0200)
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Cc: behdad@behdad.org, 33729@debbugs.gnu.org, far.nasiri.m@gmail.com
To: dr.khaled.hosny@gmail.com
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Xref: news.gmane.org gmane.emacs.bugs:154818

Could you please respond to the below as well?

> Date: Sun, 06 Jan 2019 18:03:55 +0200
> From: Eli Zaretskii
> Cc: behdad@behdad.org, far.nasiri.m@gmail.com, 33729@debbugs.gnu.org
>
> > Date: Sat, 5 Jan 2019 23:15:14 +0200
> > From: Khaled Hosny
> > Cc: rgm@gnu.org, far.nasiri.m@gmail.com, behdad@behdad.org,
> >     33729@debbugs.gnu.org, kaushal.modi@gmail.com
> >
> > > > The built-in HarfBuzz code is for getting the script for a given
> > > > character, but resolving characters with Common script is left to the
> > > > client. Suppose you have this string (upper case for RTL) ABC 123 DEF,
> > > > what HarfBuzz sees during shaping is three separate chunks of text ABC,
> > > > 123, DEF. The 123 part is all Common script characters and thus
> > > > hb_buffer_guess_segment_properties won’t be able to guess anything (and
> > > > based on the font and the script, this can cause rendering differences).
> > > > Emacs will have to resolve the script of Common characters before
> > > > applying bidi algorithm and pass that down to HarfBuzz.
> > >
> > > I'm not sure I understand: why does HarfBuzz care that 123 was in the
> > > middle of RTL text?
> >
> > It doesn’t. What it cares about here is the correct script.
> > Because 123 are in the middle of RTL text they will be shaped separately,
> > and thus hb_buffer_guess_segment_properties() will only see 123 and won’t
> > be able to guess the correct script for them (Arabic, Hebrew, etc.,
> > whatever the script for the surrounding RTL text is).
>
> That's what I was asking: why it's important for HarfBuzz to know that
> 123 should be shaped for the Arabic script?
>
> > Depending on the font, the digits might be shaped differently if the
> > script is, say, Arabic, by e.g. applying script-specific substitutions to
> > forms more suitable for a given script.
>
> I guess this is what I'm missing, then: these script-specific
> substitutions.  Can you elaborate on that, or point to some place
> where these substitutions are described in detail?
>
> > > (In general, AFAIK simple characters like 123 will not even go through
> > > HarfBuzz, as Emacs doesn't call the shaper for characters whose entry
> > > in composition-function-table is nil.  So I guess 123 here should
> > > stand for some other characters, not for literal digits?  IOW, I don't
> > > think I understand the example very well.)
> >
> > This is a bug then and needs to be fixed. All text should go through
> > HarfBuzz since even so-called “simple” characters often require shaping
> > depending on the text and the font. If this is done for optimization,
> > then it should be revised to see if shaping with HarfBuzz is actually
> > significantly slower and if it is, find more proper ways to optimize it.
>
> (Adding Handa-san to the discussion, in the hope that he could comment
> on the issue.)
>
> I think running all text through a shaper might be prohibitively
> expensive, because the shaper is called through Lisp code (see
> composite.el), and we decide which chunk of text to pass to the shaper
> using regexp search.  See the various files under lisp/language/ which
> set up portions of composition-function-table as appropriate for each
> language that needs it.
>
> So I think we should identify all the cases where "simple" characters
> surrounded by, or adjacent to, "non-simple" ones need to be passed to
> a shaper, and add the necessary regular expressions to the data
> structures in lisp/language/.  Can you describe these cases, or point
> me to a place where I can find the relevant info?
>
> Thanks.
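
For illustration only, the "resolve the script of Common characters before
shaping" step discussed above could be sketched roughly as below.  This is
not what Emacs or HarfBuzz actually do: script_of is a hypothetical,
block-based stand-in for the real Unicode Script property, and the rule
"inherit the script of the preceding character, else of the following one"
is an assumed heuristic chosen for the sketch.

```python
def script_of(ch):
    """Toy script lookup (hypothetical).  It approximates the script by
    code-point block; real code would use the Unicode Script property."""
    cp = ord(ch)
    if 0x0590 <= cp <= 0x05FF:
        return "Hebrew"
    if 0x0600 <= cp <= 0x06FF:
        return "Arabic"
    if 0x0A80 <= cp <= 0x0AFF:
        return "Gujarati"
    if cp < 0x0250 and ch.isalpha():
        return "Latin"
    return "Common"  # digits, spaces, punctuation, anything unknown


def resolve_common(text):
    """Return one script name per character, with Common characters
    taking the script of their neighbours (assumed heuristic)."""
    scripts = [script_of(ch) for ch in text]

    # Forward pass: a Common character inherits the preceding script.
    last = None
    for i, s in enumerate(scripts):
        if s == "Common":
            if last is not None:
                scripts[i] = last
        else:
            last = s

    # Backward pass: leading Common characters inherit the following script.
    following = None
    for i in range(len(scripts) - 1, -1, -1):
        if scripts[i] == "Common":
            if following is not None:
                scripts[i] = following
        else:
            following = scripts[i]

    return scripts


if __name__ == "__main__":
    # Hebrew letters stand in for "ABC 123 DEF" with upper case meaning RTL.
    sample = "\u05d0\u05d1\u05d2 123 \u05d3\u05d4\u05d5"
    print(resolve_common(sample))
```

With such a pass, the digits in the middle of the RTL text carry the
surrounding script even when they end up in their own chunk, so the script
could be set explicitly on the buffer rather than left to be guessed.
Text that consists only of Common characters stays Common in this sketch.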