From: Richard Wordingham
Newsgroups: gmane.emacs.bugs
Subject: bug#20173: 24.4; Rendering misallocates combining marks on ligatures
Date: Tue, 24 Mar 2015 08:28:28 +0000
Message-ID: <20150324082828.6bad0649@JRWUBU2>
References: <20150323010626.530d3395@JRWUBU2> <83wq27raer.fsf@gnu.org> <20150323224107.4532b1cc@JRWUBU2> <837fu7qcx1.fsf@gnu.org>
In-Reply-To: <837fu7qcx1.fsf@gnu.org>
To: Eli Zaretskii
Cc: 20173@debbugs.gnu.org

On Tue, 24 Mar 2015 05:42:18 +0200
Eli Zaretskii wrote:

> If the setting of composition
> rules for Arabic is not the culprit, then what is?  AFAIK, there are
> no rules that guide Emacs's shaping except what's in
> composition-function-table.  Beyond that, the only other factor is the
> font backend and how it shapes glyphs given the chunks of text Emacs
> presents to it.

The font backend on Unixy systems consists of three components: m17n (shaping control), libotf (OTL look-up implementation) and FreeType (glyph rendering).  The glue between them is in Emacs, most relevantly in the function ftfont_drive_otf() in ftfont.c.

My analysis of the problem, which could quite easily be wrong, is as follows.  To control the positioning of marks for the mark2ligature lookup, it is necessary to record in some fashion which component of the ligature each mark applies to.  I cannot see this information being stored.  The information should be generated and used by libotf, but needs to be stored by m17n between callbacks of ftfont_drive_otf().  (The initial settings are implicit in the sequence of codepoints.)  Storing this information would, so far as I can see, require a change to ftfont_drive_otf().

I may be able to change my font to work round this bug; I can certainly change it to hide the symptom I observed.  The solution will be to categorise the ligature NAA as a base glyph rather than as a ligature glyph.

There are other places where the HarfBuzz rendering system, which aims to be compatible with Windows, uses this information.  In particular, marks applied to a ligature are only allowed to ligate if they apply to the same component of the ligature, and mark2mark positioning only applies if the two marks apply to the same component.  This logic is described as 'the most tricky part of the OpenType specification'.
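
A minimal sketch of the bookkeeping described above, in Python with hypothetical names (`Glyph`, `form_ligature`, `same_component`).  It is not the libotf, m17n or HarfBuzz API, and it assumes the simplest possible model: each mark remembers the ligature it sits on and the component it followed in the original codepoint sequence.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Glyph:
    name: str
    is_mark: bool = False
    lig_id: Optional[int] = None  # hypothetical: ligature this glyph belongs to
    component: int = 0            # hypothetical: 1-based component a mark follows; 0 = none


def form_ligature(run: List[Glyph], lig_name: str, lig_id: int) -> List[Glyph]:
    """Collapse the base glyphs of `run` into one ligature glyph.

    Marks are kept, and each one is tagged with the ligature id and with
    the index of the base glyph it followed in the input order.
    """
    marks = []
    component = 0
    for g in run:
        if g.is_mark:
            marks.append(Glyph(g.name, True, lig_id, component))
        else:
            component += 1  # a new base glyph starts the next component
    return [Glyph(lig_name, lig_id=lig_id)] + marks


def same_component(a: Glyph, b: Glyph) -> bool:
    """True if two marks sit on the same component of the same ligature."""
    return (a.is_mark and b.is_mark
            and a.lig_id == b.lig_id
            and a.component == b.component)


# Two consonants, the first with one mark, the second with two.
run = [Glyph("c1"), Glyph("above1", True),
       Glyph("c2"), Glyph("above2", True), Glyph("below2", True)]
out = form_ligature(run, "c1_c2", lig_id=1)
print([(g.name, g.component) for g in out])
print(same_component(out[1], out[2]))  # marks on different components
print(same_component(out[2], out[3]))  # marks on the same component
```

Under the 'same component' rule, only the last pair of marks would be eligible for mark2mark positioning or for ligating with each other.  The point of the sketch is that the component index has to survive from the substitution step to the positioning step, which is the state that does not appear to be carried between calls at present.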
Part of the trickiness may be that this logic seems not to have been published externally (possibly not even internally) by Microsoft.  The guiding principle seems to be that one should do the right things to the marks on a ligature of Arabic consonants.

I have become well acquainted with this logic because the 'same component logic' seems to be applied by HarfBuzz regardless of whether the marks are preceded by a base glyph or a ligature glyph.  The Windows logic seems similar, but is subtly different.  I hit problems with the Tai Tham NAA ligature, because the marks above on its two components do interact.  The marks below should probably also interact, but combinations where I would expect them to have to interact seem not to occur in natural text.

> > As to what needs fixing in the Arabic section of misc-lang.el:

> Thanks, I will look into these.

You might want to first check whether composed Arabic is usable.  Doesn't making each word a grapheme cluster make editing unpleasant?  It might be worth restricting the clustering to cursively connected sequences of letters within a word.

Richard.