From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#11860: 24.1; Arabic - Harakat (diacritics, short vowels) don't appear
Date: Sun, 19 Aug 2012 21:22:40 +0300
Message-ID: <83sjbid9n3.fsf@gnu.org>
References: <87393kgbp4.fsf@gnu.org>
In-reply-to: <87393kgbp4.fsf@gnu.org>
To: Kenichi Handa, Jason Rumney
Cc: 11860@debbugs.gnu.org, smias@yandex.ru

> From: Kenichi
Handa
> Cc: eliz@gnu.org, 11860@debbugs.gnu.org, smias@yandex.ru
> Date: Sat, 18 Aug 2012 11:45:27 +0900
>
> So, apparently Emacs on Windows and GNU/Linux uses the
> different metrics of glyphs. As the shaper on GNU/Linux
> (m17n-lib library) works correctly for the same font, and
> the other applications on Windows have no problem, I suspect
> that the problem is in Emacs' interface with uniscribe
> (w32font.c or w32uniscribe.c).

I agree.

> If this problem happens only for bidi scripts

Can you suggest how to test this hypothesis?

> one possibility is that Emacs's rendering engine (xdisp.c) expects
> glyphs in a glyph-string are rendered in that order from left to
> right, but the returned glyph-string on Windows should be rendered
> in reverse order.

You may be right, but it's hard to be sure. At least the advances[]
array returned by ScriptPlace seems to point into that direction.
Here's what I see in the debugger:

  Breakpoint 8, uniscribe_shape (lgstring=55041941) at w32uniscribe.c:373
  373                   LGLYPH_SET_CHAR (lglyph, chars[items[i].iCharPos
  (gdb) p items@nitems
  $1 = {0x35195a0}
  (gdb) p items[0]@nitems
  $2 = {{
      iCharPos = 0,
      a = {
        eScript = 26,
        fRTL = 1,
        fLayoutRTL = 1,
        fLinkBefore = 0,
        fLinkAfter = 0,
        fLogicalOrder = 1,
        fNoGlyphIndex = 0,
        s = {
          uBidiLevel = 1,
          fOverrideDirection = 0,
          fInhibitSymSwap = 0,
          fCharShape = 0,
          fDigitSubstitute = 0,
          fInhibitLigate = 0,
          fDisplayZWG = 0,
          fArabicNumContext = 0,
          fGcpClusters = 0,
          fReserved = 0,
          fEngineReserved = 0
        }
      }
    }}
  (gdb) p nitems
  $3 = 1
  (gdb) p nglyphs
  $4 = 2
  (gdb) p advances[0]@nglyphs
  $5 = {8, 0}
  (gdb) p offsets[0]@nglyphs
  $6 = {{
      du = 0,
      dv = 0
    }, {
      du = 1,
      dv = -2
    }}
  (gdb) p chars[0]@2
  $7 = L"\x639\x652"

(Note that the fRTL member of items[0].a is set to TRUE.)

My understanding of the advances[] array is that it gives, for each
glyph in the cluster, the number of pixels to advance to the right
after drawing the glyph.
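
To make that reading of advances[] concrete, here is a minimal sketch (an illustration only, not Emacs or Uniscribe code). It draws each glyph at the running pen position plus its du offset and then moves the pen right by the glyph's advance, walking the arrays either first-to-last or last-to-first. The struct and function names are hypothetical; only the numbers used in the note after the code come from the debugger session above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the (du, dv) pairs printed from offsets[].  */
struct glyph_offset
{
  int du;   /* horizontal shift relative to the pen position */
  int dv;   /* vertical shift (not used in this sketch) */
};

/* Fill xs[j] with the x position at which glyph j is drawn when the
   pen starts at 0 and moves right by advances[j] after each glyph.
   If REVERSE is nonzero, the arrays are walked from the last glyph to
   the first, i.e. the "diacritic first" order under discussion.  */
static void
pen_positions (const int *advances, const struct glyph_offset *offsets,
               size_t n, int reverse, int *xs)
{
  int pen = 0;
  size_t k;

  assert (advances != NULL && offsets != NULL && xs != NULL);
  for (k = 0; k < n; k++)
    {
      size_t j = reverse ? n - 1 - k : k;

      xs[j] = pen + offsets[j].du;
      pen += advances[j];
    }
}
```

With advances {8, 0} and offsets {{0, 0}, {1, -2}}, the first-to-last walk puts the second glyph at x = 9, past the base glyph, while the last-to-first walk puts it at x = 1, on top of the base glyph. Whether Uniscribe actually intends the second ordering for this run is exactly the open question here.
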
So the fact that it is 8 for the first (base) character and zero for
the second one tells me that this grapheme cluster is supposed to be
rendered in reverse order: first the Sukun, then Ayin at the same
location, and then advance by 8 pixels for the next character. Is
this correct?

If it is correct, then how come the glyphs shown on GNU/Linux also
have non-zero value of xadvance:

  [0 1 1593 969 8 2 8 4 4 nil]
  [0 1 1618 760 0 -6 -3 8 -11 [-9 2 0]]

The value 8 after 969 comes directly from xadvance, as this code in
ftfont.c shows:

  LGLYPH_SET_WIDTH (lglyph, g->xadv >> 6);

Is the meaning of xadvance in libotf different from its meaning in
Uniscribe? (And why is the glyph string element called WIDTH instead
of ADVANCE?) If not, what am I missing?

> For instance, in the above case, we may have to render glyphs in
> this order (diacritical mark first):
>
> [0 1 1593 760 0 3 6 12 4 [1 -2 0]]
> [0 1 1593 969 8 1 8 12 4 nil]

I tried the naive patch below, but it didn't quite work. It seems
like those changes somehow prevented character composition. Perhaps
Handa-san could give me some guidance here.

> I think the further debugging must be done by those who
> knows uniscribe, w32font.c, and w32uniscribe.c.

It's very hard, given that glyph-string documentation leaves a lot to
be desired, and the way its various components are used during drawing
is also left without clear documentation. E.g., this:

  FROM-IDX and TO-IDX are used internally and should not be touched.

is not really helpful for explaining what are FROM-IDX and TO-IDX, so
how can I figure out whether the code you asked about is doing TRT?
And without knowing what is each component of glyph-string used for
during drawing, how can I compare the values produced by Uniscribe
APIs with what glyph-string needs?

If someone could explain all those things, it would make debugging
possible. Otherwise, I'm just randomly poking around...
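
As a reading aid for the glyph vectors quoted above, the sketch below names their fields. It assumes the layout [FROM-IDX TO-IDX C CODE WIDTH LBEARING RBEARING ASCENT DESCENT ADJUSTMENT] from the composition-get-gstring docstring, with ADJUSTMENT being either nil or [X-OFF Y-OFF WADJUST]. All type and function names are hypothetical, and the fixed-point helper assumes that the `>> 6` in ftfont.c means xadv is in 1/64-pixel units.

```c
#include <assert.h>
#include <stddef.h>

/* Positions of the numeric fields in a printed glyph vector
   (assumed layout; see the paragraph above).  */
enum glyph_field
{
  GF_FROM, GF_TO, GF_CHAR, GF_CODE, GF_WIDTH,
  GF_LBEARING, GF_RBEARING, GF_ASCENT, GF_DESCENT,
  GF_COUNT
};

/* Named view of one glyph vector.  */
struct glyph_view
{
  int from, to;          /* FROM-IDX, TO-IDX */
  int character, code;   /* C, CODE */
  int width;             /* WIDTH, filled from the advance */
  int lbearing, rbearing, ascent, descent;
  int has_adjustment;    /* zero when ADJUSTMENT is nil */
  int xoff, yoff, wadjust;
};

/* Build a view from the nine numeric fields and an optional
   three-element adjustment (NULL stands for nil).  */
static struct glyph_view
view_glyph (const int fields[GF_COUNT], const int *adjustment)
{
  struct glyph_view v;

  assert (fields != NULL);
  v.from = fields[GF_FROM];
  v.to = fields[GF_TO];
  v.character = fields[GF_CHAR];
  v.code = fields[GF_CODE];
  v.width = fields[GF_WIDTH];
  v.lbearing = fields[GF_LBEARING];
  v.rbearing = fields[GF_RBEARING];
  v.ascent = fields[GF_ASCENT];
  v.descent = fields[GF_DESCENT];
  v.has_adjustment = adjustment != NULL;
  v.xoff = adjustment ? adjustment[0] : 0;
  v.yoff = adjustment ? adjustment[1] : 0;
  v.wadjust = adjustment ? adjustment[2] : 0;
  return v;
}

/* Convert a non-negative value in 1/64-pixel units to whole pixels,
   mirroring the shift in the ftfont.c line quoted above.  */
static int
fixed_26_6_to_pixels (long value)
{
  assert (value >= 0);
  return (int) (value >> 6);
}
```

Read this way, the two GNU/Linux glyphs have WIDTH 8 and 0 respectively, and only the second one carries an adjustment, [-9 2 0].
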
Here's the patch I tried:

--- src/w32uniscribe.c~  2012-07-08 07:24:56.000000000 +0300
+++ src/w32uniscribe.c  2012-08-19 15:55:17.323623900 +0300
@@ -331,17 +331,13 @@ uniscribe_shape (Lisp_Object lgstring)
                   Lisp_Object lglyph = LGSTRING_GLYPH (lgstring, lglyph_index);
                   ABC char_metric;
                   unsigned gl;
+                  int j1;
 
                   if (NILP (lglyph))
                     {
                       lglyph = Fmake_vector (make_number (LGLYPH_SIZE), Qnil);
                       LGSTRING_SET_GLYPH (lgstring, lglyph_index, lglyph);
                     }
-                  /* Copy to a 32-bit data type to shut up the
-                     compiler warning in LGLYPH_SET_CODE about
-                     comparison being always false. */
-                  gl = glyphs[j];
-                  LGLYPH_SET_CODE (lglyph, gl);
 
                   /* Detect clusters, for linking codes back to
                      characters. */
@@ -365,6 +361,16 @@ uniscribe_shape (Lisp_Object lgstring)
                             }
                         }
                     }
+                  if (items[i].a.fRTL)
+                    j1 = to - (j - from);
+                  else
+                    j1 = j;
+
+                  /* Copy to a 32-bit data type to shut up the
+                     compiler warning in LGLYPH_SET_CODE about
+                     comparison being always false. */
+                  gl = glyphs[j1];
+                  LGLYPH_SET_CODE (lglyph, gl);
 
                   LGLYPH_SET_CHAR (lglyph, chars[items[i].iCharPos
                                                  + from]);
@@ -372,13 +378,13 @@ uniscribe_shape (Lisp_Object lgstring)
                   LGLYPH_SET_TO (lglyph, items[i].iCharPos + to);
 
                   /* Metrics. */
-                  LGLYPH_SET_WIDTH (lglyph, advances[j]);
+                  LGLYPH_SET_WIDTH (lglyph, advances[j1]);
                   LGLYPH_SET_ASCENT (lglyph, font->ascent);
                   LGLYPH_SET_DESCENT (lglyph, font->descent);
 
                   result = ScriptGetGlyphABCWidth (context,
                                                    &(uniscribe_font->cache),
-                                                   glyphs[j], &char_metric);
+                                                   glyphs[j1], &char_metric);
                   if (result == E_PENDING && !context)
                     {
                       /* Cache incomplete... */
@@ -387,7 +393,7 @@ uniscribe_shape (Lisp_Object lgstring)
                       old_font = SelectObject (context, FONT_HANDLE (font));
                       result = ScriptGetGlyphABCWidth (context,
                                                        &(uniscribe_font->cache),
-                                                       glyphs[j], &char_metric);
+                                                       glyphs[j1], &char_metric);
                     }
 
                   if (SUCCEEDED (result))
@@ -399,17 +405,17 @@ uniscribe_shape (Lisp_Object lgstring)
                   else
                     {
                       LGLYPH_SET_LBEARING (lglyph, 0);
-                      LGLYPH_SET_RBEARING (lglyph, advances[j]);
+                      LGLYPH_SET_RBEARING (lglyph, advances[j1]);
                     }
 
-                  if (offsets[j].du || offsets[j].dv)
+                  if (offsets[j1].du || offsets[j1].dv)
                     {
                       Lisp_Object vec;
                       vec = Fmake_vector (make_number (3), Qnil);
-                      ASET (vec, 0, make_number (offsets[j].du));
-                      ASET (vec, 1, make_number (offsets[j].dv));
+                      ASET (vec, 0, make_number (offsets[j1].du));
+                      ASET (vec, 1, make_number (offsets[j1].dv));
                       /* Based on what ftfont.c does... */
-                      ASET (vec, 2, make_number (advances[j]));
+                      ASET (vec, 2, make_number (advances[j1]));
                       LGLYPH_SET_ADJUSTMENT (lglyph, vec);
                     }
                   else