From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#11860: 24.1; Arabic - Harakat (diacritics, short vowels) don't appear
Date: Sat, 18 Aug 2012 18:33:21 +0300
Message-ID: <837gswgqpq.fsf@gnu.org>
References: <87k3wwimlk.fsf@gnu.org>
To: Kenichi Handa
Cc: 11860@debbugs.gnu.org, smias@yandex.ru

> From: Kenichi Handa
> Cc: 11860@debbugs.gnu.org, smias@yandex.ru, handa@gnu.org
> Date: Sat, 18 Aug 2012 18:19:19 +0900
>
> > If this is the case, how come we display the diacriticals correctly on
> > Windows in other cases, e.g. with Hebrew?
>
> For Hebrew too, on Windows, I see the same problem as what
> Steffan reported:
>
> In article <349641344144469@web8d.yandex.ru>, Steffan writes:
> >>> I choose "hebrew-full" as input-method.
> >>>
> >>> - After typing 'f' I get KAF
> >>> - then by typing d I get GIMMEL
> >>> - and after typing 'D' I get "the three point sign" (HEBREW POINT
> >>>   QUBUTS) not below the GIMMEL but the KAF!
>
> If you don't face with that problem, perhaps we are using
> the different font.  C-u C-x = tells that "courier new" is
> used for hebrew too in my case.

"Courier New" is the font that is used, and I still don't see the
problem.  The HEBREW POINT QUBUTS is displayed below GIMEL, as I'd
expect.

> I've just read the function uniscribe_shape in
> w32uniscribe.c.  It seems that these are the key API for
> uniscribe:
>
> * ScriptItemize -- no idea what is this

It breaks the string to be displayed into individually shapeable
chunks, called "items".  We then pass each chunk to Uniscribe
separately for shaping.

> * ScriptShape -- perhaps for glyph substitution (GSUB features of opentype)

http://msdn.microsoft.com/en-us/library/windows/desktop/dd368564%28v=vs.85%29.aspx
says that this function "Generates glyphs and visual attributes for a
Unicode run".

> * ScriptPlace -- perhaps for glyph positioning (GPOS features of opentype)
>
> So at first please check the documentation of ScriptShape
> and figure out how it works for bidi script; i.e. what order
> does it expect for input, and what order does it produce.

From the above page:

  If fLogicalOrder is set to TRUE in the SCRIPT_ANALYSIS structure,
  the function always generates glyphs in the same order as the
  original Unicode characters.
  If fLogicalOrder is set to FALSE, the function generates
  right-to-left items in reverse order so that ScriptTextOut does not
  have to reverse them before calling ExtTextOut.

And w32uniscribe.c sets that flag to TRUE a few lines before it calls
ScriptShape, because Emacs itself reorders characters:

      for (i = 0; i < nitems; i++)
        {
          int nglyphs, nchars_in_run;
          nchars_in_run = items[i+1].iCharPos - items[i].iCharPos;
          /* Force ScriptShape to generate glyphs in the same order as
             they are in the input LGSTRING, which is in the logical
             order.  */
          items[i].a.fLogicalOrder = 1;  <<<<<<<<<<<<<<<<<<<<<<<<

          /* Context may be NULL here, in which case the cache should be
             used without needing to select the font.  */
          result = ScriptShape (context, &(uniscribe_font->cache),
                                chars + items[i].iCharPos, nchars_in_run,
                                max_glyphs - done_glyphs, &(items[i].a),
                                glyphs, clusters, attributes, &nglyphs);

> Next please find the meaning of this code fragment:
>
>               /* Detect clusters, for linking codes back to
>                  characters.  */
>               if (attributes[j].fClusterStart)
>                 {
>                   while (from < nchars_in_run && clusters[from] < j)
>                     from++;
>                   if (from >= nchars_in_run)
>                     from = to = nchars_in_run - 1;
>                   else
>                     {
>                       int k;
>                       to = nchars_in_run - 1;
>                       for (k = from + 1; k < nchars_in_run; k++)
>                         {
>                           if (clusters[k] > j)
>                             {
>                               to = k - 1;
>                               break;
>                             }
>                         }
>                     }
>                 }
>
> The comment refer to "clusters".  I don't know what it
> exactly means in uniscribe, but I guess it relates to
> grapheme cluster, and if so, this part seems to relates to
> the ordering of glyphs in this kind of grapheme cluster:
>
>   [0 1 1593 969 8 1 8 12 4 nil]
>   [0 1 1593 760 0 3 6 12 4 [1 -2 0]]

No, they are character clusters, not grapheme clusters.  They could be
similar (or even identical) to grapheme clusters, but I'm not sure,
because I have a very vague idea about both.
You can find some details here:

  http://msdn.microsoft.com/en-us/library/windows/desktop/dd317792%28v=vs.85%29.aspx

I hope this will allow you to understand the meaning of the above
code, by looking at how the results are used in the calls to
LGLYPH_SET_* macros right below the above snippet.

Thanks.
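
For readers following the cluster-detection fragment quoted above, here
is a minimal standalone sketch of the same scan, pulled out into a
function.  The name cluster_char_range is hypothetical and does not
exist in w32uniscribe.c.  It assumes that clusters[c] holds the index
of the first glyph of the cluster containing character c, and that
these values never decrease, which is what one would expect when
fLogicalOrder is set.

```c
/* Hypothetical helper for illustration only; not part of Emacs.
   CLUSTERS[c] is assumed to be the index of the first glyph of the
   cluster that character c belongs to, in non-decreasing order.
   J is the index of a glyph that starts a cluster.  *FROM must hold
   the scan position left by the previous call (0 for the first call).
   On return, *FROM and *TO are the first and last characters that
   glyph J's cluster covers.  */
void
cluster_char_range (const unsigned short *clusters, int nchars,
                    int j, int *from, int *to)
{
  int k;

  /* Skip characters that belong to earlier clusters.  */
  while (*from < nchars && clusters[*from] < j)
    (*from)++;

  if (*from >= nchars)
    {
      /* No character maps to this glyph; fall back to the last one.  */
      *from = *to = nchars - 1;
      return;
    }

  /* The cluster ends just before the first character that maps to a
     later glyph, or at the end of the run if there is none.  */
  *to = nchars - 1;
  for (k = *from + 1; k < nchars; k++)
    if (clusters[k] > j)
      {
        *to = k - 1;
        break;
      }
}
```

As an illustration with made-up values (not captured Uniscribe
output): for the three characters KAF, GIMEL, QUBUTS, a cluster array
of { 0, 1, 1 } would make glyph 0 cover character 0 only, and glyph 1
cover characters 1 through 2, i.e. the point would belong with GIMEL
rather than KAF.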