From: Kenichi Handa
Newsgroups: gmane.emacs.bugs
Subject: bug#11860: 24.1; Arabic - Harakat (diacritics, short vowels) don't appear
Date: Sat, 18 Aug 2012 11:45:27 +0900
Message-ID: <87393kgbp4.fsf@gnu.org>
References: <349071341393469@web30d.yandex.ru>
Cc: 11860@debbugs.gnu.org, smias@yandex.ru
To: Kenichi Handa
In-Reply-To: (message from Kenichi Handa on Mon, 13 Aug 2012 09:02:06 +0900)

In article , Kenichi Handa writes:

> I'm very sorry for the late response.  I was just back from
> Europe.  I'll start investigating this problem soon.

I first confirmed that the described problems with Arabic and Hebrew
occur with Emacs running on Windows.  Typing C-u C-x = on the first
Arabic character (U+0639) showed that the "Courier New" font is used
for it, and showed this composition information.

    Composed with the following character(s) "ْ" using this font:
      uniscribe:-outline-Courier New-normal-normal-normal-mono-13-*-*-*-c-*-iso10646-1
    by these glyphs:
      [0 1 1593 969 8 1 8 12 4 nil]
      [0 1 1593 760 0 3 6 12 4 [1 -2 0]]

Next, I used the same "Courier New" font with Emacs running on
GNU/Linux, and specified it for Arabic as follows:

    (set-fontset-font t 'arabic '("courier new" . "unicode-bmp"))

With this setting, Emacs correctly displayed Arabic, and typing
C-u C-x = on U+0639 showed this composition information.

    Composed with the following character(s) "ْ" using this font:
      xft:-monotype-Courier New-normal-normal-normal-*-13-*-*-*-m-0-iso10646-1
    by these glyphs:
      [0 1 1593 969 8 2 8 4 4 nil]
      [0 1 1618 760 0 -6 -3 8 -11 [-9 2 0]]

Each vector is a GLYPH, described in the docstring of
composition-get-gstring as follows:

----------------------------------------------------------------------
GLYPH is a vector whose elements have this form:

    [ FROM-IDX TO-IDX C CODE WIDTH LBEARING RBEARING ASCENT DESCENT
      [ [X-OFF Y-OFF WADJUST] | nil] ]
where
    FROM-IDX and TO-IDX are used internally and should not be touched.
    C is the character of the glyph.
    CODE is the glyph-code of C in FONT-OBJECT.
    WIDTH thru DESCENT are the metrics (in pixels) of the glyph.
    X-OFF and Y-OFF are offsets to the base position for the glyph.
    WADJUST is the adjustment to the normal width of the glyph.
----------------------------------------------------------------------

So, apparently Emacs on Windows and Emacs on GNU/Linux use different
glyph metrics for the same font.
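
To make the difference easier to see, the two glyph lists above can be
compared field by field.  The following is only a rough sketch in
Python, run outside Emacs.  The field order follows the docstring
quoted above; the names Glyph and diff_glyphs are made up for this
illustration and are not part of Emacs.

```python
from typing import NamedTuple, Optional, Tuple


class Glyph(NamedTuple):
    """One GLYPH vector, in the field order given by the docstring."""
    from_idx: int
    to_idx: int
    c: int
    code: int
    width: int
    lbearing: int
    rbearing: int
    ascent: int
    descent: int
    adjustment: Optional[Tuple[int, int, int]]  # (X-OFF, Y-OFF, WADJUST) or None


def diff_glyphs(a: Glyph, b: Glyph) -> dict:
    """Return {field: (value_in_a, value_in_b)} for every field that differs."""
    return {f: (x, y) for f, x, y in zip(Glyph._fields, a, b) if x != y}


# Values copied from the two C-u C-x = outputs shown above.
w32_base = Glyph(0, 1, 1593, 969, 8, 1, 8, 12, 4, None)
w32_mark = Glyph(0, 1, 1593, 760, 0, 3, 6, 12, 4, (1, -2, 0))
xft_base = Glyph(0, 1, 1593, 969, 8, 2, 8, 4, 4, None)
xft_mark = Glyph(0, 1, 1618, 760, 0, -6, -3, 8, -11, (-9, 2, 0))

for name, a, b in (("base", w32_base, xft_base), ("mark", w32_mark, xft_mark)):
    for field, (w32, xft) in diff_glyphs(a, b).items():
        print(f"{name:4} {field:10} uniscribe={w32!s:12} xft={xft}")
```

In this sample, CODE and WIDTH agree for both glyphs.  The bearings,
ASCENT and DESCENT, and the offsets of the diacritical mark differ,
and the Windows output also reports the base character (1593) as C
for the mark glyph, where GNU/Linux reports 1618.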

As the shaper on GNU/Linux (the m17n-lib library) works correctly for
the same font, and other applications on Windows have no problem, I
suspect that the problem is in Emacs' interface with uniscribe
(w32font.c or w32uniscribe.c).

If this problem happens only for bidi scripts, one possibility is
that Emacs's rendering engine (xdisp.c) expects the glyphs in a
glyph-string to be rendered in order from left to right, whereas the
glyph-string returned on Windows should be rendered in reverse order.
For instance, in the above case, we may have to render the glyphs in
this order (diacritical mark first):

    [0 1 1593 760 0 3 6 12 4 [1 -2 0]]
    [0 1 1593 969 8 1 8 12 4 nil]

I think the further debugging must be done by those who know
uniscribe, w32font.c, and w32uniscribe.c.

---
Kenichi Handa
handa@gnu.org