From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: Arabic support
Date: Wed, 01 Sep 2010 20:55:24 +0300
Message-ID: <8339ttjqlv.fsf@gnu.org>
References: <83bp8oml9c.fsf@gnu.org>
To: Kenichi Handa
Cc: emacs-devel@gnu.org, jasonr@gnu.org

> From: Kenichi Handa
> Cc: jasonr@gnu.org, emacs-bidi@gnu.org, emacs-devel@gnu.org
> Date: Wed, 01 Sep 2010 16:08:50 +0900
>
> No, LGSTRING may contain multiple grapheme clusters. In the
> case of arabic, we make LGSTRING for one Arabic word then
> shape it (otherwise, the shaper can't know where in a word a
> consonant appears). So, usually LGSTRING contains multiple
> grapheme clusters for Arabic.

I indeed see under a debugger that the variable rtl gets a negative
value when HELLO is displayed, which means uniscribe_shape tries to
reorder the glyphs. That is wrong, because they are already
reordered by xdisp.c.

But there's something else at work here, because even if I force rtl
to be always 1, the display is still wrong and only slightly
different.

Also, it looks like uniscribe_shape is repeatedly called from
font-shape-gstring to shape the same text, which is progressively
shortened. For example, the first call will be with a 7-character
string whose contents are

  {0x627, 0x644, 0x633, 0x651, 0x644, 0x627, 0x645}

The next call is with a 6-character string whose contents are

  {0x627, 0x644, 0x633, 0x651, 0x644, 0x627}

then a 5-character string

  {0x627, 0x644, 0x633, 0x651, 0x644}

etc. Note that the first 7-character string is the first word of the
Arabic greeting, properly bidi-reordered for display.

Is this series of calls expected?
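
For anyone who wants to check the strings quoted above without a debugger, here is a minimal sketch in Python (not Emacs code) that decodes the code points and lists the shrinking prefixes. The names `FIRST_CALL`, `describe` and `shrinking_prefixes` are made up for illustration. Only the 7-, 6- and 5-character calls were actually observed; continuing down to a single character is an assumption about what "etc." covers.

```python
import unicodedata

# Code points of the 7-character string seen in the first call,
# copied from the message above.
FIRST_CALL = [0x627, 0x644, 0x633, 0x651, 0x644, 0x627, 0x645]


def describe(code_points):
    """Return (U+XXXX, Unicode name, general category) for each code point."""
    return [
        (f"U+{cp:04X}", unicodedata.name(chr(cp)), unicodedata.category(chr(cp)))
        for cp in code_points
    ]


def shrinking_prefixes(code_points):
    """Return the prefixes of code_points, longest first.

    Assumption: the shortening continues one character at a time down
    to length 1. The message only reports lengths 7, 6 and 5.
    """
    return [code_points[:n] for n in range(len(code_points), 0, -1)]


if __name__ == "__main__":
    # Show what each element of the first string is.
    for label, name, category in describe(FIRST_CALL):
        print(label, category, name)

    # Show the sequence of strings the shaper would be handed
    # under the assumption stated above.
    for prefix in shrinking_prefixes(FIRST_CALL):
        print(len(prefix), "{" + ", ".join(hex(cp) for cp in prefix) + "}")
```

The general category column makes it easy to spot which elements are combining marks (category Mn) and so belong to the same grapheme cluster as the preceding letter.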