From: Eli Zaretskii
Newsgroups: gmane.emacs.devel,gmane.emacs.bidi
Subject: Re: Arabic support
Date: Thu, 02 Sep 2010 10:04:45 -0400
References: <83bp8oml9c.fsf@gnu.org>
To: Kenichi Handa
Cc: emacs-bidi@gnu.org, emacs-devel@gnu.org, jasonr@gnu.org

> From: Kenichi Handa
> Cc: jasonr@gnu.org, emacs-bidi@gnu.org, emacs-devel@gnu.org
> Date: Thu, 02 Sep 2010 22:01:07 +0900
>
> A not-yet-shaped LGSTRING is created by autocmp_chars
> (composite.c) from a character sequence matching with a
> regular expression PATTERN stored in a
> composition-function-table.  This pattern is
> "[\u0600-\u06FF]+" for Arabic (lisp/language/misc-lang.el),
> and a more complicated regex for Hebrew
> (lisp/language/hebrew.el).

Thanks.  So character compositions are used not only to compose
several characters into one glyph, but also to break text into
individually shaped chunks, is that right?  If so, does that mean
auto-composition-mode cannot be turned off for scripts that need this
kind of "grouped shaping" without degrading their presentation to the
point of illegibility?

> > I'm asking because it's possible that we will need to modify
> > w32uniscribe.c to reorder R2L characters before we pass them to the
> > Uniscribe ScriptShape API, to let it see the characters in the logical
> > order it expects them.  That's if it turns out that Uniscribe cannot
> > otherwise shape them correctly.
>
> ???  Currently characters and glyphs in LGSTRING are always
> in logical order.

See my mail from yesterday, where I describe what I see in GDB: Arabic
characters in LGSTRINGs arrive at uniscribe_shape in visual order:

  http://lists.gnu.org/archive/html/emacs-devel/2010-09/msg00029.html

That is why I asked the question in the first place.  What am I
missing?
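
For illustration only, here is a minimal Python sketch of the two ideas discussed above: splitting text into runs that match the Arabic pattern quoted from composition-function-table, and turning a purely right-to-left run that arrived in visual order back into logical order. This is not how composite.c or w32uniscribe.c do it. The names shaping_chunks and visual_to_logical are hypothetical, and the plain reversal assumes a run with no embedded left-to-right text or digits, which real bidi reordering has to handle.

```python
import re

# The Arabic pattern quoted in the message. Python's re module
# understands \uXXXX escapes inside a character class.
ARABIC_RUN = re.compile(r"[\u0600-\u06FF]+")

def shaping_chunks(text):
    """Return (start, end, run) for each maximal run matching the pattern.

    Each run stands in for one chunk that would be shaped as a unit.
    """
    return [(m.start(), m.end(), m.group()) for m in ARABIC_RUN.finditer(text)]

def visual_to_logical(run):
    """Hypothetical helper: undo a plain reversal of a purely R2L run.

    Assumption: the run has no embedded L2R characters or digits, so
    visual order is simply logical order reversed.
    """
    return run[::-1]

# Latin text with two Arabic words, stored in logical order.
sample = "abc \u0633\u0644\u0627\u0645 def \u0639\u0631\u0628\u064A"
chunks = shaping_chunks(sample)
for start, end, run in chunks:
    print(start, end, [f"U+{ord(c):04X}" for c in run])
```

Only the two Arabic words come out as chunks; the Latin text between them is left alone, which matches the "individually shaped chunks" reading of the quoted description.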