From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.help
Subject: Re: Adding Tai Tham Script to GNU/Linux Distribution's Version of Emacs
Date: Mon, 16 Mar 2015 18:34:03 +0200
Message-ID: <83zj7czytg.fsf@gnu.org>
References: <20150316074734.410bd2b6@JRWUBU2>
To: help-gnu-emacs@gnu.org
In-reply-to: <20150316074734.410bd2b6@JRWUBU2>

> Date: Mon, 16 Mar 2015 07:47:34 +0000
> From: Richard Wordingham
>
> The primary problem is that m17n currently does not have support for the
> Tai Tham script, not even at Version 1.7.0. I therefore need to add
> this to my copy of the m17n database, and, if I do an acceptable job,
> offer this support for inclusion in the public m17n database. I've
> added a first attempt to my copy, but I see no evidence that Emacs is
> attempting to use it. (I've enabled full m17n debugging by setting
> environment variable MDEBUG_ALL, and get lots of output to the terminal
> I launch Emacs from, including a record of my Tai Tham shaping rules
> being read in.) Unfortunately, writing a test bed for m17n does not seem
> simple - I had hoped to use Emacs as the test bed, as the rest of the
> system is often the quickest test bed to build. (It's for Emacs that I
> want m17n support - HarfBuzz provides support for browsers and word
> processing.)
> I've asked for advice on purely m17n matters on the m17n help list.

Yes, questions about m17n libraries are best asked there.

> > > Might the character categories be relevant?
> > > They don't seem to be set for Tai Tham characters, though they are
> > > set for Tai Viet characters, which are slightly younger in Unicode.
>
> > Which categories did you have in mind? "Category" is too general a
> > term here.
>
> I meant category as being set by calls of modify-category-entry in
> characters.el. (The lisp function is defined in file category.c.) The
> clue is that somehow Emacs knows that text shaping is not needed for a
> sequence of Thai consonants, and it is also true that it is not needed
> for a sequence of Tai Tham consonants. However, I don't know where the
> triggering logic is. It *could* be something like 'invoke shaping if
> there is a combining character or Indian character present'. I'm not
> even sure what an 'Indian character' (category codes 'i' and 'I') is.
> It might merely be a non-ASCII character supported by ISCII. Another
> clue is that shaping seems to be invoked for lone Tibetan consonants.

OK, now I understand the issue well enough to actually try talking
intelligently about it ;-) Thanks for taking the time to explain it, and
sorry I didn't catch that earlier.

First, a caveat: I'm not enough of an expert on these matters in Emacs,
so please take what's below with a grain of salt, and expect to have to
experiment to at least some extent. Also, please excuse me if I describe
below features you already know about.

That said, I don't think character categories are the root cause of
your problem, at least not directly. Instead, please take a look at the
language-specific files in lisp/language/, e.g. tibetan.el, thai.el, and
thai-util.el: this is where Emacs defines data and code required for
CTL (complex text layout) of these languages. In Emacs parlance, CTL is
called "character composition".

The rules for character composition are stored in
composition-function-table, whose doc string describes its contents.
Using these rules, Emacs calls the shaping engine with more than one
character when they need to be shaped as a single entity (a.k.a.
"grapheme cluster"). The shaping engine then returns one or more glyphs
for Emacs to display.

The default composition rules are defined in lisp/composite.el (towards
the second half of the file), and files in lisp/language/ add rules for
specific languages. (Those rules sometimes use character categories,
and sometimes even invent new categories; see thai-util.el as an
example.)

I believe you will see there the data that allows Emacs to perform CTL
for Thai and Tibetan. My guess is that you will have to write a
tai-tham.el file with similar data for Tai Tham, and then load it into
Emacs. All the rest should "just work".

Now, if all of the above doesn't help, then I suggest writing to Kenichi
Handa, who wrote most of the related code in Emacs, and who I believe is
also reading the m17n lists (he is one of the developers of those
libraries).

HTH
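
P.S. To make the tai-tham.el suggestion a bit more concrete, here is a
minimal sketch of what a first experiment might look like. It is only
an illustration under assumptions: the file name and the single
catch-all regexp are hypothetical, the range U+1A20..U+1AAF is the Tai
Tham block in Unicode, and passing every run of Tai Tham characters
straight to font-shape-gstring may well be too coarse for real use.
The rule format [PATTERN PREV-CHARS FUNC] is the one described in the
doc string of composition-function-table.

```elisp
;;; tai-tham.el --- hypothetical sketch, not part of Emacs

;; Assumption: one rule covering the whole Tai Tham block
;; (U+1A20..U+1AAF) is enough for a first test.  Real rules would
;; probably use a stricter regexp that describes valid clusters.
(set-char-table-range
 composition-function-table
 '(#x1A20 . #x1AAF)
 ;; Each rule is a vector [PATTERN PREV-CHARS FUNC]:
 ;;   PATTERN    regexp matching the characters to compose,
 ;;   PREV-CHARS how many characters before the current one the
 ;;              match is allowed to start at,
 ;;   FUNC       function that shapes the matched text.
 (list (vector "[\u1A20-\u1AAF]+" 0 'font-shape-gstring)))

;; To check that the rule was registered, evaluate for example:
;;   (aref composition-function-table #x1A20)
```

After loading the file with M-x load-file, C-u C-x = (describe-char)
on a Tai Tham character should tell you whether the character is part
of a composition.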