unofficial mirror of help-gnu-emacs@gnu.org
From: Eli Zaretskii <eliz@gnu.org>
To: help-gnu-emacs@gnu.org
Subject: Re: Adding Tai Tham Script to GNU/Linux Distribution's Version of Emacs
Date: Mon, 16 Mar 2015 18:34:03 +0200
Message-ID: <83zj7czytg.fsf@gnu.org>
In-Reply-To: <20150316074734.410bd2b6@JRWUBU2>

> Date: Mon, 16 Mar 2015 07:47:34 +0000
> From: Richard Wordingham <richard.wordingham@ntlworld.com>
> 
> The primary problem is that m17n currently does not have support for the
> Tai Tham script, not even at Version 1.7.0.  I therefore need to add
> this to my copy of the m17n database, and, if I do an acceptable job,
> offer this support for inclusion in the public m17n database.  I've
> added a first attempt to my copy, but I see no evidence that Emacs is
> attempting to use it. (I've enabled full m17n debugging by setting
> environment variable MDEBUG_ALL, and get lots of output to the terminal
> I launch Emacs from, including a record of my Tai Tham shaping rules
> being read in.) Unfortunately, writing a test bed for m17n does not seem
> simple - I had hoped to use Emacs as the test bed, as the rest of the system is
> often the quickest test bed to build. (It's for Emacs that I want m17n
> support - HarfBuzz provides support for browsers and word processing.)
> I've asked for advice on purely m17n matters on the m17n help list.

Yes, questions about m17n libraries are best asked there.

> > > Might the character categories be relevant?  They don't seem to be
> > > set for Tai Tham characters, though they are set for Tai Viet
> > > characters, which are slightly younger in Unicode.
> 
> > Which categories did you have in mind?  "Category" is too general a
> > term here.
> 
> I meant category as being set by calls of modify-category-entry in
> characters.el.  (The lisp function is defined in file category.c.)  The
> clue is that somehow Emacs knows that text shaping is not needed for a
> sequence of Thai consonants, and it is also true that it is not needed
> for a sequence of Tai Tham consonants.  However, I don't know where the
> triggering logic is.  It *could* be something like 'invoke shaping if
> there is a combining character or Indian character present'.  I'm not
> even sure what an 'Indian character' (category codes 'i' and 'I') is.
> It might merely be a non-ASCII character supported by ISCII.  Another
> clue is that shaping seems to be invoked for lone Tibetan consonants.

OK, now I understand the issue well enough to actually try talking
intelligently about it ;-)  Thanks for taking time to explain it, and
sorry I didn't catch that earlier.

First, a caveat: I'm not an expert on these matters in Emacs, so
please take what follows with a grain of salt, and expect to have to
experiment to at least some extent.  Also, please excuse me if I
describe features you already know about.

That said, I don't think character categories are the root cause of
your problem, at least not directly.  Instead, please take a look at
the language-specific files in lisp/language/, e.g. tibetan.el,
thai.el, and thai-util.el: this is where Emacs defines the data and
code required for CTL (complex text layout) of these languages.

In Emacs parlance, CTL is called "character composition".  The rules
for character composition are stored in composition-function-table,
whose doc string describes its contents.  Using these rules, Emacs
calls the shaping engine with more than one character when they need
to be shaped as a single entity (a.k.a. "grapheme cluster").  The
shaping engine then returns one or more glyphs for Emacs to display.
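
For instance, the rules currently in effect for a character can be
looked up directly.  This is only a sketch: what comes back depends on
the Emacs version and on which language files have been loaded, and
the two code points are arbitrary examples.

```elisp
;; composition-function-table is a char-table indexed by character.
;; Each non-nil entry is a list of rules of the form
;; [PATTERN PREV-CHARS FUNC]; see the variable's doc string.
(aref composition-function-table #x0E01)  ; THAI CHARACTER KO KAI
(aref composition-function-table #x1A20)  ; TAI THAM LETTER HIGH KA

;; Display the doc string that describes the table's contents.
(describe-variable 'composition-function-table)
```

Evaluate each form with M-: or in *scratch*.  A nil result means that
no rule is keyed on that character.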

The default composition rules are defined in lisp/composite.el
(in the second half of the file), and files in lisp/language/ add
rules for specific languages.  (Those rules sometimes use character
categories, and sometimes even invent new categories; see thai-util.el
as an example.)  I believe you will see there the data that allows
Emacs to perform CTL for Thai and Tibetan.
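
To see which categories a character actually has, something like the
following should do.  It is a sketch that consults the category table
of the current buffer; the code points are again arbitrary examples.

```elisp
;; Categories of a character, as a string of mnemonic letters.
(category-set-mnemonics (char-category-set #x0E01))  ; a Thai consonant
(category-set-mnemonics (char-category-set #x1A20))  ; a Tai Tham consonant

;; Description attached to a category mnemonic, here ?i.
(category-docstring ?i)
```

M-x describe-categories lists every category defined in the current
category table together with the characters that belong to it.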

My guess is that you will have to write a tai-tham.el file with
similar data for Tai Tham, and then load it into Emacs.  All the rest
should "just work".
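
A minimal sketch of what such a file might contain follows.  It is
modeled on the way other files in lisp/language/ register their rules,
but the constant name tai-tham-composable-pattern is hypothetical, and
the pattern is a deliberate simplification (one base letter followed
by one or more marks) that is an assumption, not a correct description
of Tai Tham clusters.  Expect to refine it by experiment.

```elisp
;;; tai-tham.el --- sketch of composition rules for Tai Tham

;; ASSUMPTION: a cluster is one letter in U+1A20..U+1A54 followed by
;; one or more marks in U+1A55..U+1A7F.  Real Tai Tham text needs a
;; richer pattern (subjoined consonants, preposed vowels, and so on).
(defconst tai-tham-composable-pattern
  "[\u1A20-\u1A54][\u1A55-\u1A7F]+"
  "Simplified regexp matching a Tai Tham cluster (hypothetical).")

;; Key the rule on every base letter.  PREV-CHARS is 0, so matching
;; starts at the letter itself; font-shape-gstring passes the matched
;; characters to the font backend's shaping engine.
(set-char-table-range
 composition-function-table
 '(#x1A20 . #x1A54)
 (list (vector tai-tham-composable-pattern 0 'font-shape-gstring)))

(provide 'tai-tham)
```

Load the file with M-x load-file, then check that
(aref composition-function-table #x1A20) returns the new rule.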

Now, if none of the above helps, then I suggest writing to Kenichi
Handa, who wrote most of the related code in Emacs, and who I believe
also reads the m17n lists (he is one of the developers of those
libraries).

HTH



