bug#9336: 24.0.50; No way to input character #xbb4 using ta-itrans

* bug#9336: 24.0.50; No way to input character #xbb4 using ta-itrans
@ 2011-08-21 16:47 Jambunathan K
  2011-08-23  4:18 ` Kenichi Handa
                   ` (3 more replies)
  0 siblings, 4 replies; 21+ messages in thread
From: Jambunathan K @ 2011-08-21 16:47 UTC (permalink / raw)
  To: 9336

[-- Attachment #1: Type: text/plain, Size: 2467 bytes --]


Summary:

Using input method tamil-itrans, there is no easy way to input character
#xbb4.

I am a native speaker of this language and this particular character is
a commonly used character in the language. Absence of a convenient way
to input this character is definitely an annoyance and a bug.

I have attached a Unicode map for Tamil to this bug report, in case that
is of any help.

M-x set-language-environment RET tamil RET
M-x set-input-method RET tamil-itrans RET

M-x quail-help RET

I see no character combinations by which I can insert the character 0BB4
(Tamil Letter LLLA)

If I do

M-x ucs-insert RET TAMIL LETTER LLLA RET 

I can insert the intended character without any problems.

C-u C-x = on the above "non inputtable" character shows me this. In
particular I see no "to input" entry.

,----
|         character: ழ (2996, #o5664, #xbb4)
| preferred charset: unicode (Unicode (ISO10646))
|        code point: 0x0BB4
|            syntax: w 	which means: word
|          category: .:Base
|       buffer code: #xE0 #xAE #xB4
|         file code: #xE0 #xAE #xB4 (encoded by coding system utf-8-dos)
|           display: by this font (glyph code)
|     uniscribe:-outline-Latha-normal-normal-normal-*-20-*-*-*-p-*-iso10646-1 (#x53)
| 
| Character code properties: customize what to show
|   name: TAMIL LETTER LLLA
|   general-category: Lo (Letter, Other)
| 
| There are text properties here:
|   fontified            t
`----

On the other hand, if I do a C-u C-x = on an inputtable character, say
0B95 (TAMIL LETTER KA), I see a "to input" hint displayed. So I am
pretty convinced that there really is no straightforward way to
insert the character LLLA and its variants.

,----
|         character: க (2965, #o5625, #xb95)
| preferred charset: unicode (Unicode (ISO10646))
|        code point: 0x0B95
|            syntax: w 	which means: word
|          category: .:Base
|          to input: type "ka" with tamil-itrans
|       buffer code: #xE0 #xAE #x95
|         file code: #xE0 #xAE #x95 (encoded by coding system utf-8-dos)
|           display: by this font (glyph code)
|     uniscribe:-outline-Latha-normal-normal-normal-*-20-*-*-*-p-*-iso10646-1 (#x42)
| 
| Character code properties: customize what to show
|   name: TAMIL LETTER KA
|   general-category: Lo (Letter, Other)
| 
| There are text properties here:
|   fontified            t
| 
| [back]
`----
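
The numeric fields in the two boxes above can be reproduced from Lisp.
This is only a sketch: `my/char-codes' is a hypothetical helper, not an
Emacs function. It relies only on the standard `format' and
`encode-coding-string'.

```elisp
(defun my/char-codes (char)
  "Return CHAR's code point (decimal, octal, hex) and its UTF-8 bytes.
The result is a list (CODES BYTES), where CODES is a string and
BYTES is a list of strings such as \"#xE0\"."
  (list (format "%d, #o%o, #x%x" char char char)
        ;; Encoding yields a unibyte string; mapping over it gives the
        ;; raw byte values.
        (mapcar (lambda (byte) (format "#x%X" byte))
                (encode-coding-string (string char) 'utf-8))))

;; TAMIL LETTER LLLA, the character this report is about.
(my/char-codes #xbb4)
;; => ("2996, #o5664, #xbb4" ("#xE0" "#xAE" "#xB4"))
```

The result matches the "character" and "buffer code" lines shown by
C-u C-x = for that character.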


[-- Attachment #2: U0B80.pdf --]
[-- Type: application/pdf, Size: 146051 bytes --]

[-- Attachment #3: Type: text/plain, Size: 529 bytes --]



In GNU Emacs 24.0.50.1 (i386-mingw-nt5.1.2600)
 of 2011-08-09 on 3249CTO
Windowing system distributor `Microsoft Corp.', version 5.1.2600
configured using `configure --with-gcc (4.5) --no-opt'

Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: ENG
  value of $XMODIFIERS: nil
  locale-coding-system: cp1252
  default enable-multibyte-characters: t




* bug#9336: 24.0.50; No way to input character #xbb4 using ta-itrans
  2011-08-21 16:47 bug#9336: 24.0.50; No way to input character #xbb4 using ta-itrans Jambunathan K
@ 2011-08-23  4:18 ` Kenichi Handa
  2011-08-23  7:23   ` Jambunathan K
  2011-09-15 12:55 ` Jambunathan K
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 21+ messages in thread
From: Kenichi Handa @ 2011-08-23  4:18 UTC (permalink / raw)
  To: Jambunathan K; +Cc: 9336

In article <81aab2socv.fsf@gmail.com>, Jambunathan K <kjambunathan@gmail.com> writes:

> Using input method tamil-itrans, there is no easy way to input character
> #xbb4.

It seems that usually "J" or "z" is assigned for that
character in the Tamil itrans system, right?

If so, please try the attached patch.  Does it work?

---
Kenichi Handa
handa@m17n.org

=== modified file 'lisp/language/ind-util.el'
--- lisp/language/ind-util.el	2011-01-26 08:36:39 +0000
+++ lisp/language/ind-util.el	2011-08-23 04:10:08 +0000
@@ -305,6 +305,25 @@
     (;; misc -- 7
      ".N" (".n" "M") "H" ".a" ".h" ("AUM" "OM") "..")))
 
+(defvar indian-itrans-v5-table-for-tamil
+  '(;; for encode/decode
+    (;; vowels -- 18
+     "a" ("aa" "A") "i" ("ii" "I") "u" ("uu" "U")
+     ("RRi" "R^i") ("LLi" "L^i") (".c" "e.c") "E" "e" "ai"
+     "o.c"  "O"   "o"   "au"  ("RRI" "R^I") ("LLI" "L^I"))
+    (;; consonants -- 40
+     "k"   "kh"  "g"   "gh"  ("~N" "N^")
+     "ch" ("Ch" "chh") "j" "jh" ("~n" "JN")
+     "T"   "Th"  "D"   "Dh"  "N"
+     "t"   "th"  "d"   "dh"  "n"   "nh"
+     "p"   "ph"  "b"   "bh"  "m"
+     "y"   "r"   "rh"  "l"   ("L" "ld") ("J" "z")  ("v" "w")
+     "sh" ("Sh" "shh") "s" "h"
+     "q" "K" "G" nil ".D" ".Dh" "f" ("Y" "yh")
+     ("GY" "dny") "x")
+    (;; misc -- 7
+     ".N" (".n" "M") "H" ".a" ".h" ("AUM" "OM") "..")))
+
 (defvar indian-kyoto-harvard-table
   '(;; for encode/decode
     (;; vowel
@@ -508,7 +527,7 @@
 
 (defvar indian-tml-itrans-v5-hash
   (indian-make-hash indian-tml-base-table
-			  indian-itrans-v5-table))
+			  indian-itrans-v5-table-for-tamil))
 )
 
 (defmacro indian-translate-region (from to hashtable encode-p)
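
As far as the patch above suggests, the key table is matched position
by position against the base character table, so the new ("J" "z")
entry fills the slot for U+0BB4.  The toy helper below illustrates that
idea only.  `my/pair-tables' is hypothetical; it is not the real
`indian-make-hash' and makes no claim about how that function is
written.

```elisp
(defun my/pair-tables (chars keys)
  "Return an alist mapping each key sequence in KEYS to a character.
CHARS is a list of characters.  KEYS is a parallel list whose
elements are a string, a list of alternative strings, or nil
\(meaning no key is assigned to that character)."
  (let (result)
    (while (and chars keys)
      (let ((char (car chars))
            (key (car keys)))
        ;; Normalize a single string to a one-element list; nil stays
        ;; an empty list, so it adds no entry.
        (dolist (k (if (listp key) key (list key)))
          (push (cons k char) result)))
      (setq chars (cdr chars)
            keys (cdr keys)))
    (nreverse result)))

;; LA, LLA and LLLA with the keys found at the matching positions
;; in the patched table.
(my/pair-tables '(#x0BB2 #x0BB3 #x0BB4) '("l" ("L" "ld") ("J" "z")))
;; => (("l" . 2994) ("L" . 2995) ("ld" . 2995) ("J" . 2996) ("z" . 2996))
```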







* bug#9336: 24.0.50; No way to input character #xbb4 using ta-itrans
  2011-08-23  4:18 ` Kenichi Handa
@ 2011-08-23  7:23   ` Jambunathan K
  2011-08-23  7:55     ` Jambunathan K
  2011-08-25  4:27     ` Kenichi Handa
  0 siblings, 2 replies; 21+ messages in thread
From: Jambunathan K @ 2011-08-23  7:23 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: 9336

Kenichi Handa <handa@m17n.org> writes:

> In article <81aab2socv.fsf@gmail.com>, Jambunathan K <kjambunathan@gmail.com> writes:
>
>> Using input method tamil-itrans, there is no easy way to input character
>> #xbb4.
>
> It seems that usually "J" or "z" is assigned for that
> character in the Tamil itrans system, right?
>
> If so, please try the attached patch.  Does it work?

The attached patch works fine.

I was trying tamil-itrans for the first time and I find quail-help to be
a bit overwhelming. A lot of characters are displayed and
it is difficult to find the one that I am interested in inputting. In
some sense it is very confusing for a first-time user.

I believe it would be a lot more helpful if the input tables for vowels and
consonants were displayed up front. I am attaching a table that I created
for myself during the process. You will find that the characters are
displayed in their natural order.

Independent Vowels
==================
a 	அ
aa 	ஆ
i 	இ
I 	ஈ
u 	உ
U 	ஊ
e 	எ
E 	ஏ
ai 	ஐ
o 	ஒ
O 	ஓ
au 	ஔ

Consonants
==========

k	க்
N^	ங் 
ch	ச்
~n, JN	ஞ்
T	ட் 
N	ண் 
t	த்
n	ந் 
p	ப்  
m	ம்
y	ய்
r	ர் 
l	ல்
z, J	ழ்
L	ள் 
rh	ற்
v	வ் 
nh	ன் 

Very rarely used 
================

No input methods for these. In general do they have one?

ஷ
ஸ
ஹ


* bug#9336: 24.0.50; No way to input character #xbb4 using ta-itrans
  2011-08-23  7:23   ` Jambunathan K
@ 2011-08-23  7:55     ` Jambunathan K
  2011-08-25  4:27     ` Kenichi Handa
  1 sibling, 0 replies; 21+ messages in thread
From: Jambunathan K @ 2011-08-23  7:55 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: 9336

> Very rarely used 
> ================
>
> No input methods for these. In general do they have one?
>
> ஷ
> ஸ
> ஹ

I made a mistake here. Looks like I did C-u C-x = while the input method
was English and naturally the "to input" field was not displayed.

Here is the table for these characters.

Very rarely used 
================

Sh,shh	ஷ்
s	ஸ்
h	ஹ்






* bug#9336: 24.0.50; No way to input character #xbb4 using ta-itrans
  2011-08-23  7:23   ` Jambunathan K
  2011-08-23  7:55     ` Jambunathan K
@ 2011-08-25  4:27     ` Kenichi Handa
  1 sibling, 0 replies; 21+ messages in thread
From: Kenichi Handa @ 2011-08-25  4:27 UTC (permalink / raw)
  To: Jambunathan K; +Cc: 9336

In article <81vcto4mn5.fsf@gmail.com>, Jambunathan K <kjambunathan@gmail.com> writes:

> Kenichi Handa <handa@m17n.org> writes:
> > In article <81aab2socv.fsf@gmail.com>, Jambunathan K <kjambunathan@gmail.com> writes:
> >
>>> Using input method tamil-itrans, there is no easy way to input character
>>> #xbb4.
> >
> > It seems that usually "J" or "z" is assigned for that
> > character in the Tamil itrans system, right?
> >
> > If so, please try the attached patch.  Does it work?

> The attached patch works fine.

Thank you for testing it.  I'll commit that change soon.

> I was trying tamil-itrans for the first time and I find quail-help to be
> a bit overwhelming. A lot of characters are displayed and
> it is difficult to find the one that I am interested in inputting. In
> some sense it is very confusing for a first-time user.

> I believe it would be a lot more helpful if the input tables for vowels and
> consonants were displayed up front. I am attaching a table that I created
> for myself during the process. You will find that the characters are
> displayed in their natural order.

The current descriptions for the *-itrans input methods are
generated automatically from the corresponding mapping
table, so they may not be sufficient for users.  Thank you
for the suggestion.  I'll add it to the description of
tamil-itrans.  By the way, isn't it necessary to have a
column for dependent vowels?  Or is it obvious to users
from the independent vowels?

---
Kenichi Handa
handa@m17n.org






* bug#9336: 24.0.50; No way to input character #xbb4 using ta-itrans
  2011-08-21 16:47 bug#9336: 24.0.50; No way to input character #xbb4 using ta-itrans Jambunathan K
  2011-08-23  4:18 ` Kenichi Handa
@ 2011-09-15 12:55 ` Jambunathan K
  2011-09-16  7:26   ` Kenichi Handa
  2011-09-26 13:53 ` Jambunathan K
  2011-09-28 13:00 ` Jambunathan K
  3 siblings, 1 reply; 21+ messages in thread
From: Jambunathan K @ 2011-09-15 12:55 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: 9336

Kenichi Handa <handa@m17n.org> writes:

> In article <81vcto4mn5.fsf@gmail.com>, Jambunathan K
> <kjambunathan@gmail.com> writes:
>
>> Kenichi Handa <handa@m17n.org> writes:
>> > In article <81aab2socv.fsf@gmail.com>, Jambunathan K
>> > <kjambunathan@gmail.com> writes:
>> >
>>>> Using input method tamil-itrans, there is no easy way to input character
>>>> #xbb4.
>> >
>> > It seems that usually "J" or "z" is assigned for that
>> > character in the Tamil itrans system, right?
>> >
>> > If so, please try the attached patch.  Does it work?
>
>> The attached patch works fine.
>
> Thank you for testing it.  I'll commit that change soon.

>> I was trying tamil-itrans for the first time and I find quail-help to be
>> a bit overwhelming. A lot of characters are displayed and
>> it is difficult to find the one that I am interested in inputting. In
>> some sense it is very confusing for a first-time user.
>
>> I believe it would be a lot more helpful if the input tables for vowels and
>> consonants were displayed up front. I am attaching a table that I created
>> for myself during the process. You will find that the characters are
>> displayed in their natural order.
>
> The current descriptions for the *-itrans input methods are
> generated automatically from the corresponding mapping
> table, so they may not be sufficient for users.  Thank you
> for the suggestion.  I'll add it to the description of
> tamil-itrans.  By the way, isn't it necessary to have a
> column for dependent vowels?  Or is it obvious to users
> from the independent vowels?

(Please refer to pages 83 and 84 of the PDF file attached earlier.)

It would be helpful to include the tables in the order given below. It
would be highly desirable that they are enumerated NOT in the unicode
order but in their natural order (i.e., the order in which we are taught
to recite these alphabets in our schools)
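
To illustrate the difference between the two orders, here is a small
sketch that sorts consonants into the recitation order of footnote [2]
rather than by code point.  The names `my/tamil-consonant-order',
`my/natural-rank' and `my/natural-sort' are made up for this example
and are not part of Emacs.

```elisp
(defconst my/tamil-consonant-order
  '(#x0B95 #x0B99 #x0B9A #x0B9E #x0B9F #x0BA3 #x0BA4 #x0BA8 #x0BAA
    #x0BAE #x0BAF #x0BB0 #x0BB2 #x0BB5 #x0BB4 #x0BB3 #x0BB1 #x0BA9)
  "The 18 consonants in recitation order, copied from footnote [2].")

(defun my/natural-rank (char)
  "Return CHAR's index in `my/tamil-consonant-order'.
Characters that are not in the list rank after all listed ones."
  ;; `memq' returns the tail starting at CHAR, or nil if absent, so
  ;; the difference in lengths is the index (or the list length).
  (- (length my/tamil-consonant-order)
     (length (memq char my/tamil-consonant-order))))

(defun my/natural-sort (chars)
  "Return a copy of the list CHARS sorted into recitation order."
  (sort (copy-sequence chars)
        (lambda (a b) (< (my/natural-rank a) (my/natural-rank b)))))

;; KA, NNNA, LLA, LLLA, VA given in code point order ...
(my/natural-sort '(#x0B95 #x0BA9 #x0BB3 #x0BB4 #x0BB5))
;; ... come back as KA, VA, LLLA, LLA, NNNA:
;; => (2965 2997 2996 2995 2985)
```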

High Usage Characters
=====================

+ Independent Vowels Table + Aytham [1]

  This table will contain the 12 "Independent vowels" FOLLOWED BY the
  letter "Aytham" (0B83). The unicode order for these alphabets
  is the SAME as their natural order (leaving aside those marked reserved)

+ Consonants Table [2]

  This table will contain 18 of the characters listed under
  "Consonants". The unicode order for these is DIFFERENT from their
  natural order (leaving aside those marked reserved)

Some 2D-table representation similar to the one listed under "Unicode
Tamil Syllabary" in the below URL could be used.
"http://en.wikipedia.org/wiki/Tamil_alphabet#Tamil_in_Unicode"

If that is not aesthetically pleasing, then just listing the Row and
Column Labels should suffice. In particular, I like the fact that the
Vertical Labels (the Row Labels) in the above table use the base form of
the Consonants (i.e., a Consonant devoid of its trailing vowel `a', which
is represented by a "dot" on top of the Tamil character; I believe this is
achieved by using TAMIL SIGN VIRAMA).

The above two tables plus an illustration of how to enter Tamil text
should suffice to get the point across to the user. So some suggestion
along the lines of:

"To enter வணக்கம் type vaNakkam"

(The word used above is the "HELLO" entry for Tamil)

Medium Usage
============

+ Grantha Letters [3]

  This table will contain the remaining 5 consonants + one compound
  alphabet designated with input method 'x'.

  - I see the letter 0BB6 as a rectangle. Not sure whether this is a bug
    in Emacs or my local installation.

  - I don't know how to enter the character - described as "common
    ligature 'Sri' (ஸ்ரீ Śrī)" - As the Wikipedia entry clearly says it is
    a commonly used character (equivalent to English "Sir/Madam").

    (See "Usage of other lingual consonants" section under previously
    referred to Wikipedia entry)

+ Various/Miscellaneous [4]
  - Not sure what these are called. I have just called them
    miscellaneous.

+ Dependent Vowel Signs, Two-part dependent Vowel signs
  - I find it difficult to relate to these characters with their
    consonant component greyed out. These NEED NOT BE LISTED as part of
    the help text.

The tables listed below are mostly of interest to very niche users like
language scholars, or for traditional uses like local calendars, marriage
invitations, temple festivals etc. They are not used by a layman for
day-to-day communication. In some sense they could be given low
priority. Honestly speaking, I have never used them in my 35 years of
existence. So I don't have any fixed opinions on these. I am listing them
here merely for the sake of completeness.


Niche/Rare Usage
================

+ Digits and Numerics [5]

  - I find that I am unable to insert the following alphabets with M-x
    ucs-insert. TAB completion doesn't offer these characters (they are
    listed in the unicode table though)
    - TAMIL NUMBER TEN
    - TAMIL NUMBER ONE HUNDRED
    - TAMIL NUMBER ONE THOUSAND

    Also I find that I am unable to ucs-insert TAMIL DIGIT ZERO. I am
    not sure whether that is the same as the regular Arabic/English `0'.

+ Tamil Symbols [6] (Tamil Symbols, Currency Symbol and Tamil Symbol)
  - When I ucs-insert them, I only see rectangular boxes. Not sure
    whether it is a problem with Emacs distribution or my local
    installation. Just curious, how can I display these characters
    within Emacs.

    Additional Note: The currency symbol is something that I personally
    recognize. Maybe it could be grouped under the "Miscellaneous" table.

+ Reserved 
  - 0BE4, 0BE5 
    
  No idea what these characters are.
    
> ---
> Kenichi Handa
> handa@m17n.org

Footnotes:
[1]  Independent Vowels Table + Aytham (13 items)

0B85, அ, a
0B86, ஆ, A or aa
0B87, இ, i
0B88, ஈ, I or ii
0B89, உ, u
0B8A, ஊ, U or uu
0B8E, எ, e
0B8F, ஏ, E
0B90, ஐ, ai
0B92, ஒ, o
0B93, ஓ, O
0B94, ஔ, au
0B83, ஃ, H

[2]  Consonants Table (18 items)
0B95, க, ka
0B99, ங, N^a or ~Na
0B9A, ச, cha
0B9E, ஞ, JNa or ~na
0B9F, ட, Ta
0BA3, ண, Na
0BA4, த, ta
0BA8, ந, na
0BAA, ப, pa
0BAE, ம, ma
0BAF, ய, ya
0BB0, ர, ra
0BB2, ல, la
0BB5, வ, wa or va
0BB4, ழ,
0BB3, ள, La or lda
0BB1, ற, rha
0BA9, ன, nha

[3]  Grantha Letters (5 + 1 items)

0BB6, ஶ,  ???? 
0B9C, ஜ, ja   
0BB7, ஷ, Sha or shha
0BB8, ஸ, sa
0BB9, ஹ, ha
????, க்ஷ, x 

[4] Various (1 item)

0BD0, ௐ, 

[5] Digits and Numerics

0BE6, ௦௦௦௦, 
0BE7, ௧
0BE8, ௨
0BE9, ௩
0BEA, ௪
0BEB, ௫
0BEC, ௬
0BED, ௭
0BEE, ௮
0BEF, ௯
0BF0, 
0BF1, 
0BF2, 

[6] Tamil Symbols

0BF3, ௳
0BF4, ௴
0BF5, ௵
0BF6, ௶
0BF7, ௷
0BF8, ௸
0BF9, ௹
0BFA, ௺, 






* bug#9336: 24.0.50; No way to input character #xbb4 using ta-itrans
  2011-09-15 12:55 ` Jambunathan K
@ 2011-09-16  7:26   ` Kenichi Handa
  2011-09-20 10:51     ` Jambunathan K
  0 siblings, 1 reply; 21+ messages in thread
From: Kenichi Handa @ 2011-09-16  7:26 UTC (permalink / raw)
  To: Jambunathan K; +Cc: 9336

In article <817h5aarsv.fsf@gmail.com>, Jambunathan K <kjambunathan@gmail.com> writes:

> It would be helpful to include the tables in the order given below. It
> would be highly desirable that they are enumerated NOT in the unicode
> order but in their natural order (i.e., the order in which we are taught
> to recite these alphabets in our schools)

Thank you for the detail.  I'll work on it.

> + Grantha Letters [3]

>   This table will contain the remaining 5 consonants + one compound
>   alphabet designated with input method 'x'.

>   - I see the letter 0BB6 as a rectangle. Not sure whether this is a bug
>     in Emacs or my local installation.

I'm using "Lohit Tamil" font (lohit_ta.ttf), and with it,
that character is correctly displayed.  Which font are you
using for Tamil?

>   - I don't know how to enter the character - described as "common
>     ligature 'Sri' (ஸ்ரீ Śrī)" - As the Wikipedia entry clearly says it is
>     a commonly used character (equivalent to English "Sir/Madam").

You can input it by typing "srii" in tamil-itrans.

> + Digits and Numerics [5]

>   - I find that I am unable to insert the following alphabets with M-x
>     ucs-insert. TAB completion doesn't offer these characters (they are
>     listed in the unicode table though)
>     - TAMIL NUMBER TEN
>     - TAMIL NUMBER ONE HUNDRED
>     - TAMIL NUMBER ONE THOUSAND

Strange, both Emacs 23.3 and the latest trunk version work
well for them.  Which version are you using?

>     Also I find that I am unable to ucs-insert TAMIL DIGIT ZERO. I am
>     not sure whether that is the same as the regular Arabic/English `0'.

I have no problem with inputting it by ucs-insert.  That
character is ௦ (U+0BE6).

> + Tamil Symbols [6] (Tamil Symbols, Currency Symbol and Tamil Symbol)
>   - When I ucs-insert them, I only see rectangular boxes. Not sure
>     whether it is a problem with Emacs distribution or my local
>     installation. Just curious, how can I display these characters
>     within Emacs.

Again, Lohit Tamil font has no problem for them.

> Footnotes:
> [1]  Independent Vowels Table + Aytham (13 items)

> 0B85, அ, a
> 0B86, ஆ, A or aa
> 0B88, ஈ, I or ii
> 0B8A, ஊ, U or uu
> 0B99, ங, N^a or ~Na
> 0B9E, ஞ, JNa or ~na
> 0BB5, வ, wa or va
> 0BB3, ள, La or lda

Within the 2D table, I think we can show just one key
sequence.  Which is more common/convenient: A or aa, I or ii, ...?

> 0BB4, ழ,

It seems that the above character can't be input with
tamil-itrans.  What key is usually used for that character?

---
Kenichi Handa
handa@m17n.org






* bug#9336: 24.0.50; No way to input character #xbb4 using ta-itrans
  2011-09-16  7:26   ` Kenichi Handa
@ 2011-09-20 10:51     ` Jambunathan K
  2011-09-21  3:45       ` Vijay Lakshminarayanan
  2011-09-22  2:23       ` Kenichi Handa
  0 siblings, 2 replies; 21+ messages in thread
From: Jambunathan K @ 2011-09-20 10:51 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: 9336


Hello Kenichi,

Kenichi Handa <handa@m17n.org> writes:

> In article <817h5aarsv.fsf@gmail.com>, Jambunathan K <kjambunathan@gmail.com> writes:
>
>> It would be helpful to include the tables in the order given below. It
>> would be highly desirable that they are enumerated NOT in the unicode
>> order but in their natural order (i.e., the order in which we are taught
>> to recite these alphabets in our schools)
>
> Thank you for the detail.  I'll work on it.

>> Footnotes:
>> [1]  Independent Vowels Table + Aytham (13 items)

> Within the 2D table, I think we can show just one key
> sequence.  Which is more common/convenient: A or aa, I or ii, ...?

>> 0B85, அ, a
>> 0B86, ஆ, A or aa
>> 0B88, ஈ, I or ii
>> 0B8A, ஊ, U or uu
Please use A, I, U 

(Rationale: I am seeing that there is no `oo' but only a `O'. With the
above choice, one can always enter a capitalized vowel and be assured
that the right thing will happen)

>> 0B99, ங, N^a or ~Na
Please use N^a. 

(Rationale: Better to start with a regular character than a punctuation
character)

>> 0B9E, ஞ, JNa or ~na
Please use JNa. 

(Rationale: Same as before)

>> 0BB5, வ, wa or va
We can use va.

(Rationale: Seems natural)

>> 0BB3, ள, La or lda
We can use La.

(Rationale: The above character doesn't sound like `lda' when spoken)

>> 0BB4, ழ,
>
> It seems that the above character can't be input with
> tamil-itrans.  What key is usually used for that character?

The patch that you had sent earlier works. Could you please commit those
changes? 

Above character has two bindings - "za" or "Ja".
We can go with "za".  

(Rationale: This is how it's usually mapped by a layman even outside of
ta-itrans)

TIA,
Jambunathan K.






* bug#9336: 24.0.50; No way to input character #xbb4 using ta-itrans
  2011-09-20 10:51     ` Jambunathan K
@ 2011-09-21  3:45       ` Vijay Lakshminarayanan
  2011-09-21  4:14         ` Jambunathan K
  2011-09-22  2:23       ` Kenichi Handa
  1 sibling, 1 reply; 21+ messages in thread
From: Vijay Lakshminarayanan @ 2011-09-21  3:45 UTC (permalink / raw)
  To: Jambunathan K; +Cc: 9336

Jambunathan K <kjambunathan@gmail.com> writes:

> Hello Kenichi,
>
> Kenichi Handa <handa@m17n.org> writes:
>
>> In article <817h5aarsv.fsf@gmail.com>, Jambunathan K <kjambunathan@gmail.com> writes:
>>> 0BB4, ழ,
>>
>> It seems that the above character can't be input with
>> tamil-itrans.  What key is usually used for that character?
>
> The patch that you had sent earlier works. Could you please commit those
> changes? 
>
> Above character has two bindings - "za" or "Ja".
> We can go with "za".  
>
> (Rationale: This is how it's usually mapped by a layman even outside of
> ta-itrans)

"zha" would be more appropriate.  Just to pick two contemporary names
that have ழ in them, we have Azhagiri and Kanimozhi.

-- 
Cheers
~vijay

Gnus should be more complicated.






* bug#9336: 24.0.50; No way to input character #xbb4 using ta-itrans
  2011-09-21  3:45       ` Vijay Lakshminarayanan
@ 2011-09-21  4:14         ` Jambunathan K
  2011-09-22  2:42           ` Kenichi Handa
  2011-09-22  3:27           ` Vijay Lakshminarayanan
  0 siblings, 2 replies; 21+ messages in thread
From: Jambunathan K @ 2011-09-21  4:14 UTC (permalink / raw)
  To: Vijay Lakshminarayanan; +Cc: 9336

>> (Rationale: This is how it's usually mapped by a layman even outside of
>> ta-itrans)
>
> "zha" would be more appropriate.  Just to pick two contemporary names
> that have ழ in them, we have Azhagiri and Kanimozhi.

(For Kenichi's benefit) The two things that you have cited above are
Tamil people names.

Or 

Are you making the suggestion - "zha" - based on an actual itrans
implementation? Within Emacs, if I type - `zha' - I get `ழ்ஹ' and mapping
`ha' to `ஹ' seems very reasonable to me.

IMO, there seems to be some de-facto or normative standard on how
english sequences are mapped to tamil alphabets (or any given language?)
via itrans. In that case, there is nothing much Emacs can do but follow
the crowd.

I am a layman user, I don't have any prior experience with other
ta-itrans implementations and Kenichi is the expert here. I would be
perfectly OK with a less than perfect mapping as long as I get uniform
experience across a variety of systems (including Emacs).

Jambunathan K.


* bug#9336: 24.0.50; No way to input character #xbb4 using ta-itrans
  2011-09-20 10:51     ` Jambunathan K
  2011-09-21  3:45       ` Vijay Lakshminarayanan
@ 2011-09-22  2:23       ` Kenichi Handa
  2011-09-23 11:21         ` Jambunathan K
  1 sibling, 1 reply; 21+ messages in thread
From: Kenichi Handa @ 2011-09-22  2:23 UTC (permalink / raw)
  To: Jambunathan K; +Cc: 9336

[-- Attachment #1: Type: text/plain, Size: 818 bytes --]

In article <81y5xj4hcu.fsf@gmail.com>, Jambunathan K <kjambunathan@gmail.com> writes:
>>> 0B99, ங, N^a or ~Na
> Please use N^a. 
[...]
>>> 0B9E, ஞ, JNa or ~na
> Please use JNa. 
[...]
>>> 0BB5, வ, wa or va
> We can use va.
[...]
>>> 0BB3, ள, La or lda
> We can use La.

Ok.

>>> 0BB4, ழ,
> >
> > It seems that the above character can't be input with
> > tamil-itrans.  What key is usually used for that character?

> The patch that you had sent earlier works. Could you please commit those
> changes? 

> Above character has two bindings - "za" or "Ja".
> We can go with "za".  

Ah, I forgot about my patch.

Anyway, please try the attached patch for
leim/quail/indian.el.  How is the result of C-h C-\
tamil-itrans RET?  Is it ok now?

---
Kenichi Handa
handa@m17n.org


[-- Attachment #2: temp.diff --]
[-- Type: text/x-diff, Size: 4359 bytes --]

=== modified file 'leim/quail/indian.el'
--- leim/quail/indian.el	2011-01-26 08:36:39 +0000
+++ leim/quail/indian.el	2011-09-22 02:07:57 +0000
@@ -118,11 +118,138 @@
  indian-mlm-itrans-v5-hash "malayalam-itrans" "Malayalam" "MlmIT"
  "Malayalam transliteration by ITRANS method.")
 
+(defvar quail-tamil-itrans-syllable-table
+  (let ((vowels
+	 '(("அ" nil "a")
+	   ("ஆ" "ா" "A")
+	   ("இ" "ி" "i")
+	   ("ஈ" "ீ" "I")
+	   ("உ" "ு" "u")
+	   ("ஊ" "ூ" "U")
+	   ("எ" "ெ" "e")
+	   ("ஏ" "ே" "E")
+	   ("ஐ" "ை" "ai")
+	   ("ஒ" "ொ" "o")
+	   ("ஓ" "ோ" "O")
+	   ("ஔ" "ௌ" "au")))
+	(consonants
+	 '(("க" "k")			; U+0B95
+	   ("ங" "N^")			; U+0B99
+	   ("ச" "ch")			; U+0B9A
+	   ("ஞ" "JN")			; U+0B9E
+	   ("ட" "T")			; U+0B9F
+	   ("ண" "N")			; U+0BA3
+	   ("த" "t")			; U+0BA4
+	   ("ந" "n")			; U+0BA8
+	   ("ப" "p")			; U+0BAA
+	   ("ம" "m")			; U+0BAE
+	   ("ய" "y")			; U+0BAF
+	   ("ர" "r")			; U+0BB0
+	   ("ல" "l")			; U+0BB2
+	   ("வ" "v")			; U+0BB5
+	   ("ழ" "z")			; U+0BB4
+	   ("ள" "L")			; U+0BB3
+	   ("ற" "rh")			; U+0BB1
+	   ("ன" "nh")			; U+0BA9
+	   ("ஜ" "j")			; U+0B9C
+	   ("ஷ" "Sh")			; U+0BB7
+	   ("ஸ" "s")			; U+0BB8
+	   ("ஹ" "h")			; U+0BB9
+	   ("க்ஷ" "x")			; U+0B95 U+0BCD U+0BB7
+	   ))
+	(virama #x0BCD)
+	clm)
+    (with-temp-buffer
+      (insert "    +")
+      (insert-char ?- 74)
+      (insert "\n    |")
+      (setq clm 6)
+      (dolist (v vowels)
+	(insert (propertize "\t" 'display (list 'space :align-to clm))
+		(car v))
+	(setq clm (+ clm 6)))
+      (insert "\n    |")
+      (setq clm 6)
+      (dolist (v vowels)
+	(insert (propertize "\t" 'display (list 'space :align-to clm))
+		(nth 2 v))
+	(setq clm (+ clm 6)))
+      (insert "\n----+")
+      (insert-char ?- 74)
+      (insert "\n")
+      (dolist (c consonants)
+	(insert (car c) virama
+		(propertize "\t" 'display '(space :align-to 4))
+		"|")
+	(setq clm 6)
+	(dolist (v vowels)
+	  (insert (propertize "\t" 'display (list 'space :align-to clm))
+		  (car c) (or (nth 1 v) ""))
+	  (setq clm (+ clm 6)))
+	(insert "\n" (nth 1 c)
+		(propertize "\t" 'display '(space :align-to 4))
+		"|")
+	(setq clm 6)
+	(dolist (v vowels)
+	  (insert (propertize "\t" 'display (list 'space :align-to clm))
+		  (nth 1 c) (nth 2 v))
+	  (setq clm (+ clm 6)))
+	(insert "\n"))
+      (insert "----+")
+      (insert-char ?- 74)
+      (insert "\n\n"
+	      "  Ex: To enter வணக்கம், type vaNakkam.\n")
+      (buffer-string))))
+
+(defvar quail-tamil-itrans-misc-table
+  (let ((symbols '((?ஃ . "H") (?ஂ . "M") (?் . ".h")))
+	(digits "௦௧௨௩௪௫௬௭௮௯")
+	clm)
+    (with-temp-buffer
+      (insert "----------+------------------------------\n")
+      (insert " symbols  |            digits            \n")
+      (insert "----------+------------------------------\n")
+      (setq clm 1)
+      (dolist (elm symbols)
+	(insert (propertize "\t" 'display (list 'space :align-to clm))
+		(car elm))
+	(setq clm (+ clm 3)))
+      (insert (propertize "\t" 'display '(space :align-to 10)) "|")
+      (setq clm 12)
+      (dotimes (i 10)
+	(insert (propertize "\t" 'display (list 'space :align-to clm))
+		(aref digits i))
+	(setq clm (+ clm 3)))
+      (insert "\n")
+      (setq clm 1)
+      (dolist (elm symbols)
+	(insert (propertize "\t" 'display (list 'space :align-to clm))
+		(cdr elm))
+	(setq clm (+ clm 3)))
+      (insert (propertize "\t" 'display '(space :align-to 10)) "|")
+      (setq clm 12)
+      (dotimes (i 10)
+	(insert (propertize "\t" 'display (list 'space :align-to clm))
+		(format "%d" i))
+	(setq clm (+ clm 3)))
+      (insert "\n----------+------------------------------\n")
+      (buffer-string))))
+
 (if nil
     (quail-define-package "tamil-itrans" "Tamil" "TmlIT" t "Tamil ITRANS"))
 (quail-define-indian-trans-package
  indian-tml-itrans-v5-hash "tamil-itrans" "Tamil" "TmlIT"
- "Tamil transliteration by ITRANS method.")
+ "Tamil transliteration by ITRANS method.
+
+### Basic syllables (consonants + vowels) ###
+
+\\<quail-tamil-itrans-syllable-table>
+
+### Symbols, etc. ###
+
+\\<quail-tamil-itrans-misc-table>
+
+Full key sequences are listed below:")
 
 
 ;;;
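
The core step of the syllable table in the patch above is joining each
consonant with each dependent vowel sign, and each consonant key with
each vowel key.  The sketch below isolates that step.  `my/syllable-row'
is a hypothetical helper written for this illustration; only the shape
of the vowel entries is taken from the patch.

```elisp
(defun my/syllable-row (consonant key vowels)
  "Return (SYLLABLES KEYS) for CONSONANT, which is typed as KEY.
VOWELS is a list of (INDEPENDENT SIGN VOWEL-KEY) entries, the same
shape as the `vowels' list in the patch.  SIGN is nil for the
inherent vowel, which needs no dependent sign."
  (list
   ;; Displayed syllables: consonant followed by the vowel sign, if any.
   (mapcar (lambda (v) (concat consonant (or (nth 1 v) ""))) vowels)
   ;; Key sequences: consonant key followed by the vowel key.
   (mapcar (lambda (v) (concat key (nth 2 v))) vowels)))

(my/syllable-row "க" "k" '(("அ" nil "a") ("ஆ" "ா" "A") ("இ" "ி" "i")))
;; => (("க" "கா" "கி") ("ka" "kA" "ki"))
```

The patch itself writes these two rows straight into a temporary
buffer with `:align-to' display properties instead of returning lists.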



* bug#9336: 24.0.50; No way to input character #xbb4 using ta-itrans
  2011-09-21  4:14         ` Jambunathan K
@ 2011-09-22  2:42           ` Kenichi Handa
  2011-09-22  3:27           ` Vijay Lakshminarayanan
  1 sibling, 0 replies; 21+ messages in thread
From: Kenichi Handa @ 2011-09-22  2:42 UTC (permalink / raw)
  To: Jambunathan K; +Cc: 9336

In article <814o068rbu.fsf@gmail.com>, Jambunathan K <kjambunathan@gmail.com> writes:

>>> (Rationale: This is how it's usually mapped by a layman even outside of
>>> ta-itrans)
> >
> > "zha" would be more appropriate.  Just to pick two contemporary names
> > that have ழ in them, we have Azhagiri and Kanimozhi.

> (For Kenichi's benefit) The two things that you have cited above are
> Tamil people names.

Thank you for the info. 

> Are you making the suggestion - "zha" - based on an actual itrans
> implementation? Within Emacs, if I type - `zha' - I get `ழ்ஹ' and mapping
> `ha' to `ஹ' seems very reasonable to me.

> IMO, there seems to be some de-facto or normative standard on how
> english sequences are mapped to tamil alphabets (or any given language?)
> via itrans. In that case, there is nothing much Emacs can do but follow
> the crowd.

> I am a layman user, I don't have any prior experience with other
> ta-itrans implementations and Kenichi is the expert here.

All I know about itrans is that it's originally a method for
roman transliteration of Indic scripts, not an input method.
So, using itrans as an input method may reveal various
shortcomings/conflicts in the original itrans definition, and
thus we must extend/modify the mapping between keys and
chars.  But I don't know what kind of de facto standards there
are.

For the above case, which is more convenient?
(1) "zha" -> "ழ்ஹ" and "za" -> "ழ"
(2) "zha" -> "ழ", "za" -> "zஅ", and "zhha" -> "ழ்ஹ".
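
Option (2) could be tried out from an init file without editing
leim/quail/indian.el.  The following is only a minimal sketch, not the
change that was committed.  It assumes that `quail-defrule' can add
rules to the "tamil-itrans" package once quail/indian has been loaded,
and the vowel-sign list is an illustrative copy of the usual Tamil
vowel signs.  Multi-character translations are wrapped in a vector
because Quail treats each character of a plain string as a separate
candidate.

```elisp
;; Sketch only: add "zh..." rules for ழ on top of the stock tamil-itrans
;; package.  The rule set below is an assumption for illustration.
(eval-after-load "quail/indian"
  '(let ((vowel-signs '(("a" . "")   ("A" . "ா") ("i" . "ி") ("I" . "ீ")
                        ("u" . "ு")  ("U" . "ூ") ("e" . "ெ") ("E" . "ே")
                        ("ai" . "ை") ("o" . "ொ") ("O" . "ோ") ("au" . "ௌ"))))
     ;; Bare consonant: "zh" -> ழ + virama.
     (quail-defrule "zh" (vector "ழ்") "tamil-itrans")
     ;; Consonant + vowel sign: "zha" -> ழ, "zhA" -> ழா, and so on.
     (dolist (v vowel-signs)
       (quail-defrule (concat "zh" (car v))
                      (vector (concat "ழ" (cdr v)))
                      "tamil-itrans"))))
```

With these rules "zhha" would be expected to still yield "ழ்ஹ", since
"zhh" is not a prefix of any rule and the pending "zh" should be
committed before "ha" is read; this has not been checked against the
actual ITRANS hash tables.  The sketch leaves the existing "za" rule in
place, so it does not cover the "za" -> "zஅ" part of option (2).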

---
Kenichi Handa
handa@m17n.org





* bug#9336: 24.0.50; No way to input character #xbb4 using ta-itrans
  2011-09-21  4:14         ` Jambunathan K
  2011-09-22  2:42           ` Kenichi Handa
@ 2011-09-22  3:27           ` Vijay Lakshminarayanan
  1 sibling, 0 replies; 21+ messages in thread
From: Vijay Lakshminarayanan @ 2011-09-22  3:27 UTC (permalink / raw)
  To: Jambunathan K; +Cc: 9336

Jambunathan K <kjambunathan@gmail.com> writes:

>>> (Rationale: This is how it's usually mapped by a layman even outside of
>>> ta-itrans)
>>
>> "zha" would be more appropriate.  Just to pick two contemporary names
>> that have ழ in them, we have Azhagiri and Kanimozhi.
>
> (For Kenichi's benefit) The two things that you have cited above are
> Tamil people names.
>
> Or 
>
> Are you making the suggestion - "zha" - based on an actual itrans
> implementation? 

I have zero experience with itrans but I do have some experience with
transliteration, per se (see below).

> mapping
> `ha' to `ஹ' seems very reasonable to me.

I agree.  But this is orthogonal to mapping "zh" to "ழ்".  With this
mapping, a user who wishes to write "ழ்ஹ" would merely write "zhha".

> IMO, there seems to be some de-facto or normative standard on how
> english sequences are mapped to tamil alphabets (or any given language?)
> via itrans. In that case, there is nothing much Emacs can do but follow
> the crowd.
>
> I am a layman user, I don't have any prior experience with other
> ta-itrans implementations and Kenichi is the expert here. I would be
> perfectly OK with a less than perfect mapping as long as I get uniform
> experience across a variety of systems (including Emacs).

Here too I agree.  From your earlier post it seemed as if you were
suggesting new combinations for transliteration and that's why I
provided my own.

Using "zh" to represent "ழ்" is quite canonical and common in
contemporary culture (possibly older, but I'm not sure).  I already
cited two people with "ழ்" in their names transliterated as "zh".  The
recent Tamil movie "மொழி" was also transliterated as "mozhi".

I'm the author of an online transliterator
(http://www.yash.info/indianLanguageConverter/tamil.html) which had some
moderate success when it was originally written (back in 2005).  Its
transliteration scheme is quite different from some of the other
suggestions in this thread.  I felt that only the "z" -- "zh" was strong
enough to change.

> Jambunathan K.
>

-- 
Cheers
~vijay

Gnus should be more complicated.

* bug#9336: 24.0.50; No way to input character #xbb4 using ta-itrans
  2011-09-22  2:23       ` Kenichi Handa
@ 2011-09-23 11:21         ` Jambunathan K
  2011-09-23 11:24           ` Jambunathan K
  2011-09-26  7:08           ` Kenichi Handa
  0 siblings, 2 replies; 21+ messages in thread
From: Jambunathan K @ 2011-09-23 11:21 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: 9336

[-- Attachment #1: Type: text/plain, Size: 701 bytes --]


Kenichi

>> Above character has two bindings - "za" or "Ja".
>> We can go with "za".  
>
> Ah, I forgot about my patch.

Yes, there is a pending patch.

> Anyway, please try the attached patch for
> leim/quail/indian.el.  How is the result of C-h C-\
> tamil-itrans RET?  Is it ok now?

The new 2-D table looks pretty good and I am VERY HAPPY with the
changes.

I have made some changes - presentation-wise - on top of your changes. I
have attached 3 screenshots that demonstrate the changes that I have
made [1]. FYI, my FSF copyright assignment number for Emacs is #618390.

I also propose that some additional changes be made to quail.el [2]. I
am unsure how to introduce these changes cleanly.


[-- Attachment #2: indian.el.9336.patch --]
[-- Type: text/x-patch, Size: 6738 bytes --]

--- a/indian.el	2011-09-19 09:06:02.000000000 +0530
+++ c/indian.el	2011-09-23 15:45:35.562500000 +0530
@@ -118,12 +118,188 @@
  indian-mlm-itrans-v5-hash "malayalam-itrans" "Malayalam" "MlmIT"
  "Malayalam transliteration by ITRANS method.")
 
+(defvar quail-tamil-itrans-syllable-table
+  (let ((vowels
+	 '(("அ" nil "a")
+	   ("ஆ" "ா" "A")
+	   ("இ" "ி" "i")
+	   ("ஈ" "ீ" "I")
+	   ("உ" "ு" "u")
+	   ("ஊ" "ூ" "U")
+	   ("எ" "ெ" "e")
+	   ("ஏ" "ே" "E")
+	   ("ஐ" "ை" "ai")
+	   ("ஒ" "ொ" "o")
+	   ("ஓ" "ோ" "O")
+	   ("ஔ" "ௌ" "au")))
+	(consonants
+	 '(("க" "k")			; U+0B95
+	   ("ங" "N^")			; U+0B99
+	   ("ச" "ch")			; U+0B9A
+	   ("ஞ" "JN")			; U+0B9E
+	   ("ட" "T")			; U+0B9F
+	   ("ண" "N")			; U+0BA3
+	   ("த" "t")			; U+0BA4
+	   ("ந" "n")			; U+0BA8
+	   ("ப" "p")			; U+0BAA
+	   ("ம" "m")			; U+0BAE
+	   ("ய" "y")			; U+0BAF
+	   ("ர" "r")			; U+0BB0
+	   ("ல" "l")			; U+0BB2
+	   ("வ" "v")			; U+0BB5
+	   ("ழ" "z")			; U+0BB4
+	   ("ள" "L")			; U+0BB3
+	   ("ற" "rh")			; U+0BB1
+	   ("ன" "nh")			; U+0BA9
+	   ("ஜ" "j")			; U+0B9C
+	   ("ஶ" nil)			; U+0BB6
+	   ("ஷ" "Sh")			; U+0BB7
+	   ("ஸ" "s")			; U+0BB8
+	   ("ஹ" "h")			; U+0BB9
+	   ("க்ஷ" "x" )			; U+0B95
+	   ))
+	(virama #x0BCD)
+	clm)
+    (with-temp-buffer
+      (insert "\n")
+      (insert "    +")
+      (insert-char ?- 74)
+      (insert "\n    |")
+      (setq clm 6)
+      (dolist (v vowels)
+	(insert (propertize "\t" 'display (list 'space :align-to clm))
+		(car v))
+	(setq clm (+ clm 6)))
+      (insert "\n    |")
+      (setq clm 6)
+      (dolist (v vowels)
+	(insert (propertize "\t" 'display (list 'space :align-to clm))
+		(nth 2 v))
+	(setq clm (+ clm 6)))
+      (dolist (c consonants)
+	(insert "\n----+")
+	(insert-char ?- 74)
+	(insert "\n")
+	(insert (car c) virama
+		(propertize "\t" 'display '(space :align-to 4))
+		"|")
+	(setq clm 6)
+	(dolist (v vowels)
+	  (insert (propertize "\t" 'display (list 'space :align-to clm))
+		  (car c) (or (nth 1 v) ""))
+	  (setq clm (+ clm 6)))
+	(insert "\n" (or (nth 1 c) "")
+		(propertize "\t" 'display '(space :align-to 4))
+		"|")
+	(setq clm 6)
+
+	(dolist (v vowels)
+	  (apply 'insert (propertize "\t" 'display (list 'space :align-to clm))
+		 (if (nth 1 c) (list (nth 1 c) (nth 2 v)) (list "")))
+	  (setq clm (+ clm 6))))
+      (insert "\n")
+      (insert "----+")
+      (insert-char ?- 74)
+      (insert "\n")
+      (buffer-string))))
+
+(defvar quail-tamil-itrans-numerics-and-symbols-table
+  (let ((symbols '((?௳ . "நாள்") (?௴ . "மாதம்") (?௵ . "வருடம்")
+		 (?௶ . "பற்று") (?௷ . "வரவு") (?௸ . "மேற்படி")
+		 (?௹ . "ரூபாய்") (?௺ . "எண்")))
+	(numerics '((?௰ . "பத்து") (?௱ . "நூறு") (?௲ . "ஆயிரம்")))
+	(width 6) clm)
+    (with-temp-buffer
+      (insert "\n" (make-string 18 ?-) "+" (make-string 50 ?-) "\n")
+      (insert
+       (propertize "\t" 'display (list 'space :align-to 5)) "numerics"
+       (propertize "\t" 'display (list 'space :align-to 18)) "|"
+       (propertize "\t" 'display (list 'space :align-to 38)) "symbols")
+      (insert "\n" (make-string 18 ?-) "+" (make-string 50 ?-) "\n")
+      (setq clm 0)
+      (dolist (symbols (list numerics symbols))
+	(unless (eq symbols numerics)
+	  (insert (propertize "\t" 'display (list 'space :align-to clm)) "|")
+	  (setq clm (+ clm width)))
+	(dolist (elm symbols)
+	  (insert (propertize "\t" 'display (list 'space :align-to clm))
+		  (car elm))
+	  (setq clm (+ clm width))))
+      (insert "\n")
+      (setq clm 0)
+      (dolist (symbols (list numerics symbols))
+	(unless (eq symbols numerics)
+	  (insert (propertize "\t" 'display (list 'space :align-to clm)) "|")
+	  (setq clm (+ clm width)))
+	(dolist (elm symbols)
+	  (insert (propertize "\t" 'display (list 'space :align-to clm))
+		  (or (cdr elm) ""))
+	  (setq clm (+ clm width))))
+      (insert "\n" (make-string 18 ?-) "+" (make-string 50 ?-) "\n")
+      (insert "\n")
+      (buffer-string))))
+
+(defvar quail-tamil-itrans-various-signs-and-digits-table
+  (let ((various '((?ஃ . "H") ("ஸ்ரீ" . "srii") (?ௐ)))
+	(digits "௦௧௨௩௪௫௬௭௮௯")
+	(width 6) clm)
+    (with-temp-buffer
+      (insert "\n" (make-string 18 ?-) "+" (make-string 60 ?-) "\n")
+      (insert
+       (propertize "\t" 'display (list 'space :align-to 5)) "various"
+       (propertize "\t" 'display (list 'space :align-to 18)) "|"
+       (propertize "\t" 'display (list 'space :align-to 45)) "digits")
+
+      (insert "\n" (make-string 18 ?-) "+" (make-string 60 ?-) "\n")
+      (setq clm 0 )
+
+      (dotimes (i (length various))
+	(insert (propertize "\t" 'display (list 'space :align-to clm))
+		(car (nth i various)))
+	(setq clm (+ clm width)))
+      (insert (propertize "\t" 'display (list 'space :align-to clm)) "|")
+      (setq clm (+ clm width))
+      (dotimes (i 10)
+	(insert (propertize "\t" 'display (list 'space :align-to clm))
+		(aref digits i))
+	(setq clm (+ clm width)))
+      (insert "\n")
+      (setq clm 0)
+      (dotimes (i (length various))
+	(insert (propertize "\t" 'display (list 'space :align-to clm))
+		(or (cdr (nth i various)) ""))
+	(setq clm (+ clm width)))
+      (insert (propertize "\t" 'display (list 'space :align-to clm)) "|")
+      (setq clm (+ clm width))
+      (dotimes (i 10)
+	(insert (propertize "\t" 'display (list 'space :align-to clm))
+		(format "%d" i))
+	(setq clm (+ clm width)))
+      (insert "\n" (make-string 18 ?-) "+" (make-string 60 ?-) "\n")
+      (buffer-string))))
+
 (if nil
     (quail-define-package "tamil-itrans" "Tamil" "TmlIT" t "Tamil ITRANS"))
 (quail-define-indian-trans-package
  indian-tml-itrans-v5-hash "tamil-itrans" "Tamil" "TmlIT"
- "Tamil transliteration by ITRANS method.")
+ "Tamil transliteration by ITRANS method.
+
+You can input characters using the following mapping tables.
+    Example: To enter வணக்கம், type vaNakkam.
+
+### Basic syllables (consonants + vowels) ###
+\\<quail-tamil-itrans-syllable-table>
+
+### Miscellaneous (various signs + digits) ###
+\\<quail-tamil-itrans-various-signs-and-digits-table>
+
+### Others (numerics + symbols) ###
+
+Characters below have no ITRANS method associated with them.
+Their descriptions are included for easy reference.
+\\<quail-tamil-itrans-numerics-and-symbols-table>
 
+Full key sequences are listed below:")
 
 ;;;
 ;;; Input by Inscript

[-- Attachment #3: Type: text/plain, Size: 48 bytes --]


Footnotes: 
[1]  Summary of ta-itrans changes


[-- Attachment #4: ta-itrans-1.png --]
[-- Type: image/png, Size: 25759 bytes --]

[-- Attachment #5: ta-itrans-2.png --]
[-- Type: image/png, Size: 31716 bytes --]

[-- Attachment #6: ta-itrans-3.png --]
[-- Type: image/png, Size: 24823 bytes --]

[-- Attachment #7: Type: text/plain, Size: 2663 bytes --]


+ ta-itrans-1.png
  - I have introduced horizontal grid lines

  - I have added #0BB6 to the table. The character has no input method
    available. So the input row is left empty.

    Note: Furthermore, the associated glyphs are not available. (I have
    Tamil Lohit 2.4.5 - Thanks for suggesting this font!). So the Tamil
    characters in that row look malformed.

+ ta-itrans-2.png
  - I have split the single misc table used in your patch into two
    tables - "signs and digits table" and "numerics and symbols
    table".

    In the first table, 
    - I have removed Virama (0BCD) and Anusvara (0B82).
    - I have added TAMIL OM (0BD0) and the compound character `srii'.

    In the second table,
    - The characters have no input method. So I have included a
      description in native language to denote the purpose they serve. I
      believe there are advantages to including this table. For example,
      - It completes the catalogue and one can easily find out whether
        one's font needs to be upgraded to display these (rarely used)
        characters.
      - Should there be a future request for assigning input methods to
        them, one need not work on it afresh.

[2] Additional hacks to quail.el

+ ta-itrans-3.png
  - Removed the old table (about which I had complaints). I don't know
    how to do this cleanly. Here is a quick hack I cooked up (marked
    with @@@)

,---- In quail.el (around line 2550 or so)
| 	(let* ((decode-map (list 'decode-map))
|                (num (quail-build-decode-map (list (quail-map)) "" decode-map
|                                             ;; We used to use 512 here, but
|                                             ;; TeX has more than 1000 and
|                                             ;; it's good to see the list.
|                                             0 5120 done-list)))
|@@@@@ 	  (when (and nil (> num 0)) 
| 	    (insert "
| KEY SEQUENCE
| ------------
| ")
| 	    (if (quail-show-layout)
| 		(insert "You can also input more characters")
| 	      (insert "You can input characters"))
| 	    (insert " by the following key sequences:\n")
| 	    (quail-insert-decode-map decode-map)))
`----

  - Place the cursor at the BEGINNING of help buffer rather than the end
    of it. There are more interesting things at the beginning of the
    buffer than the end of the buffer.

,---- In quail.el (around line 2575)
|       ;; Resize the help window again, now that it has all its contents.
|       (save-selected-window
|@@@ 	(goto-char (point-min))
|  	(select-window (get-buffer-window (current-buffer) t))
| 	(run-hooks 'temp-buffer-show-hook))
`----


* bug#9336: 24.0.50; No way to input character #xbb4 using ta-itrans
  2011-09-23 11:21         ` Jambunathan K
@ 2011-09-23 11:24           ` Jambunathan K
  2011-09-26  7:08           ` Kenichi Handa
  1 sibling, 0 replies; 21+ messages in thread
From: Jambunathan K @ 2011-09-23 11:24 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: 9336


> [2. text/x-patch; indian.el.9336.patch]...

Forgot to mention this. The patch is on top of the emacs-24 trunk.





* bug#9336: 24.0.50; No way to input character #xbb4 using ta-itrans
  2011-09-23 11:21         ` Jambunathan K
  2011-09-23 11:24           ` Jambunathan K
@ 2011-09-26  7:08           ` Kenichi Handa
  1 sibling, 0 replies; 21+ messages in thread
From: Kenichi Handa @ 2011-09-26  7:08 UTC (permalink / raw)
  To: Jambunathan K; +Cc: 9336

[-- Attachment #1: Type: text/plain, Size: 2502 bytes --]

In article <817h4zcxm0.fsf@gmail.com>, Jambunathan K <kjambunathan@gmail.com> writes:

> > Ah, I forgot about my patch.

> Yes, there is a pending patch.

I've just asked the Emacs maintainers about the patch.  As
we are already in a feature freeze mode, it's up to the
maintainers whether to include the patch in the next release
or not.

> I have made some changes - presentation-wise - on top of your changes. I
> have attached 3 screenshots that demonstrate the changes that I have
> made [1]. FYI, my FSF copyright assignment number for Emacs is #618390.

With your patch, the table of the third section (Others)
doesn't fit in 80-column with my font setting.  So, I
slightly changed the :align-to property.  Please check the
attached version (full leim/quail/indian.el) with your
environment.

By the way,

> + ta-itrans-1.png
>   - I have introduced horizontal grid lines

>   - I have added #0BB6 to the table. The character has no input method
>     available. So the input row is left empty.

>     Note: Furthermore, the associated glyphs are not available. (I have
>     Tamil Lohit 2.4.5 - Thanks for suggesting this font!). So the Tamil
>     characters in that row look malformed.

That's strange.  My Tamil Lohit font is Version 2.4.4 (older
than yours), and it displays the line for #0BB6 correctly.

> [2] Additional hacks to quail.el

> + ta-itrans-3.png
>   - Removed the old table (about which I had complaints). I don't know
>     how to do this cleanly. Here is a quick hack I cooked up (marked
>     with @@@)

The old table (full key sequence) contains the keys that are
not shown in the above tables; i.e. such alternate keys as
"aa", "~Na".  Are they really not necessary?

And, even if not necessary, the current code doesn't have a
mechanism to suppress it, and adding such a mechanism should
not be done now.  I'll put that matter in my todo list.

>   - Place the cursor at the BEGINNING of help buffer rather than the end
>     of it. There are more interesting things at the beginning of the
>     buffer than the end of the buffer.

> ,---- In quail.el (around line 2575)
> |       ;; Resize the help window again, now that it has all its contents.
> |       (save-selected-window
> |@@@ 	(goto-char (point-min))
> |  	(select-window (get-buffer-window (current-buffer) t))
> | 	(run-hooks 'temp-buffer-show-hook))
> `----

I don't understand why this is necessary.  Doesn't C-h C-\
tamil-itrans RET show the top of the *Help* buffer?

---
Kenichi Handa
handa@m17n.org

[-- Attachment #2: indian.el --]
[-- Type: application/emacs-lisp, Size: 16388 bytes --]

* bug#9336: 24.0.50; No way to input character #xbb4 using ta-itrans
  2011-08-21 16:47 bug#9336: 24.0.50; No way to input character #xbb4 using ta-itrans Jambunathan K
  2011-08-23  4:18 ` Kenichi Handa
  2011-09-15 12:55 ` Jambunathan K
@ 2011-09-26 13:53 ` Jambunathan K
  2011-09-27  4:50   ` Kenichi Handa
  2011-09-28 13:00 ` Jambunathan K
  3 siblings, 1 reply; 21+ messages in thread
From: Jambunathan K @ 2011-09-26 13:53 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: 9336


> With your patch, the table of the third section (Others)
> doesn't fit in 80-column with my font setting.  So, I
> slightly changed the :align-to property.  Please check the
> attached version (full leim/quail/indian.el) with your
> environment.

Your changes look good with my setup. Feel free to close this bug once
you commit these changes.

> By the way,
>
>> + ta-itrans-1.png
>>   - I have introduced horizontal grid lines
>
>>   - I have added #0BB6 to the table. The character has no input method
>>     available. So the input row is left empty.
>
>>     Note: Furthermore, the associated glyphs are not available. (I have
>>     Tamil Lohit 2.4.5 - Thanks for suggesting this font!). So the Tamil
>>     characters in that row look malformed.
>
> That's strange.  My Tamil Lohit font is Version 2.4.4 (older
> than yours), and it displays the line for #0BB6 correctly.

Do you see heart-shaped structures in that line much similar to what you
see in ta-itrans-1.png that I attached earlier? Then it means that the
display is malformed.

If you compare that line with the adjoining ones, then you will see that
the heart-shaped structure is a place-holder for the consonant symbol
and the "redundant" consonant symbol itself has to be removed.

>> [2] Additional hacks to quail.el
>
>> + ta-itrans-3.png
>>   - Removed the old table (about which I had complaints). I don't know
>>     how to do this cleanly. Here is a quick hack I cooked up (marked
>>     with @@@)
>
> The old table (full key sequence) contains the keys that are
> not shown in the above tables; i.e. such alternate keys as
> "aa", "~Na".  Are they really not necessary?

I see. It wasn't clear to me that they were displaying alternative
keys. I think we can maintain status quo.

> And, even if not necessary, the current code doesn't have a
> mechanism to suppress it, and adding such a mechanism should
> not be done now.  I'll put that matter in my todo list.

Ok.

>>   - Place the cursor at the BEGINNING of help buffer rather than the end
>>     of it. There are more interesting things at the beginning of the
>>     buffer than the end of the buffer.
>
>> ,---- In quail.el (around line 2575)
>> |       ;; Resize the help window again, now that it has all its contents.
>> |       (save-selected-window
>> |@@@ 	(goto-char (point-min))
>> |  	(select-window (get-buffer-window (current-buffer) t))
>> | 	(run-hooks 'temp-buffer-show-hook))
>> `----
>
> I don't understand why this is necessary.  Doesn't C-h C-\
> tamil-itrans RET show the top of the *Help* buffer?

I have this in .emacs.

(custom-set-variables
 '(pop-up-windows nil))

With the above setting, cursor always ends up in the tail of the buffer.

If I leave the above variable at its factory value then the cursor ends
up in the head of the buffer. 

I consider this behaviour buggy. Should I open a separate bug for this?

Jambunathan K.

> ---
> Kenichi Handa
> handa@m17n.org





* bug#9336: 24.0.50; No way to input character #xbb4 using ta-itrans
  2011-09-26 13:53 ` Jambunathan K
@ 2011-09-27  4:50   ` Kenichi Handa
  0 siblings, 0 replies; 21+ messages in thread
From: Kenichi Handa @ 2011-09-27  4:50 UTC (permalink / raw)
  To: Jambunathan K; +Cc: 9336

[-- Attachment #1: Type: text/plain, Size: 1541 bytes --]

In article <81sjnjif5m.fsf@gmail.com>, Jambunathan K <kjambunathan@gmail.com> writes:

> > With your patch, the table of the third section (Others)
> > doesn't fit in 80-column with my font setting.  So, I
> > slightly changed the :align-to property.  Please check the
> > attached version (full leim/quail/indian.el) with your
> > environment.

> Your changes look good with my setup. Feel free to close this bug once
> you commit these changes.

I think we have not yet settled this issue:
  "z" or "zh" for ழ

> > That's strange.  My Tamil Lohit font is Version 2.4.4 (older
> > than yours), and it displays the line for #0BB6 correctly.

> > Do you see heart-shaped structures in that line much similar to what you
> see in ta-itrans-1.png that I attached earlier?

No, I attach my image file.  I think that heart-shaped glyph
is actually U+25CC (DOTTED CIRCLE) used to display an
isolated combining character.  Which version of m17n-db are
you using?

> > I don't understand why this is necessary.  Doesn't C-h C-\
> > tamil-itrans RET show the top of the *Help* buffer?

> I have this in .emacs.

> (custom-set-variables
>  '(pop-up-windows nil))

> With the above setting, cursor always ends up in the tail of the buffer.

> If I leave the above variable at its factory value then the cursor ends
> up in the head of the buffer. 

> I consider this behaviour buggy. Should I open a separate bug for this?

Ah, I see.  It is better to open a separate bug.

---
Kenichi Handa
handa@m17n.org


[-- Attachment #2: temp.png --]
[-- Type: image/png, Size: 31421 bytes --]

* bug#9336: 24.0.50; No way to input character #xbb4 using ta-itrans
  2011-08-21 16:47 bug#9336: 24.0.50; No way to input character #xbb4 using ta-itrans Jambunathan K
                   ` (2 preceding siblings ...)
  2011-09-26 13:53 ` Jambunathan K
@ 2011-09-28 13:00 ` Jambunathan K
  2011-09-29  1:46   ` Kenichi Handa
  2011-09-29  1:51   ` Kenichi Handa
  3 siblings, 2 replies; 21+ messages in thread
From: Jambunathan K @ 2011-09-28 13:00 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: 9336

> I think we have not yet settled this issue:
>   "z" or "zh" for ழ

My recommendation is that we don't make a one-off change. 

I am largely OK with `z' and `J' offered by you primarily because I see
that entry listed in the table linked below

http://www.aczoom.com/itrans/html/tamil/node5.html [1]

I see that ta-itrans mappings in Emacs are largely in agreement with the
mappings in the above table save for a few characters [2]. So my vote is
for aligning Emacs's table with AC's ITRANS tables. I concur with
Vijay's comment that the choices made in AC's table are less than
perfect [3].

My little research suggests that there are multiple contemporary
and competing implementations for ta-itrans. Under these circumstances,
it is important that the Emacs' implementation stick to just a single
system. It would be wonderful if we could get in touch with the original
author of ta-itrans and understand the source of Emacs's mappings.

When time permits, I will spend some time looking at some authoritative
websites and try to form an opinion [4].

>> > That's strange.  My Tamil Lohit font is Version 2.4.4 (older
>> > than yours), and it displays the line for #0BB6 correctly.
>
>> Do you see heart-shaped structures in that line much similar to what you
>> see in ta-itrans-1.png that I attached earlier?
>
> No, I attach my image file.  I think that heart-shaped glyph
> is actually U+25CC (DOTTED CIRCLE) used to display an
> isolated combining character.  Which version of m17n-db are
> you using?

I am on Windows XP. So I believe the equivalent of m17n-db is
uniscribe. 

For the sake of record, the following messages suggest that #0BB6 is not
properly rendered on Windows XP. It is apparently fixed on Windows
Vista.

http://unicode.org/mail-arch/unicode-ml/y2007-m07/0004.html
http://unicode.org/mail-arch/unicode-ml/y2007-m07/0005.html

Jambunathan K.

Footnotes: 

[1]  This page seems to me to be near authoritative as far as ITRANS is
concerned and is often cited in articles of any note. 

For example, the Wikipedia article 
- http://en.wikipedia.org/wiki/ITRANS

and The Indian Institute of Technology, Madras article 
- http://acharya.iitm.ac.in/multi_sys/transli/schemes.php

[2] Comparison between Emacs and AC's ITRANS tables. Minor differences
are marked with `x' and major differences are marked with `xx'. In
effect there are only 5 major differences.

character       Emacs           AC's ITRANS             Difference
அ               a               a
ஆ               A or aa         A or AA
இ               i               i
ஈ               I or ii         I or ii
உ               u               u
ஊ               U or uu         U or uu
எ               e               e
ஏ               E               E
ஐ               ai              ai
ஒ               o               o
ஓ               O               O
ஔ               au              au
ஃ               H               q                       xx
ஸ்ரீ              SRI             srii                    xx

க்               k               k or g                  x
ங்               N^ or ~N        N^ or ~N
ச்               ch              ch
ஞ்               JN or ~n        ~n                      x
ட்               T               T or Th                 x
ண்               N               N
த்               t               t or th                 x
ந்               n               n
ப்               p               p or b                  x
ம்               m               m
ய்               y               y
ர்               r               r
ல்               l               l
வ்               v or w          v or w
ழ்               z or J          z or J
ள்               L or ld         L                       x
ற்               rh              R                       xx
ன்               nh              ^n                      xx
ஜ்               j               j
ஶ்                              sh                      xx
ஷ்               Sh or shh       Sh                      x
ஸ்               s               s
ஹ்               h               h
க்ஷ்              x               x or ksha               x

[3] Emacs being customizable, I hope it allows users to tweak the
stock ITRANS table to suit their taste. For example, it should be possible
for one to create custom ITRANS table based on the table at the bottom
of this page.

http://www.yash.info/indianLanguageConverter/tamil.html

(The above table was pointed to by Vijay)

Personally, I find some of the mappings in Vijay's table less than
satisfactory.

[4] For example,

"International Forum for Information Technology in Tamil"
- http://infitt.org/
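
The kind of per-user tweak hoped for in footnote [3] can be
approximated from an init file.  This is a minimal sketch under
assumptions, not an established customization interface: it assumes
`quail-defrule' accepts additions to the "tamil-itrans" package after
quail/indian is loaded, and that "q" is not already bound there.  It
adds AC's "q" for ஃ alongside the existing "H" from table [2].

```elisp
;; Sketch only: make "q" an alternate key for ஃ (U+0B83), following
;; the AC ITRANS column of the comparison table.  Hypothetical tweak.
(eval-after-load "quail/indian"
  '(quail-defrule "q" #x0B83 "tamil-itrans"))
```

Consonant alternates such as "R" for ற் would need one rule per vowel
combination as well, so they take more than a single `quail-defrule'
call.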

* bug#9336: 24.0.50; No way to input character #xbb4 using ta-itrans
  2011-09-28 13:00 ` Jambunathan K
@ 2011-09-29  1:46   ` Kenichi Handa
  2011-09-29  1:51   ` Kenichi Handa
  1 sibling, 0 replies; 21+ messages in thread
From: Kenichi Handa @ 2011-09-29  1:46 UTC (permalink / raw)
  To: Jambunathan K; +Cc: 9336

In article <814nzwom7o.fsf@gmail.com>, Jambunathan K <kjambunathan@gmail.com> writes:

> > I think we have not yet settled this issue:
> >   "z" or "zh" for ழ

> My recommendation is that we don't make a one-off change. 

> I am largely OK with `z' and `J' offered by you primarily because I see
> that entry listed in the table linked below

> http://www.aczoom.com/itrans/html/tamil/node5.html [1]

> I see that ta-itrans mappings in Emacs are largely in agreement with the
> mappings in the above table save for a few characters [2]. So my vote is
> for aligning Emacs's table with AC's ITRANS tables. I concur with
> Vijay's comment that the choices made in AC's table are less than
> perfect [3].

> My little research suggests that there are multiple contemporary
> and competing implementations for ta-itrans. Under these circumstances,
> it is important that the Emacs' implementation stick to just a single
> system. It would be wonderful if we could get in touch with the original
> author of ta-itrans and understand the source of Emacs's mappings.

> When time permits, I will spend some time looking at some authoritative
> websites and try to form an opinion [4].

Ok, I agree.  So, I'll close this bug.

>>> > That's strange.  My Tamil Lohit font is Version 2.4.4 (older
>>> > than yours), and it displays the line for #0BB6 correctly.
> >
>>> Do you see heart-shaped structures in that line much similar to what you
>>> see in ta-itrans-1.png that I attached earlier?
> >
> > No, I attach my image file.  I think that heart-shaped glyph
> > is actually U+25CC (DOTTED CIRCLE) used to display an
> > isolated combining character.  Which version of m17n-db are
> > you using?

> I am on Windows XP. So I believe the equivalent of m17n-db is
> uniscribe. 

> For the sake of record, the following messages suggest that #0BB6 is not
> properly rendered on Windows XP. It is apparently fixed on Windows
> Vista.

> http://unicode.org/mail-arch/unicode-ml/y2007-m07/0004.html
> http://unicode.org/mail-arch/unicode-ml/y2007-m07/0005.html

Thank you for the investigation.  So, this is not an Emacs bug.

> [3] Emacs being customizable, I hope it allows users to tweak the
> stock ITRANS table to suit their taste. For example, it should be possible
> for one to create custom ITRANS table based on the table at the bottom
> of this page.

Ok, I'll put it in my todo list, but at the moment, the
priority is low.  And I think it requires considerable
time/work.

---
Kenichi Handa
handa@m17n.org





* bug#9336: 24.0.50; No way to input character #xbb4 using ta-itrans
  2011-09-28 13:00 ` Jambunathan K
  2011-09-29  1:46   ` Kenichi Handa
@ 2011-09-29  1:51   ` Kenichi Handa
  1 sibling, 0 replies; 21+ messages in thread
From: Kenichi Handa @ 2011-09-29  1:51 UTC (permalink / raw)
  To: 9336-done







end of thread, other threads:[~2011-09-29  1:51 UTC | newest]

Thread overview: 21+ messages
2011-08-21 16:47 bug#9336: 24.0.50; No way to input character #xbb4 using ta-itrans Jambunathan K
2011-08-23  4:18 ` Kenichi Handa
2011-08-23  7:23   ` Jambunathan K
2011-08-23  7:55     ` Jambunathan K
2011-08-25  4:27     ` Kenichi Handa
2011-09-15 12:55 ` Jambunathan K
2011-09-16  7:26   ` Kenichi Handa
2011-09-20 10:51     ` Jambunathan K
2011-09-21  3:45       ` Vijay Lakshminarayanan
2011-09-21  4:14         ` Jambunathan K
2011-09-22  2:42           ` Kenichi Handa
2011-09-22  3:27           ` Vijay Lakshminarayanan
2011-09-22  2:23       ` Kenichi Handa
2011-09-23 11:21         ` Jambunathan K
2011-09-23 11:24           ` Jambunathan K
2011-09-26  7:08           ` Kenichi Handa
2011-09-26 13:53 ` Jambunathan K
2011-09-27  4:50   ` Kenichi Handa
2011-09-28 13:00 ` Jambunathan K
2011-09-29  1:46   ` Kenichi Handa
2011-09-29  1:51   ` Kenichi Handa
