* bug#56323: 29.0.50; Add new customisable phonetic Tamil input method
@ 2022-06-30 12:13 Visuwesh
2022-06-30 14:08 ` Visuwesh
` (2 more replies)
0 siblings, 3 replies; 42+ messages in thread
From: Visuwesh @ 2022-06-30 12:13 UTC (permalink / raw)
To: 56323
[-- Attachment #1: Type: text/plain, Size: 4373 bytes --]
Tags: patch
The attached patchset adds a new customisable phonetic Tamil input
method. I tried to reuse as much of the existing itrans input method
code as possible, since it greatly simplifies the creation of an Indic
input method (see `indian-make-hash').
The first patch fixes fallout from bug#50143, which asked for TAMIL OM ௐ
to be added to the itrans table; with it, one can insert the TAMIL OM
character using the tamil-itrans input methods as well. I'd prefer it
if this patch could be pushed quickly.
The second patch actually adds the new phonetic input method. I will
leave the rationale for making it a _customisable_ input method in
footnote [1]. To reuse the existing code that calculates the various
tables for the tamil-itrans IM, I turned the code in the defvars into
defuns. However, the definition of the almighty
quail-tamil-itrans-syllable-table is still huge since I needed to do a
whole lot to convert the indian-tml-base-table to a format that will be
accepted by the new defun `quail-tamil-itrans-compute-syllable-table'.
The current quail rules are inspired by the ones in
https://github.com/rnchzn/tamil-phonetic/raw/main/tamil-phonetic.el and
the comments in
https://emacsnotes.wordpress.com/2022/03/07/tamil-phonetic-input-method-in-emacs-emacs-%E0%AE%87%E0%AE%B2%E0%AF%8D-%E0%AE%A4%E0%AE%AE%E0%AE%BF%E0%AE%B4%E0%AF%8D-%E0%AE%83%E0%AE%AA%E0%AF%8A%E0%AE%A9%E0%AF%86%E0%AE%9F%E0%AE%BF%E0%AE%95%E0%AF%8D/.
Avid readers might notice that I went for a nil SIMPLE argument despite
my recent complaint in emacs-devel. The reason is that we need a way
to end the ongoing translation (C-SPC). E.g., if one decides
to transliterate ல் as "l" and ள் as "ll", then to type ல்ல the key
sequence will be
l C-SPC la
Without the C-SPC, "lla" would be translated to ள. The better way
forward would be to present _both_ ல்ல and ள for the sequence "lla", but I
have no idea how to do it. Any pointers would be _highly_ appreciated.
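To make the conflict above concrete, here is a minimal sketch of greedy
longest-match translation. It is not how quail is implemented;
`demo-rules', `demo-longest-match' and `demo-translate' are hypothetical
names used only for illustration, and a space in the input stands in for
C-SPC.

```elisp
(defvar demo-rules
  '(("l" . "ல்") ("la" . "ல") ("ll" . "ள்") ("lla" . "ள"))
  "Hypothetical key-to-text rules, for illustration only.")

(defun demo-longest-match (keys)
  "Return (TEXT . REST) for the longest prefix of KEYS found in `demo-rules'.
Return nil when no prefix of KEYS has a rule."
  (let ((len (length keys))
        found)
    (while (and (> len 0) (not found))
      (let ((hit (assoc (substring keys 0 len) demo-rules)))
        (if hit
            (setq found (cons (cdr hit) (substring keys len)))
          (setq len (1- len)))))
    found))

(defun demo-translate (keys)
  "Translate KEYS greedily; each space in KEYS ends the ongoing translation."
  (mapconcat
   (lambda (chunk)
     (let ((out ""))
       (while (> (length chunk) 0)
         (let ((m (demo-longest-match chunk)))
           (if m
               (setq out (concat out (car m))
                     chunk (cdr m))
             ;; No rule applies: pass the first key through unchanged.
             (setq out (concat out (substring chunk 0 1))
                   chunk (substring chunk 1)))))
       out))
   (split-string keys " " t)
   ""))
```

With these rules, (demo-translate "lla") yields the single syllable ள,
whereas (demo-translate "l la") yields ல்ல, which is why the explicit
break is needed.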
I plan to modify indian--puthash-char to have one-to-many translations,
i.e., "l" would translate to both ல் and ள் and then the user could decide
which one to insert. This combined with the DETERMINISTIC argument to
quail-define-package would make it an attractive option, I think. But
I'm leaving it out right now since I want the current patch to be
reviewed first.
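For reference, quail rules can already carry several candidates: when
the TRANSLATION passed to `quail-defrule' is a vector of strings, each
element is offered as a candidate. A minimal sketch under that
assumption, using a throwaway package named "demo-tamil-multi" (a
hypothetical name, not part of the patch):

```elisp
(require 'quail)

;; Throwaway package for illustration; the name and title are made up.
(quail-define-package
 "demo-tamil-multi" "Tamil" "dTM" t
 "Demo: one key sequence with two candidate translations.")

;; "l" offers both ல் and ள்; the user picks one of them.
(quail-defrule "l" ["ல்" "ள்"])
```

Whether this combines cleanly with the hash tables built by
`indian-make-hash' is untested here.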
I think adding an optional NAME argument to tamil--update-quail-rules
might be more flexible, since a user could then let-bind the relevant
defcustoms to define other Tamil input methods without hassle (like the
tamil99 layout, which I plan to get to at Some Point™). WDYT?
The code for tamil--update-quail-rules is sort of convoluted because of
the conversion mentioned above. tamil--make-trans-table is also kind of
complicated because:
1. I couldn't make the tamil-vowel-translation (and consonant, and
misc) alist have a character key since the Customize interface
shows those characters as numbers!! I really do not want to dig
into the Customize UI code, sorry. :(
2. indian-tml-base-table has the character க in it but the defcustom
tamil-consonant-translation has the character க் in it because the
latter makes more sense to a native speaker and also because of
(1) above. More explanation as to why in footnote [2].
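The mismatch in (2) comes down to appending the pulli (U+0BCD) to a
base-table entry before looking it up. A minimal sketch of that step,
with hypothetical names (`demo-consonant-translation',
`demo-pulli-form', `demo-lookup') and a two-entry excerpt in place of
the real alist:

```elisp
(defvar demo-consonant-translation
  '(("க்" "k" "g") ("க்ஷ்" "ksh"))
  "Hypothetical excerpt of a consonant alist keyed by consonant + pulli.")

(defun demo-pulli-form (consonant)
  "Return CONSONANT, a character or a string, followed by U+0BCD (pulli)."
  (if (stringp consonant)
      (concat consonant (string #x0BCD))
    (string consonant #x0BCD)))

(defun demo-lookup (consonant)
  "Return the input sequences for base-table entry CONSONANT, or nil."
  (cdr (assoc (demo-pulli-form consonant) demo-consonant-translation)))
```

Here (demo-lookup ?க) finds the entry keyed by க், so the base table can
keep bare characters while the user-facing alist shows the pulli form.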
There are some FIXMEs scattered in the code, but I will get to them in a
later revision. I also don't have a :set function for the defcustoms
since I'm not sure if something along the following lines is the only
way to automagically recalculate the quail rules:
    (defun tamil--set-variable (sym val)
      (set-default sym val)
      (when (and (boundp 'tamil-vowel-translation)
                 (boundp 'tamil-consonant-translation)
                 (boundp 'tamil-misc-translation)
                 (boundp 'tamil-native-digits))
        (tamil--update-quail-rules)))
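One possible alternative to the `boundp' guards, sketched under the
assumption that the rules are computed once explicitly after all the
options are defined: pass `custom-initialize-default' as :initialize so
the :set function does not run at definition time. All `demo-' names
below are hypothetical stand-ins, not part of the patch.

```elisp
(defvar demo-rebuild-count 0
  "Number of times `demo--rebuild' has run.")

(defun demo--rebuild ()
  "Stand-in for recomputing the quail rules."
  (setq demo-rebuild-count (1+ demo-rebuild-count)))

(defun demo--set-variable (sym val)
  "Set SYM to VAL, then recompute the rules."
  (set-default sym val)
  (demo--rebuild))

(defcustom demo-vowel-translation '(("அ" "a"))
  "Hypothetical option standing in for the real defcustoms."
  :group 'leim
  :type '(alist :key-type string :value-type (repeat string))
  ;; Do not call the :set function while the option is being defined.
  :initialize #'custom-initialize-default
  :set #'demo--set-variable)
```

Defining the option leaves the counter untouched; a later
(customize-set-variable 'demo-vowel-translation ...) goes through the
setter and triggers one rebuild.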
Comments on this, and general code review would be much appreciated.
I don't think I have missed anything and if you want me to add more
comments on some of the stuff, please do tell. Thanks.
If Tamil speakers are reading this bug report, shout at me if you want
something else and if you have other general comments. Or if I made an
embarrassing typo somewhere. Thanks!
[-- Attachment #2: 0001-Fix-fallout-from-bug-50143.patch --]
[-- Type: text/x-diff, Size: 1700 bytes --]
From 8789592426031a2608ef74c22719ed826be0353c Mon Sep 17 00:00:00 2001
From: Visuwesh <visuweshm@gmail.com>
Date: Thu, 30 Jun 2022 16:49:31 +0530
Subject: [PATCH 1/2] Fix fallout from bug#50143
* lisp/language/ind-util.el (indian-tml-base-table)
(indian-tml-base-digits-table): Add TAMIL OM sign to the table
(bug#50143), and add more Sanskrit consonants.
---
lisp/language/ind-util.el | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/lisp/language/ind-util.el b/lisp/language/ind-util.el
index 60ada03fa2..c725134a30 100644
--- a/lisp/language/ind-util.el
+++ b/lisp/language/ind-util.el
@@ -269,9 +269,9 @@ indian-tml-base-table
?ய ?ர ?ற ?ல ?ள ?ழ ?வ ;; SEMIVOWELS
nil ?ஷ ?ஸ ?ஹ ;; SIBILANTS
nil nil nil nil nil nil nil nil ;; NUKTAS
- "ஜ்ஞ" "க்ஷ")
+ "ஜ்ஞ" "க்ஷ" "க்ஷ" ?ஶ)
(;; Misc Symbols
- nil ?ஂ ?ஃ nil ?் nil nil)
+ nil ?ஂ ?ஃ nil ?் ?ௐ nil)
(;; Digits
nil nil nil nil nil nil nil nil nil nil)
(;; Inscript-extra (4) (#, $, ^, *, ])
@@ -292,9 +292,9 @@ indian-tml-base-digits-table
?ய ?ர ?ற ?ல ?ள ?ழ ?வ ;; SEMIVOWELS
nil ?ஷ ?ஸ ?ஹ ;; SIBILANTS
nil nil nil nil nil nil nil nil ;; NUKTAS
- "ஜ்ஞ" "க்ஷ")
+ "ஜ்ஞ" "க்ஷ" "க்ஷ" ?ஶ)
(;; Misc Symbols
- nil ?ஂ ?ஃ nil ?் nil nil)
+ nil ?ஂ ?ஃ nil ?் ?ௐ nil)
(;; Digits
?௦ ?௧ ?௨ ?௩ ?௪ ?௫ ?௬ ?௭ ?௮ ?௯)
(;; Inscript-extra (4) (#, $, ^, *, ])
--
2.35.1
[-- Attachment #3: 0002-Add-new-customizable-phonetic-Tamil-input-method.patch --]
[-- Type: text/x-diff, Size: 16350 bytes --]
From 7ddd045ff337b100b30bc27db38b490877c3e1ea Mon Sep 17 00:00:00 2001
From: Visuwesh <visuweshm@gmail.com>
Date: Thu, 30 Jun 2022 17:01:07 +0530
Subject: [PATCH 2/2] Add new customizable phonetic Tamil input method
* lisp/leim/quail/indian.el
(quail-tamil-itrans-compute-syllable-table): New function extracted
from..
(quail-tamil-itrans-syllable-table): ... here. Use above function.
(quail-tamil-itrans-compute-signs-table): Add new argument VARIOUS.
(quail-tamil-itrans-various-signs-and-digits-table)
(quail-tamil-itrans-various-signs-table): Adjust function call, and
add TAMIL OM sign translation.
(tamil): New phonetic Tamil input method.
(tamil-vowel-translation, tamil-consonant-translation)
(tamil-misc-translation, tamil-native-digits): New defcustoms to
change the translation rules of the input method.
(tamil-uyir-translation, tamil-mei-translation): Aliases to new
defcustom for better discoverability.
(tamil--syllable-table, tamil--signs-table, tamil--hashtables)
(tamil--vowel-signs): Internal variables used by the Tamil input
method.
(tamil--make-trans-table): Function to produce an itrans compatible
translation table.
(tamil--update-quail-rules): Function to update the translation rules
for the Tamil input method.
* lisp/language/indian.el ("Tamil"): Change the default input method
of the Tamil language environment to the phonetic input method.
---
lisp/language/indian.el | 2 +-
lisp/leim/quail/indian.el | 276 +++++++++++++++++++++++++++++---------
2 files changed, 211 insertions(+), 67 deletions(-)
diff --git a/lisp/language/indian.el b/lisp/language/indian.el
index 2887d410ad..91ad818533 100644
--- a/lisp/language/indian.el
+++ b/lisp/language/indian.el
@@ -109,7 +109,7 @@ 'devanagari
"Tamil" '((charset unicode)
(coding-system utf-8)
(coding-priority utf-8)
- (input-method . "tamil-itrans")
+ (input-method . "tamil")
(sample-text . "Tamil (தமிழ்) வணக்கம்")
(documentation . "\
South Indian Language Tamil is supported in this language environment."))
diff --git a/lisp/leim/quail/indian.el b/lisp/leim/quail/indian.el
index 8fffcc3511..120a446b44 100644
--- a/lisp/leim/quail/indian.el
+++ b/lisp/leim/quail/indian.el
@@ -127,47 +127,18 @@ "\\''"
indian-mlm-itrans-v5-hash "malayalam-itrans" "Malayalam" "MlmIT"
"Malayalam transliteration by ITRANS method.")
-(defvar quail-tamil-itrans-syllable-table
- (let ((vowels
- '(("அ" nil "a")
- ("ஆ" "ா" "A")
- ("இ" "ி" "i")
- ("ஈ" "ீ" "I")
- ("உ" "ு" "u")
- ("ஊ" "ூ" "U")
- ("எ" "ெ" "e")
- ("ஏ" "ே" "E")
- ("ஐ" "ை" "ai")
- ("ஒ" "ொ" "o")
- ("ஓ" "ோ" "O")
- ("ஔ" "ௌ" "au")))
- (consonants
- '(("க" "k") ; U+0B95
- ("ங" "N^") ; U+0B99
- ("ச" "ch") ; U+0B9A
- ("ஞ" "JN") ; U+0B9E
- ("ட" "T") ; U+0B9F
- ("ண" "N") ; U+0BA3
- ("த" "t") ; U+0BA4
- ("ந" "n") ; U+0BA8
- ("ப" "p") ; U+0BAA
- ("ம" "m") ; U+0BAE
- ("ய" "y") ; U+0BAF
- ("ர" "r") ; U+0BB0
- ("ல" "l") ; U+0BB2
- ("வ" "v") ; U+0BB5
- ("ழ" "z") ; U+0BB4
- ("ள" "L") ; U+0BB3
- ("ற" "rh") ; U+0BB1
- ("ன" "nh") ; U+0BA9
- ("ஜ" "j") ; U+0B9C
- ("ஶ" nil) ; U+0BB6
- ("ஷ" "Sh") ; U+0BB7
- ("ஸ" "s") ; U+0BB8
- ("ஹ" "h") ; U+0BB9
- ("க்ஷ" "x" ) ; U+0B95
- ))
- (virama #x0BCD)
+;; FIXME: This only accepts a single translation for vowels. Ideally,
+;; we want it to support multiple translations just like consonants.
+(defun quail-tamil-itrans-compute-syllable-table (vowels consonants)
+ "Return the syllable table for the input method as a string.
+VOWELS is a list of (VOWEL SIGN TRANS) where VOWEL is a string or
+character representing the Tamil vowel character, SIGN is the
+vowel sign corresponding to VOWEL or nil for none, and TRANS is
+the input sequence to insert VOWEL.
+CONSONANTS is a list of (CONSONANT TRANS...) where CONSONANT is
+the Tamil consonant character, and TRANS is one or more strings
+that describe how to insert CONSONANT."
+ (let ((virama #x0BCD)
clm)
(with-temp-buffer
(insert "\n")
@@ -197,21 +168,42 @@ quail-tamil-itrans-syllable-table
(insert (propertize "\t" 'display (list 'space :align-to clm))
(car c) (or (nth 1 v) ""))
(setq clm (+ clm 6)))
- (insert "\n" (or (nth 1 c) "")
- (propertize "\t" 'display '(space :align-to 4))
- "|")
- (setq clm 6)
-
- (dolist (v vowels)
- (apply #'insert (propertize "\t" 'display (list 'space :align-to clm))
- (if (nth 1 c) (list (nth 1 c) (nth 2 v)) (list "")))
- (setq clm (+ clm 6))))
+ (dolist (ct (cdr c))
+ (insert "\n" (or ct "")
+ (propertize "\t" 'display '(space :align-to 4))
+ "|")
+ (setq clm 6)
+ (dolist (v vowels)
+ (apply #'insert (propertize "\t" 'display (list 'space :align-to clm))
+ (if ct (list ct (nth 2 v)) (list "")))
+ (setq clm (+ clm 6)))))
(insert "\n")
(insert "----+")
(insert-char ?- 74)
(insert "\n")
(buffer-string))))
+(defvar quail-tamil-itrans-syllable-table
+ (quail-tamil-itrans-compute-syllable-table
+ (let ((vowels (car indian-tml-base-table))
+ trans v ret)
+ (dotimes (i (length vowels))
+ (when (setq v (nth i vowels))
+ (setq trans (nth i (car indian-itrans-v5-table-for-tamil)))
+ (push (append v (list (if (listp trans) (car trans) trans)))
+ ret)))
+ (setq ret (nreverse ret))
+ ret)
+ (let ((consonants (cadr indian-tml-base-table))
+ trans c ret)
+ (dotimes (i (length consonants))
+ (when (setq c (nth i consonants))
+ (setq trans (nth i (cadr indian-itrans-v5-table-for-tamil)))
+ (push (cons c (if (listp trans) trans (list trans)))
+ ret)))
+ (setq ret (nreverse ret))
+ ret)))
+
(defvar quail-tamil-itrans-numerics-and-symbols-table
(let ((numerics '((?௰ "பத்து") (?௱ "நூறு") (?௲ "ஆயிரம்")))
(symbols '((?௳ "நாள்") (?௴ "மாதம்") (?௵ "வருடம்")
@@ -244,25 +236,28 @@ quail-tamil-itrans-numerics-and-symbols-table
(insert "\n")
(buffer-string))))
-(defun quail-tamil-itrans-compute-signs-table (digitp)
+(defun quail-tamil-itrans-compute-signs-table (digitp various)
"Compute the signs table for the tamil-itrans input method.
-If DIGITP is non-nil, include the digits translation as well."
- (let ((various '((?ஃ . "H") ("ஸ்ரீ" . "srii") (?ௐ)))
- (digits "௦௧௨௩௪௫௬௭௮௯")
+If DIGITP is non-nil, include the digits translation as well.
+If VARIOUS is non-nil, then it should be a list of (CHAR . TRANS)
+where CHAR is the character/string to translate and TRANS is
+CHAR's translation."
+ (let ((digits "௦௧௨௩௪௫௬௭௮௯")
(width 6) clm)
(with-temp-buffer
- (insert "\n" (make-string 18 ?-) "+")
- (when digitp (insert (make-string 60 ?-)))
+ (insert "\n" (make-string 18 ?-))
+ (when digitp
+ (insert "+" (make-string 60 ?-)))
(insert "\n")
(insert
(propertize "\t" 'display '(space :align-to 5)) "various"
- (propertize "\t" 'display '(space :align-to 18)) "|")
+ (propertize "\t" 'display '(space :align-to 18)))
(when digitp
(insert
- (propertize "\t" 'display '(space :align-to 45)) "digits"))
- (insert "\n" (make-string 18 ?-) "+")
+ "|" (propertize "\t" 'display '(space :align-to 45)) "digits"))
+ (insert "\n" (make-string 18 ?-))
(when digitp
- (insert (make-string 60 ?-)))
+ (insert "+" (make-string 60 ?-)))
(insert "\n")
(setq clm 0)
@@ -270,7 +265,8 @@ quail-tamil-itrans-compute-signs-table
(insert (propertize "\t" 'display (list 'space :align-to clm))
(car (nth i various)))
(setq clm (+ clm width)))
- (insert (propertize "\t" 'display '(space :align-to 18)) "|")
+ (when digitp
+ (insert (propertize "\t" 'display '(space :align-to 18)) "|"))
(setq clm 20)
(when digitp
(dotimes (i 10)
@@ -283,23 +279,26 @@ quail-tamil-itrans-compute-signs-table
(insert (propertize "\t" 'display (list 'space :align-to clm))
(or (cdr (nth i various)) ""))
(setq clm (+ clm width)))
- (insert (propertize "\t" 'display '(space :align-to 18)) "|")
+ (when digitp
+ (insert (propertize "\t" 'display '(space :align-to 18)) "|"))
(setq clm 20)
(when digitp
(dotimes (i 10)
(insert (propertize "\t" 'display (list 'space :align-to clm))
(format "%d" i))
(setq clm (+ clm width))))
- (insert "\n" (make-string 18 ?-) "+")
+ (insert "\n" (make-string 18 ?-))
(when digitp
- (insert (make-string 60 ?-) "\n"))
+ (insert "+" (make-string 60 ?-) "\n"))
(buffer-string))))
(defvar quail-tamil-itrans-various-signs-and-digits-table
- (quail-tamil-itrans-compute-signs-table t))
+ (quail-tamil-itrans-compute-signs-table
+ t '((?ஃ . "H") ("ஸ்ரீ" . "srii") (?ௐ . "OM"))))
(defvar quail-tamil-itrans-various-signs-table
- (quail-tamil-itrans-compute-signs-table nil))
+ (quail-tamil-itrans-compute-signs-table
+ nil '((?ஃ . "H") ("ஸ்ரீ" . "srii") (?ௐ . "OM"))))
(if nil
(quail-define-package "tamil-itrans" "Tamil" "TmlIT" t "Tamil ITRANS"))
@@ -347,6 +346,151 @@ quail-tamil-itrans-various-signs-table
Full key sequences are listed below:")
+;;;
+;;; Tamil phonetic input method
+;;;
+
+(defvaralias 'tamil-uyir-translation 'tamil-vowel-translation)
+(defcustom tamil-vowel-translation
+ '(("அ" "a") ("ஆ" "aa") ("இ" "i") ("ஈ" "ii")
+ ("உ" "u") ("ஊ" "uu") ("எ" "e") ("ஏ" "ee")
+ ("ஐ" "ai") ("ஒ" "o") ("ஓ" "oo") ("ஔ" "au" "ow"))
+ "List of input sequences to translate to Tamil vowels.
+Each element should be (VOWEL . TRANSLATIONS) where VOWEL is the
+Tamil vowel character (உயிரெழுத்து) and TRANSLATIONS is the
+list of input sequences to translate to that vowel."
+ :group 'leim
+ :type '(alist :key-type string :value-type (repeat string))
+ :options (delq nil
+ (mapcar (lambda (x) (and (consp x) (string (car x))))
+ (car indian-tml-base-table))))
+
+(defvaralias 'tamil-mei-translation 'tamil-consonant-translation)
+(defcustom tamil-consonant-translation
+ '(("க்" "k" "g") ("ங்" "ng") ("ச்" "ch" "s") ("ஞ்" "nj") ("ட்" "t" "d")
+ ("ண்" "N") ("த்" "th" "dh") ("ந்" "nh") ("ப்" "p" "b") ("ம்" "m")
+ ("ய்" "y") ("ர்" "r") ("ல்" "l") ("வ்" "v") ("ழ்" "z" "zh")
+ ("ள்" "L") ("ற்" "rh") ("ன்" "n")
+ ;; Sanskrit.
+ ("ஜ்" "j") ("ஸ்" "S") ("க்ஷ்" "ksH") ("ஷ்" "sh") ("ஹ்" "h")
+ ("க்ஷ்" "ksh") ("ஶ்" "Z"))
+ "List of input sequences to translate to Tamil consonants.
+Each element should be (CONSONANT . TRANSLATIONS) where CONSONANT is
+the Tamil consonant character (மெய் எழுத்து) and TRANSLATIONS is a list
+of input sequences to translate to that consonant."
+ :group 'leim
+ :type '(alist :key-type string :value-type (repeat string))
+ :options (delq nil
+ (mapcar (lambda (x) (if (stringp x)
+ (concat x "்")
+ ;; #x0BCD = pulli/virama.
+ (and x (string x #x0BCD))))
+ (cadr indian-tml-base-table))))
+
+(defcustom tamil-misc-translation
+ ;; ஃ is not a vowel or a consonant.
+ '(("ஃ" "F" "q")
+ ("ௐ" "OM"))
+ "List of input sequences to translate to various Tamil characters.
+Each element should be (CHARACTER . TRANSLATIONS) where
+CHARACTER may be any string and TRANSLATIONS is a list of input
+sequences to translate to that CHARACTER."
+ :group 'leim
+ :type '(alist :key-type string :value-type (repeat string)))
+
+(defcustom tamil-native-digits nil
+ "When non-nil, use Tamil native digits instead of Arabic ones."
+ :group 'leim
+ :type 'boolean)
+
+(defvar tamil--syllable-table nil)
+(defvar tamil--signs-table nil)
+(defvar tamil--hashtables nil)
+(defvar tamil--vowel-signs
+ '(("அ" . nil) ("ஆ" . ?ா) ("இ" . ?ி) ("ஈ" . ?ீ)
+ ("உ" . ?ு) ("ஊ" . ?ூ) ("எ" . ?ெ) ("ஏ" . ?ே)
+ ("ஐ" . ?ை) ("ஓ" . ?ோ) ("ஒ" . ?ொ) ("ஔ" . ?ௌ)))
+
+(defun tamil--make-trans-table ()
+ `((;; Vowels.
+ ,@(mapcar
+ (lambda (v) (assoc-default (and v (string (car v))) tamil-vowel-translation))
+ (car indian-tml-base-table)))
+ (;; Consonants.
+ ,@(mapcar
+ (lambda (c)
+ (when c
+ (assoc-default (if (stringp c)
+ (concat c "்")
+ ;; #x0BCD = pulli/virama.
+ (string c #x0BCD))
+ tamil-consonant-translation)))
+ (nth 1 indian-tml-base-table)))
+ ;; Misc symbols. We will ignore the base table here.
+ ()))
+
+(defun tamil--update-quail-rules ()
+ (let ((quail-current-package (assoc "tamil" quail-package-alist))
+ (hts (indian-make-hash (if tamil-native-digits
+ indian-tml-base-digits-table
+ indian-tml-base-table)
+ (tamil--make-trans-table))))
+ ;; Do the misc characters here.
+ (indian--puthash-m (mapcar #'car tamil-misc-translation)
+ (mapcar #'cdr tamil-misc-translation)
+ hts)
+ (setq tamil--hashtables hts
+ tamil--syllable-table
+ (quail-tamil-itrans-compute-syllable-table
+ (mapcar (lambda (v) (list (car v) (assoc-default (car v) tamil--vowel-signs)
+ (cadr v)))
+ tamil-vowel-translation)
+ (mapcar (lambda (c) (cons (substring (car c) 0 -1) (cdr c)))
+ tamil-consonant-translation))
+ tamil--signs-table
+ ;; FIXME: This should also show how to input ஸ்ரீ (default: Srii).
+ (quail-tamil-itrans-compute-signs-table
+ tamil-native-digits
+ (mapcar (lambda (m) (cons (car m) (cadr m))) tamil-misc-translation)))
+ ;; (nth 2 ...) = quail-map.
+ (setf (nth 2 quail-current-package) '(nil))
+ (maphash (lambda (k v)
+ (quail-defrule k (if (length= v 1)
+ (string-to-char v)
+ (vector v))))
+ (cdr hts))))
+
+(quail-define-package
+ "tamil" "Tamil" "ழ" t
+ "Customisable Tamil phonetic input method.
+To change the translation of vowels (உயிரெழுத்துக்கள்), customize `tamil-vowel-translation'.
+To change the translation of consonants (மெய் எழுத்துக்கள்), customize
+ `tamil-consonant-translation'.
+To input miscellaneous characters (including ஃ), customize
+ `tamil-misc-translation'.
+To use native Tamil digits, customize `tamil-native-digits'.
+
+To end the current translation process, say \\<quail-translation-keymap>\\[quail-select-current] (defined in
+`quail-translation-keymap'). This is useful when there's a
+ conflict between two possible translations.
+
+The current input scheme is:
+
+### Basic syllables (உயிர்மெய் எழுத்துக்கள்) ###
+\\=\\<tamil--syllable-table>
+
+### Miscellaneous ####
+\\=\\<tamil--signs-table>
+
+The following characters have NO input sequence associated with
+them by default. Their descriptions are included for easy
+reference.
+\\=\\<quail-tamil-itrans-numerics-and-symbols-table>
+
+Full key sequences are listed below:"
+ nil nil nil nil nil nil t)
+(tamil--update-quail-rules)
+
;;;
;;; Input by Inscript
;;;
--
2.35.1
[-- Attachment #4: Type: text/plain, Size: 1862 bytes --]
---
Footnotes:
1. The itrans input method is absolutely horrible for Tamil since, unlike
the other Indic languages, Tamil doesn't have a lot of consonants;
however, the consonant sound _changes_ depending on where it ends up.
So ideally, the Tamil input method should allow multiple _ways_ to
insert a single character. As an example, consider the following
words
தும்பிக்கை - thumbikai (tusk)
படம் - padam (photograph/image)
The consonant of interest is "ப". The letter "பி" is pronounced in
the first word as "bi" as in "bicycle"; however, the letter "ப" in the
second word is pronounced as "pa" as in "party". This is just one of
many examples.
There are also pairs of very similar sounding consonants and when
transliterated (when you type in "Tanglish" for example), all the
characters in the pair use the same letter. E.g., such a pair is
the ல/ள family; when one casually chats in "Tanglish", one just types
"lXX" as the transliteration for that family. Obviously, when one
is typing in _Tamil_, he/she needs to distinguish between these two
characters. Leaving the choice of input sequence to transliterate
these characters to the writer is much better. For more, please
read the wordpress article I linked, thanks.
2. Opting not to use a character key in tamil-consonant-translation
because of the Customize interface is only part of the reason.
Having the key be TAMIL LETTER XXX + TAMIL SIGN VIRAMA is much more
intuitive for the native speaker. Take பு for example, the way you
break it down into consonant and vowel is
ப் + உ = பு
(ippu + u = pu)
and NOT
ப + உ = பு
(pa + u = pu)
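
The breakdown in footnote 2 can be sketched as a small function: drop
the pulli from the consonant, then append the dependent vowel sign, if
any. `demo-vowel-signs' and `demo-combine' are hypothetical names, and
the table is only an excerpt for illustration.

```elisp
(defvar demo-vowel-signs
  '(("அ" . nil) ("ஆ" . ?ா) ("இ" . ?ி) ("உ" . ?ு))
  "Hypothetical excerpt mapping a vowel to its dependent sign, or nil.")

(defun demo-combine (mei uyir)
  "Combine MEI (consonant + pulli) with the vowel UYIR into one syllable.
A vowel without a sign (or one missing from the excerpt) leaves the bare
consonant."
  (let ((base (substring mei 0 -1))     ; drop the trailing pulli
        (sign (cdr (assoc uyir demo-vowel-signs))))
    (if sign
        (concat base (string sign))
      base)))
```

So (demo-combine "ப்" "உ") gives பு, and (demo-combine "ப்" "அ") gives the
bare ப, since அ has no dependent sign.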
* bug#56323: 29.0.50; Add new customisable phonetic Tamil input method
2022-06-30 12:13 bug#56323: 29.0.50; Add new customisable phonetic Tamil input method Visuwesh
@ 2022-06-30 14:08 ` Visuwesh
2022-06-30 15:53 ` Visuwesh
2022-07-01 12:59 ` bug#56323: 29.0.50; [v2] " Visuwesh
2 siblings, 0 replies; 42+ messages in thread
From: Visuwesh @ 2022-06-30 14:08 UTC (permalink / raw)
To: 56323
[-- Attachment #1: Type: text/plain, Size: 623 bytes --]
[Thursday June 30, 2022] Visuwesh wrote:
> Tags: patch
>
> The attached patchset adds a new customisable phonetic Tamil input
> method. I tried to reuse as much of the existing itrans input method
> code since it greatly simplifies the creation of an Indic input method
> (see `indian-make-hash').
>
> The first patch fixes a fallout from bug#50143 asking to add TAMIL OM ௐ
> to the itrans table, and this means that one can insert the TAMIL OM
> character using the tamil-itrans input methods as well. I'd prefer it
> if this patch can be pushed quickly.
This should be better:
[-- Attachment #2: 0001-Fix-fallout-from-bug-50143.patch --]
[-- Type: text/x-diff, Size: 2056 bytes --]
From 35a75604f23ee28e6f0c69ae540c6bf6598cca1a Mon Sep 17 00:00:00 2001
From: Visuwesh <visuweshm@gmail.com>
Date: Thu, 30 Jun 2022 19:36:41 +0530
Subject: [PATCH] Fix fallout from bug#50143
* lisp/language/ind-util.el (indian-tml-base-table)
(indian-tml-base-digits-table): Add TAMIL OM sign and more Sanskrit
consonants to the table (bug#50143) (bug#56323).
---
lisp/language/ind-util.el | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/lisp/language/ind-util.el b/lisp/language/ind-util.el
index 60ada03fa2..fa380dbde7 100644
--- a/lisp/language/ind-util.el
+++ b/lisp/language/ind-util.el
@@ -267,11 +267,11 @@ indian-tml-base-table
?த nil nil nil ?ந ?ன ;; DENTALS
?ப nil nil nil ?ம ;; LABIALS
?ய ?ர ?ற ?ல ?ள ?ழ ?வ ;; SEMIVOWELS
- nil ?ஷ ?ஸ ?ஹ ;; SIBILANTS
+ ?ஶ ?ஷ ?ஸ ?ஹ ;; SIBILANTS
nil nil nil nil nil nil nil nil ;; NUKTAS
- "ஜ்ஞ" "க்ஷ")
+ "ஜ்ஞ" "க்ஷ" "க்ஷ்")
(;; Misc Symbols
- nil ?ஂ ?ஃ nil ?் nil nil)
+ nil ?ஂ ?ஃ nil ?் ?ௐ nil)
(;; Digits
nil nil nil nil nil nil nil nil nil nil)
(;; Inscript-extra (4) (#, $, ^, *, ])
@@ -290,11 +290,11 @@ indian-tml-base-digits-table
?த nil nil nil ?ந ?ன ;; DENTALS
?ப nil nil nil ?ம ;; LABIALS
?ய ?ர ?ற ?ல ?ள ?ழ ?வ ;; SEMIVOWELS
- nil ?ஷ ?ஸ ?ஹ ;; SIBILANTS
+ ?ஶ ?ஷ ?ஸ ?ஹ ;; SIBILANTS
nil nil nil nil nil nil nil nil ;; NUKTAS
- "ஜ்ஞ" "க்ஷ")
+ "ஜ்ஞ" "க்ஷ" "க்ஷ்")
(;; Misc Symbols
- nil ?ஂ ?ஃ nil ?் nil nil)
+ nil ?ஂ ?ஃ nil ?் ?ௐ nil)
(;; Digits
?௦ ?௧ ?௨ ?௩ ?௪ ?௫ ?௬ ?௭ ?௮ ?௯)
(;; Inscript-extra (4) (#, $, ^, *, ])
--
2.35.1
[-- Attachment #3: Type: text/plain, Size: 200 bytes --]
[ Ref. https://www.aczoom.com/itrans/online/; insert "sh" and compare
the character that shows up in the Sanskrit panel and the Tamil panel
(you have to change the language in another panel). ]
* bug#56323: 29.0.50; Add new customisable phonetic Tamil input method
2022-06-30 12:13 bug#56323: 29.0.50; Add new customisable phonetic Tamil input method Visuwesh
2022-06-30 14:08 ` Visuwesh
@ 2022-06-30 15:53 ` Visuwesh
2022-07-01 12:59 ` bug#56323: 29.0.50; [v2] " Visuwesh
2 siblings, 0 replies; 42+ messages in thread
From: Visuwesh @ 2022-06-30 15:53 UTC (permalink / raw)
To: 56323
[Thursday June 30, 2022] Visuwesh wrote:
> 1. The itrans input method is absolutely horrible for Tamil since unlike
> the other Indic languages, it doesn't have a lot of consonants
> HOWEVER, the consonant sound _changes_ depending on where it ends up.
> So ideally, the Tamil input method show allow multiple _ways_ to
> insert a single character. As an example, consider the following
> words
>
> தும்பிக்கை - thumbikai (tusk)
^^^^^
I meant trunk, ofc.
As is usual, I keep messing up translations.
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-06-30 12:13 bug#56323: 29.0.50; Add new customisable phonetic Tamil input method Visuwesh
2022-06-30 14:08 ` Visuwesh
2022-06-30 15:53 ` Visuwesh
@ 2022-07-01 12:59 ` Visuwesh
2022-07-01 13:01 ` Visuwesh
2022-07-01 13:22 ` Eli Zaretskii
2 siblings, 2 replies; 42+ messages in thread
From: Visuwesh @ 2022-07-01 12:59 UTC (permalink / raw)
To: 56323
[-- Attachment #1: Type: text/plain, Size: 693 bytes --]
[Thursday June 30, 2022] Visuwesh wrote:
> The second patch actually adds the new phonetic input method. I will
> leave the rationale for making it a _customisable_ input method in
> footnote [1]. To reuse the existing code that calculates the various
> tables for the tamil-itrans IM, I turned the code in defvars to defuns.
> However, the definition of the almighty
> quail-tamil-itrans-syllable-table is still huge since I needed to do a
> whole lot to convert the indian-tml-base-table to a format that will
> accepted by the new defun `quail-tamil-itrans-compute-syllable-table'.
> [blah blah blah...]
Here's a second revision of the second patch.
[-- Attachment #2: 0001-Add-new-customizable-phonetic-Tamil-input-method.patch --]
[-- Type: text/x-diff, Size: 17608 bytes --]
From 232e58dc10cce110870697c85721411822d6de64 Mon Sep 17 00:00:00 2001
From: Visuwesh <visuweshm@gmail.com>
Date: Thu, 30 Jun 2022 17:01:07 +0530
Subject: [PATCH] Add new customizable phonetic Tamil input method
* lisp/leim/quail/indian.el
(quail-tamil-itrans-compute-syllable-table): New function extracted
from..
(quail-tamil-itrans-syllable-table): ... here. Use above function.
(quail-tamil-itrans-compute-signs-table): Add new argument VARIOUS.
(quail-tamil-itrans-various-signs-and-digits-table)
(quail-tamil-itrans-various-signs-table): Adjust function call, and
add TAMIL OM sign translation.
(tamil): New phonetic Tamil input method.
(tamil-vowel-translation, tamil-consonant-translation)
(tamil-misc-translation, tamil-native-digits): New defcustoms to
change the translation rules of the input method.
(tamil-uyir-translation, tamil-mei-translation): Aliases to new
defcustom for better discoverability.
(tamil--syllable-table, tamil--signs-table, tamil--hashtables)
(tamil--vowel-signs): Internal variables used by the Tamil input
method.
(tamil--make-tables): Function to produce vowels, consonants, and
their translations.
(tamil--update-quail-rules): Function to update the translation rules
for the Tamil input method.
* lisp/language/indian.el ("Tamil"): Change the default input method
of the Tamil language environment to the phonetic input method.
---
lisp/language/indian.el | 2 +-
lisp/leim/quail/indian.el | 301 +++++++++++++++++++++++++++++---------
2 files changed, 235 insertions(+), 68 deletions(-)
diff --git a/lisp/language/indian.el b/lisp/language/indian.el
index 2887d410ad..91ad818533 100644
--- a/lisp/language/indian.el
+++ b/lisp/language/indian.el
@@ -109,7 +109,7 @@ 'devanagari
"Tamil" '((charset unicode)
(coding-system utf-8)
(coding-priority utf-8)
- (input-method . "tamil-itrans")
+ (input-method . "tamil")
(sample-text . "Tamil (தமிழ்) வணக்கம்")
(documentation . "\
South Indian Language Tamil is supported in this language environment."))
diff --git a/lisp/leim/quail/indian.el b/lisp/leim/quail/indian.el
index 8fffcc3511..d7d33fb68f 100644
--- a/lisp/leim/quail/indian.el
+++ b/lisp/leim/quail/indian.el
@@ -127,47 +127,19 @@ "\\''"
indian-mlm-itrans-v5-hash "malayalam-itrans" "Malayalam" "MlmIT"
"Malayalam transliteration by ITRANS method.")
-(defvar quail-tamil-itrans-syllable-table
- (let ((vowels
- '(("அ" nil "a")
- ("ஆ" "ா" "A")
- ("இ" "ி" "i")
- ("ஈ" "ீ" "I")
- ("உ" "ு" "u")
- ("ஊ" "ூ" "U")
- ("எ" "ெ" "e")
- ("ஏ" "ே" "E")
- ("ஐ" "ை" "ai")
- ("ஒ" "ொ" "o")
- ("ஓ" "ோ" "O")
- ("ஔ" "ௌ" "au")))
- (consonants
- '(("க" "k") ; U+0B95
- ("ங" "N^") ; U+0B99
- ("ச" "ch") ; U+0B9A
- ("ஞ" "JN") ; U+0B9E
- ("ட" "T") ; U+0B9F
- ("ண" "N") ; U+0BA3
- ("த" "t") ; U+0BA4
- ("ந" "n") ; U+0BA8
- ("ப" "p") ; U+0BAA
- ("ம" "m") ; U+0BAE
- ("ய" "y") ; U+0BAF
- ("ர" "r") ; U+0BB0
- ("ல" "l") ; U+0BB2
- ("வ" "v") ; U+0BB5
- ("ழ" "z") ; U+0BB4
- ("ள" "L") ; U+0BB3
- ("ற" "rh") ; U+0BB1
- ("ன" "nh") ; U+0BA9
- ("ஜ" "j") ; U+0B9C
- ("ஶ" nil) ; U+0BB6
- ("ஷ" "Sh") ; U+0BB7
- ("ஸ" "s") ; U+0BB8
- ("ஹ" "h") ; U+0BB9
- ("க்ஷ" "x" ) ; U+0B95
- ))
- (virama #x0BCD)
+;; FIXME: This only accepts a single translation for vowels. Ideally,
+;; we want it to support multiple translations just like consonants.
+;; This also does not sort by vowel or consonant. :(
+(defun quail-tamil-itrans-compute-syllable-table (vowels consonants)
+ "Return the syllable table for the input method as a string.
+VOWELS is a list of (VOWEL SIGN TRANS) where VOWEL is a string or
+character representing the Tamil vowel character, SIGN is the
+vowel sign corresponding to VOWEL or nil for none, and TRANS is
+the input sequence to insert VOWEL.
+CONSONANTS is a list of (CONSONANT TRANS...) where CONSONANT is
+the Tamil consonant character, and TRANS is one or more strings
+that describe how to insert CONSONANT."
+ (let ((virama #x0BCD)
clm)
(with-temp-buffer
(insert "\n")
@@ -197,21 +169,42 @@ quail-tamil-itrans-syllable-table
(insert (propertize "\t" 'display (list 'space :align-to clm))
(car c) (or (nth 1 v) ""))
(setq clm (+ clm 6)))
- (insert "\n" (or (nth 1 c) "")
- (propertize "\t" 'display '(space :align-to 4))
- "|")
- (setq clm 6)
-
- (dolist (v vowels)
- (apply #'insert (propertize "\t" 'display (list 'space :align-to clm))
- (if (nth 1 c) (list (nth 1 c) (nth 2 v)) (list "")))
- (setq clm (+ clm 6))))
+ (dolist (ct (cdr c))
+ (insert "\n" (or ct "")
+ (propertize "\t" 'display '(space :align-to 4))
+ "|")
+ (setq clm 6)
+ (dolist (v vowels)
+ (apply #'insert (propertize "\t" 'display (list 'space :align-to clm))
+ (if ct (list ct (nth 2 v)) (list "")))
+ (setq clm (+ clm 6)))))
(insert "\n")
(insert "----+")
(insert-char ?- 74)
(insert "\n")
(buffer-string))))
+(defvar quail-tamil-itrans-syllable-table
+ (quail-tamil-itrans-compute-syllable-table
+ (let ((vowels (car indian-tml-base-table))
+ trans v ret)
+ (dotimes (i (length vowels))
+ (when (setq v (nth i vowels))
+ (setq trans (nth i (car indian-itrans-v5-table-for-tamil)))
+ (push (append v (list (if (listp trans) (car trans) trans)))
+ ret)))
+ (setq ret (nreverse ret))
+ ret)
+ (let ((consonants (cadr indian-tml-base-table))
+ trans c ret)
+ (dotimes (i (length consonants))
+ (when (setq c (nth i consonants))
+ (setq trans (nth i (cadr indian-itrans-v5-table-for-tamil)))
+ (push (cons c (if (listp trans) trans (list trans)))
+ ret)))
+ (setq ret (nreverse ret))
+ ret)))
+
(defvar quail-tamil-itrans-numerics-and-symbols-table
(let ((numerics '((?௰ "பத்து") (?௱ "நூறு") (?௲ "ஆயிரம்")))
(symbols '((?௳ "நாள்") (?௴ "மாதம்") (?௵ "வருடம்")
@@ -244,25 +237,28 @@ quail-tamil-itrans-numerics-and-symbols-table
(insert "\n")
(buffer-string))))
-(defun quail-tamil-itrans-compute-signs-table (digitp)
+(defun quail-tamil-itrans-compute-signs-table (digitp various)
"Compute the signs table for the tamil-itrans input method.
-If DIGITP is non-nil, include the digits translation as well."
- (let ((various '((?ஃ . "H") ("ஸ்ரீ" . "srii") (?ௐ)))
- (digits "௦௧௨௩௪௫௬௭௮௯")
+If DIGITP is non-nil, include the digits translation as well.
+If VARIOUS is non-nil, then it should be a list of (CHAR TRANS)
+where CHAR is the character/string to translate and TRANS is
+CHAR's translation."
+ (let ((digits "௦௧௨௩௪௫௬௭௮௯")
(width 6) clm)
(with-temp-buffer
- (insert "\n" (make-string 18 ?-) "+")
- (when digitp (insert (make-string 60 ?-)))
+ (insert "\n" (make-string 18 ?-))
+ (when digitp
+ (insert "+" (make-string 60 ?-)))
(insert "\n")
(insert
(propertize "\t" 'display '(space :align-to 5)) "various"
- (propertize "\t" 'display '(space :align-to 18)) "|")
+ (propertize "\t" 'display '(space :align-to 18)))
(when digitp
(insert
- (propertize "\t" 'display '(space :align-to 45)) "digits"))
- (insert "\n" (make-string 18 ?-) "+")
+ "|" (propertize "\t" 'display '(space :align-to 45)) "digits"))
+ (insert "\n" (make-string 18 ?-))
(when digitp
- (insert (make-string 60 ?-)))
+ (insert "+" (make-string 60 ?-)))
(insert "\n")
(setq clm 0)
@@ -270,7 +266,8 @@ quail-tamil-itrans-compute-signs-table
(insert (propertize "\t" 'display (list 'space :align-to clm))
(car (nth i various)))
(setq clm (+ clm width)))
- (insert (propertize "\t" 'display '(space :align-to 18)) "|")
+ (when digitp
+ (insert (propertize "\t" 'display '(space :align-to 18)) "|"))
(setq clm 20)
(when digitp
(dotimes (i 10)
@@ -281,25 +278,28 @@ quail-tamil-itrans-compute-signs-table
(setq clm 0)
(dotimes (i (length various))
(insert (propertize "\t" 'display (list 'space :align-to clm))
- (or (cdr (nth i various)) ""))
+ (or (cadr (nth i various)) ""))
(setq clm (+ clm width)))
- (insert (propertize "\t" 'display '(space :align-to 18)) "|")
+ (when digitp
+ (insert (propertize "\t" 'display '(space :align-to 18)) "|"))
(setq clm 20)
(when digitp
(dotimes (i 10)
(insert (propertize "\t" 'display (list 'space :align-to clm))
(format "%d" i))
(setq clm (+ clm width))))
- (insert "\n" (make-string 18 ?-) "+")
+ (insert "\n" (make-string 18 ?-))
(when digitp
- (insert (make-string 60 ?-) "\n"))
+ (insert "+" (make-string 60 ?-) "\n"))
(buffer-string))))
(defvar quail-tamil-itrans-various-signs-and-digits-table
- (quail-tamil-itrans-compute-signs-table t))
+ (quail-tamil-itrans-compute-signs-table
+ t '((?ஃ "H") ("ஸ்ரீ" "srii") (?ௐ "OM"))))
(defvar quail-tamil-itrans-various-signs-table
- (quail-tamil-itrans-compute-signs-table nil))
+ (quail-tamil-itrans-compute-signs-table
+ nil '((?ஃ "H") ("ஸ்ரீ" "srii") (?ௐ "OM"))))
(if nil
(quail-define-package "tamil-itrans" "Tamil" "TmlIT" t "Tamil ITRANS"))
@@ -347,6 +347,173 @@ quail-tamil-itrans-various-signs-table
Full key sequences are listed below:")
+;;;
+;;; Tamil phonetic input method
+;;;
+
+(defvaralias 'tamil-uyir-translation 'tamil-vowel-translation)
+(defcustom tamil-vowel-translation
+ '(("அ" "a") ("ஆ" "aa") ("இ" "i") ("ஈ" "ii")
+ ("உ" "u") ("ஊ" "uu") ("எ" "e") ("ஏ" "ee")
+ ("ஐ" "ai") ("ஒ" "o") ("ஓ" "oo") ("ஔ" "au" "ow"))
+ "List of input sequences to translate to Tamil vowels.
+Each element should be (VOWEL . TRANSLATIONS) where VOWEL is the
+Tamil vowel character (உயிரெழுத்து) and TRANSLATIONS is the
+list of input sequences to translate to that vowel."
+ :group 'leim
+ :version "29.1"
+ :type '(alist :key-type string :value-type (repeat string))
+ :options (delq nil
+ (mapcar (lambda (x) (and (consp x) (string (car x))))
+ (car indian-tml-base-table))))
+
+(defvaralias 'tamil-mei-translation 'tamil-consonant-translation)
+(defcustom tamil-consonant-translation
+ '(("க்" "k" "g") ("ங்" "ng") ("ச்" "ch" "s") ("ஞ்" "nj") ("ட்" "t" "d")
+ ("ண்" "N") ("த்" "th" "dh") ("ந்" "nh") ("ப்" "p" "b") ("ம்" "m")
+ ("ய்" "y") ("ர்" "r") ("ல்" "l") ("வ்" "v") ("ழ்" "z" "zh")
+ ("ள்" "L") ("ற்" "rh") ("ன்" "n")
+ ;; Sanskrit.
+ ("ஜ்" "j") ("ஸ்" "S") ("க்ஷ்" "ksH") ("ஷ்" "sh") ("ஹ்" "h")
+ ("க்ஷ்" "ksh") ("ஶ்" "Z"))
+ "List of input sequences to translate to Tamil consonants.
+Each element should be (CONSONANT . TRANSLATIONS) where CONSONANT is the
+Tamil consonant character (மெய் எழுத்து) and TRANSLATIONS is a list
+of input sequences to translate to that consonant."
+ :group 'leim
+ :version "29.1"
+ :type '(alist :key-type string :value-type (repeat string))
+ :options (delq nil
+ (mapcar (lambda (x) (if (stringp x)
+ (concat x "்")
+ ;; #x0BCD = pulli/virama.
+ (and x (string x #x0BCD))))
+ (cadr indian-tml-base-table))))
+
+(defcustom tamil-misc-translation
+ ;; ஃ is not a vowel or a consonant.
+ '(("ஃ" "F" "q")
+ ("ௐ" "OM"))
+ "List of input sequences to translate to various Tamil characters.
+Each element should be (CHARACTER . TRANSLATIONS) where
+CHARACTER may be any string and TRANSLATIONS is a list of input
+sequences to translate to that CHARACTER."
+ :group 'leim
+ :version "29.1"
+ :type '(alist :key-type string :value-type (repeat string)))
+
+(defcustom tamil-native-digits nil
+ "When non-nil, use Tamil native digits instead of Arabic ones."
+ :group 'leim
+ :version "29.1"
+ :type 'boolean)
+
+(defvar tamil--syllable-table nil)
+(defvar tamil--signs-table nil)
+(defvar tamil--hashtables
+ (cons (make-hash-table :test #'equal)
+ (make-hash-table :test #'equal)))
+(defvar tamil--vowel-signs
+ '(("அ" . nil) ("ஆ" . ?ா) ("இ" . ?ி) ("ஈ" . ?ீ)
+ ("உ" . ?ு) ("ஊ" . ?ூ) ("எ" . ?ெ) ("ஏ" . ?ே)
+ ("ஐ" . ?ை) ("ஒ" . ?ொ) ("ஓ" . ?ோ) ("ஔ" . ?ௌ)))
+
+(defun tamil--make-tables ()
+ "Return vowels, consonants, and their translation rules.
+The returned table is a list (VOWELS CONSONANTS VTRANS CTRANS)
+where VOWELS is a list of (VOWEL VOWEL-SIGN) where VOWEL is the
+Tamil vowel character, and VOWEL-SIGN is its sign, CONSONANTS is
+a list of consonants that have translation rules.
+VTRANS is a list of translation rules for vowels in VOWELS in the
+order they appear, CTRANS is a list of translation rules for
+consonants in CONSONANTS."
+ (let (c-table v-table v-trans c-trans)
+ (dolist (v tamil-vowel-translation)
+ (push (list (car v) (assoc-default (car v) tamil--vowel-signs)) v-table)
+ (push (cdr v) v-trans))
+ (dolist (c tamil-consonant-translation)
+ (push (cdr c) c-trans)
+ ;; Remove pulli/virama from consonant entry.
+ (push (substring (car c) 0 -1) c-table))
+ ;; FIXME: Remove when sorting is done properly!
+ (setq c-table (nreverse c-table)
+ v-table (nreverse v-table)
+ c-trans (nreverse c-trans)
+ v-trans (nreverse v-trans))
+ (list v-table c-table
+ v-trans c-trans)))
+
+(defun tamil--update-quail-rules ()
+ ;; This function does pretty much what `indian-make-hash' does
+ ;; except that we don't try to copy the structure of
+ ;; `indian-tml-base-table' which leads to less code hassle.
+ (let* ((tables (tamil--make-tables))
+ (vowels (nth 0 tables))
+ (vowels-trans (nth 2 tables))
+ (consonants (nth 1 tables))
+ (consonants-trans (nth 3 tables))
+ (pulli (string #x0BCD)))
+ (clrhash (car tamil--hashtables))
+ (clrhash (cdr tamil--hashtables))
+ (indian--puthash-v vowels vowels-trans tamil--hashtables)
+ (indian--puthash-c consonants consonants-trans pulli tamil--hashtables)
+ (indian--puthash-cv consonants consonants-trans
+ vowels vowels-trans tamil--hashtables)
+ (indian--puthash-m (mapcar #'car tamil-misc-translation)
+ (mapcar #'cdr tamil-misc-translation)
+ tamil--hashtables)
+ (when tamil-native-digits
+ (indian--puthash-m (nth 3 indian-tml-base-digits-table)
+ '("0" "1" "2" "3" "4" "5" "6" "7" "8" "9")
+ tamil--hashtables))
+ ;; Now override the current translation rules.
+ ;; Empty quail map is '(list nil)'.
+ (setf (nth 2 quail-current-package) '(()))
+ (maphash (lambda (k v)
+ (quail-defrule k (if (length= v 1)
+ (string-to-char v)
+ (vector v))))
+ (cdr tamil--hashtables))
+ (setq tamil--syllable-table
+ (quail-tamil-itrans-compute-syllable-table
+ (mapcar (lambda (v) (append v (pop vowels-trans))) vowels)
+ (mapcar (lambda (c) (cons c (pop consonants-trans))) consonants))
+ tamil--signs-table
+ ;; FIXME: This should also show how to input ஸ்ரீ (default: Srii).
+ (quail-tamil-itrans-compute-signs-table
+ tamil-native-digits tamil-misc-translation))))
+
+(quail-define-package
+ "tamil" "Tamil" "ழ" t
+ "Customisable Tamil phonetic input method.
+To change the translation of vowels (உயிரெழுத்துக்கள்), customize `tamil-vowel-translation'.
+To change the translation of consonants (மெய் எழுத்துக்கள்), customize
+ `tamil-consonant-translation'.
+To input miscellaneous characters (including ஃ), customize
+ `tamil-misc-translation'.
+To use native Tamil digits, customize `tamil-native-digits'.
+
+To end the current translation process, say \\<quail-translation-keymap>\\[quail-select-current] (defined in
+`quail-translation-keymap'). This is useful when there's a
+ conflict between two possible translations.
+
+The current input scheme is:
+
+### Basic syllables (உயிர்மெய் எழுத்துக்கள்) ###
+\\=\\<tamil--syllable-table>
+
+### Miscellaneous ####
+\\=\\<tamil--signs-table>
+
+The following characters have NO input sequence associated with
+them by default. Their descriptions are included for easy
+reference.
+\\=\\<quail-tamil-itrans-numerics-and-symbols-table>
+
+Full key sequences are listed below:"
+ nil nil nil nil nil nil t)
+(tamil--update-quail-rules)
+
;;;
;;; Input by Inscript
;;;
--
2.35.1
[-- Attachment #3: Type: text/plain, Size: 1535 bytes --]
I haven't added a :set function yet since I'm not sure if there's
a way to avoid the chain of boundp checks.
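One possible way around the boundp checks, sketched under assumptions: by default a defcustom runs its :set function while the defcustom form itself is being evaluated, which is when the other options may still be unbound. Passing `custom-initialize-default' as :initialize skips that first call. Every `my-tamil-demo-' name below is hypothetical and stands in for the real options and for `tamil--update-quail-rules'; whether this fits the patch has not been checked.

```elisp
(defvar my-tamil-demo-rebuild-count 0
  "Number of times the hypothetical rebuild function has run.")

(defun my-tamil-demo-rebuild ()
  "Hypothetical stand-in for `tamil--update-quail-rules'."
  (setq my-tamil-demo-rebuild-count (1+ my-tamil-demo-rebuild-count)))

(defun my-tamil-demo-set (symbol value)
  "Set SYMBOL to VALUE, then rebuild the translation rules."
  (set-default symbol value)
  (my-tamil-demo-rebuild))

(defcustom my-tamil-demo-translation '(("அ" "a"))
  "Hypothetical option mirroring `tamil-vowel-translation'."
  :type '(alist :key-type string :value-type (repeat string))
  ;; Do not call the :set function while this form is evaluated;
  ;; the other options may not be defined at that point.
  :initialize #'custom-initialize-default
  :set #'my-tamil-demo-set
  :group 'leim)
```

With this, evaluating the defcustom leaves the counter at 0, and a later (customize-set-variable 'my-tamil-demo-translation ...) runs the rebuild once. The rules would still need one explicit rebuild at the end of the file, as the patch already does.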
In this revision, I simplified the code a tiny bit wrt calculating the
translation table: I no longer use the indian-make-hash function
but call the functions it calls directly in
tamil--update-quail-rules, which greatly reduces the amount of massaging
that needs to be done.
Also, can someone guide me to write a sort function for
quail-tamil-itrans-compute-syllable-table please? The ideal order of
consonants should be the same as the one in the default value of
tamil-consonant-translation, same for tamil-vowel-translation. I tried
the following
(sort (reverse (mapcar #'car tamil-consonant-translation))
(lambda (x y) (let ((lx (length x))
(ly (length y)))
(if (= lx ly) (string-lessp x y) (< lx ly)))))
but that definitely doesn't do what I want. The idea was to sort the
list so that the basic consonants (க் ங் ச் etc.) come first, then the composite
ones (க்ஷ் க்ஷ் etc.) but `string-lessp' does not even sort the basic
consonants in the right order (the right order being the order in the
default value of `tamil-consonant-translation').
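To illustrate why this happens, here is a small sketch: `string-lessp' compares strings character by character using character codes, so the result follows the layout of the Unicode Tamil block rather than the traditional order. The helper name `my-tamil-code-points' is hypothetical and only there for inspection.

```elisp
(defun my-tamil-code-points (str)
  "Return the code points of STR as a list of \"U+XXXX\" strings."
  (mapcar (lambda (ch) (format "U+%04X" ch)) str))

;; Each consonant below is a base letter followed by the pulli U+0BCD.
;; ன is U+0BA9, ற is U+0BB1, ள is U+0BB3 and ழ is U+0BB4.
(my-tamil-code-points "ழ்")   ; ("U+0BB4" "U+0BCD")

;; `sort' may modify its list argument, hence the fresh `list'.
(sort (list "ழ்" "ள்" "ற்" "ன்") #'string-lessp)
;; Code-point order gives ("ன்" "ற்" "ள்" "ழ்"), the reverse of the
;; desired order ("ழ்" "ள்" "ற்" "ன்").
```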
Can I use the min-width property in buffer text? I'm not sure it is
complete, since I remember some discussion saying that it wasn't
quite finished yet. I would like to try using it for the syllable table
and friends.
^ permalink raw reply related [flat|nested] 42+ messages in thread
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-01 12:59 ` bug#56323: 29.0.50; [v2] " Visuwesh
@ 2022-07-01 13:01 ` Visuwesh
2022-07-01 13:22 ` Eli Zaretskii
1 sibling, 0 replies; 42+ messages in thread
From: Visuwesh @ 2022-07-01 13:01 UTC (permalink / raw)
To: 56323
[-- Attachment #1: Type: text/plain, Size: 836 bytes --]
[வெள்ளி ஜூலை 01, 2022] Visuwesh wrote:
> [வியாழன் ஜூன் 30, 2022] Visuwesh wrote:
>
>> The second patch actually adds the new phonetic input method. I will
>> leave the rationale for making it a _customisable_ input method in
>> footnote [1]. To reuse the existing code that calculates the various
>> tables for the tamil-itrans IM, I turned the code in defvars to defuns.
>> However, the definition of the almighty
>> quail-tamil-itrans-syllable-table is still huge since I needed to do a
>> whole lot to convert the indian-tml-base-table to a format that will
>> be accepted by the new defun `quail-tamil-itrans-compute-syllable-table'.
>> [blah blah blah...]
>
> Here's a second revision of the second patch.
>
Here's a corrected patch with a really silly oversight fixed:
[-- Attachment #2: 0001-Add-new-customizable-phonetic-Tamil-input-method.patch --]
[-- Type: text/x-diff, Size: 17687 bytes --]
From 8774f2f3dc1c5850e9ec6f6b0c178fd455221f17 Mon Sep 17 00:00:00 2001
From: Visuwesh <visuweshm@gmail.com>
Date: Thu, 30 Jun 2022 17:01:07 +0530
Subject: [PATCH] Add new customizable phonetic Tamil input method
* lisp/leim/quail/indian.el
(quail-tamil-itrans-compute-syllable-table): New function extracted
from...
(quail-tamil-itrans-syllable-table): ... here. Use above function.
(quail-tamil-itrans-compute-signs-table): Add new argument VARIOUS.
(quail-tamil-itrans-various-signs-and-digits-table)
(quail-tamil-itrans-various-signs-table): Adjust function call, and
add TAMIL OM sign translation.
(tamil): New phonetic Tamil input method.
(tamil-vowel-translation, tamil-consonant-translation)
(tamil-misc-translation, tamil-native-digits): New defcustoms to
change the translation rules of the input method.
(tamil-uyir-translation, tamil-mei-translation): Aliases to the new
defcustoms for better discoverability.
(tamil--syllable-table, tamil--signs-table, tamil--hashtables)
(tamil--vowel-signs): Internal variables used by the Tamil input
method.
(tamil--make-tables): Function to produce vowels, consonants, and
their translations.
(tamil--update-quail-rules): Function to update the translation rules
for the Tamil input method.
* lisp/language/indian.el ("Tamil"): Change the default input method
of the Tamil language environment to the phonetic input method.
---
lisp/language/indian.el | 2 +-
lisp/leim/quail/indian.el | 302 +++++++++++++++++++++++++++++---------
2 files changed, 236 insertions(+), 68 deletions(-)
diff --git a/lisp/language/indian.el b/lisp/language/indian.el
index 2887d410ad..91ad818533 100644
--- a/lisp/language/indian.el
+++ b/lisp/language/indian.el
@@ -109,7 +109,7 @@ 'devanagari
"Tamil" '((charset unicode)
(coding-system utf-8)
(coding-priority utf-8)
- (input-method . "tamil-itrans")
+ (input-method . "tamil")
(sample-text . "Tamil (தமிழ்) வணக்கம்")
(documentation . "\
South Indian Language Tamil is supported in this language environment."))
diff --git a/lisp/leim/quail/indian.el b/lisp/leim/quail/indian.el
index 8fffcc3511..ebc04c518b 100644
--- a/lisp/leim/quail/indian.el
+++ b/lisp/leim/quail/indian.el
@@ -127,47 +127,19 @@ "\\''"
indian-mlm-itrans-v5-hash "malayalam-itrans" "Malayalam" "MlmIT"
"Malayalam transliteration by ITRANS method.")
-(defvar quail-tamil-itrans-syllable-table
- (let ((vowels
- '(("அ" nil "a")
- ("ஆ" "ா" "A")
- ("இ" "ி" "i")
- ("ஈ" "ீ" "I")
- ("உ" "ு" "u")
- ("ஊ" "ூ" "U")
- ("எ" "ெ" "e")
- ("ஏ" "ே" "E")
- ("ஐ" "ை" "ai")
- ("ஒ" "ொ" "o")
- ("ஓ" "ோ" "O")
- ("ஔ" "ௌ" "au")))
- (consonants
- '(("க" "k") ; U+0B95
- ("ங" "N^") ; U+0B99
- ("ச" "ch") ; U+0B9A
- ("ஞ" "JN") ; U+0B9E
- ("ட" "T") ; U+0B9F
- ("ண" "N") ; U+0BA3
- ("த" "t") ; U+0BA4
- ("ந" "n") ; U+0BA8
- ("ப" "p") ; U+0BAA
- ("ம" "m") ; U+0BAE
- ("ய" "y") ; U+0BAF
- ("ர" "r") ; U+0BB0
- ("ல" "l") ; U+0BB2
- ("வ" "v") ; U+0BB5
- ("ழ" "z") ; U+0BB4
- ("ள" "L") ; U+0BB3
- ("ற" "rh") ; U+0BB1
- ("ன" "nh") ; U+0BA9
- ("ஜ" "j") ; U+0B9C
- ("ஶ" nil) ; U+0BB6
- ("ஷ" "Sh") ; U+0BB7
- ("ஸ" "s") ; U+0BB8
- ("ஹ" "h") ; U+0BB9
- ("க்ஷ" "x" ) ; U+0B95
- ))
- (virama #x0BCD)
+;; FIXME: This only accepts a single translation for vowels. Ideally,
+;; we want it to support multiple translations just like consonants.
+;; This also does not sort by vowel or consonant. :(
+(defun quail-tamil-itrans-compute-syllable-table (vowels consonants)
+ "Return the syllable table for the input method as a string.
+VOWELS is a list of (VOWEL SIGN TRANS) where VOWEL is a string or
+character representing the Tamil vowel character, SIGN is the
+vowel sign corresponding to VOWEL or nil for none, and TRANS is
+the input sequence to insert VOWEL.
+CONSONANTS is a list of (CONSONANT TRANS...) where CONSONANT is
+the Tamil consonant character, and TRANS is one or more strings
+that describe how to insert CONSONANT."
+ (let ((virama #x0BCD)
clm)
(with-temp-buffer
(insert "\n")
@@ -197,21 +169,42 @@ quail-tamil-itrans-syllable-table
(insert (propertize "\t" 'display (list 'space :align-to clm))
(car c) (or (nth 1 v) ""))
(setq clm (+ clm 6)))
- (insert "\n" (or (nth 1 c) "")
- (propertize "\t" 'display '(space :align-to 4))
- "|")
- (setq clm 6)
-
- (dolist (v vowels)
- (apply #'insert (propertize "\t" 'display (list 'space :align-to clm))
- (if (nth 1 c) (list (nth 1 c) (nth 2 v)) (list "")))
- (setq clm (+ clm 6))))
+ (dolist (ct (cdr c))
+ (insert "\n" (or ct "")
+ (propertize "\t" 'display '(space :align-to 4))
+ "|")
+ (setq clm 6)
+ (dolist (v vowels)
+ (apply #'insert (propertize "\t" 'display (list 'space :align-to clm))
+ (if ct (list ct (nth 2 v)) (list "")))
+ (setq clm (+ clm 6)))))
(insert "\n")
(insert "----+")
(insert-char ?- 74)
(insert "\n")
(buffer-string))))
+(defvar quail-tamil-itrans-syllable-table
+ (quail-tamil-itrans-compute-syllable-table
+ (let ((vowels (car indian-tml-base-table))
+ trans v ret)
+ (dotimes (i (length vowels))
+ (when (setq v (nth i vowels))
+ (setq trans (nth i (car indian-itrans-v5-table-for-tamil)))
+ (push (append v (list (if (listp trans) (car trans) trans)))
+ ret)))
+ (setq ret (nreverse ret))
+ ret)
+ (let ((consonants (cadr indian-tml-base-table))
+ trans c ret)
+ (dotimes (i (length consonants))
+ (when (setq c (nth i consonants))
+ (setq trans (nth i (cadr indian-itrans-v5-table-for-tamil)))
+ (push (cons c (if (listp trans) trans (list trans)))
+ ret)))
+ (setq ret (nreverse ret))
+ ret)))
+
(defvar quail-tamil-itrans-numerics-and-symbols-table
(let ((numerics '((?௰ "பத்து") (?௱ "நூறு") (?௲ "ஆயிரம்")))
(symbols '((?௳ "நாள்") (?௴ "மாதம்") (?௵ "வருடம்")
@@ -244,25 +237,28 @@ quail-tamil-itrans-numerics-and-symbols-table
(insert "\n")
(buffer-string))))
-(defun quail-tamil-itrans-compute-signs-table (digitp)
+(defun quail-tamil-itrans-compute-signs-table (digitp various)
"Compute the signs table for the tamil-itrans input method.
-If DIGITP is non-nil, include the digits translation as well."
- (let ((various '((?ஃ . "H") ("ஸ்ரீ" . "srii") (?ௐ)))
- (digits "௦௧௨௩௪௫௬௭௮௯")
+If DIGITP is non-nil, include the digits translation as well.
+If VARIOUS is non-nil, then it should be a list of (CHAR TRANS)
+where CHAR is the character/string to translate and TRANS is
+CHAR's translation."
+ (let ((digits "௦௧௨௩௪௫௬௭௮௯")
(width 6) clm)
(with-temp-buffer
- (insert "\n" (make-string 18 ?-) "+")
- (when digitp (insert (make-string 60 ?-)))
+ (insert "\n" (make-string 18 ?-))
+ (when digitp
+ (insert "+" (make-string 60 ?-)))
(insert "\n")
(insert
(propertize "\t" 'display '(space :align-to 5)) "various"
- (propertize "\t" 'display '(space :align-to 18)) "|")
+ (propertize "\t" 'display '(space :align-to 18)))
(when digitp
(insert
- (propertize "\t" 'display '(space :align-to 45)) "digits"))
- (insert "\n" (make-string 18 ?-) "+")
+ "|" (propertize "\t" 'display '(space :align-to 45)) "digits"))
+ (insert "\n" (make-string 18 ?-))
(when digitp
- (insert (make-string 60 ?-)))
+ (insert "+" (make-string 60 ?-)))
(insert "\n")
(setq clm 0)
@@ -270,7 +266,8 @@ quail-tamil-itrans-compute-signs-table
(insert (propertize "\t" 'display (list 'space :align-to clm))
(car (nth i various)))
(setq clm (+ clm width)))
- (insert (propertize "\t" 'display '(space :align-to 18)) "|")
+ (when digitp
+ (insert (propertize "\t" 'display '(space :align-to 18)) "|"))
(setq clm 20)
(when digitp
(dotimes (i 10)
@@ -281,25 +278,28 @@ quail-tamil-itrans-compute-signs-table
(setq clm 0)
(dotimes (i (length various))
(insert (propertize "\t" 'display (list 'space :align-to clm))
- (or (cdr (nth i various)) ""))
+ (or (cadr (nth i various)) ""))
(setq clm (+ clm width)))
- (insert (propertize "\t" 'display '(space :align-to 18)) "|")
+ (when digitp
+ (insert (propertize "\t" 'display '(space :align-to 18)) "|"))
(setq clm 20)
(when digitp
(dotimes (i 10)
(insert (propertize "\t" 'display (list 'space :align-to clm))
(format "%d" i))
(setq clm (+ clm width))))
- (insert "\n" (make-string 18 ?-) "+")
+ (insert "\n" (make-string 18 ?-))
(when digitp
- (insert (make-string 60 ?-) "\n"))
+ (insert "+" (make-string 60 ?-) "\n"))
(buffer-string))))
(defvar quail-tamil-itrans-various-signs-and-digits-table
- (quail-tamil-itrans-compute-signs-table t))
+ (quail-tamil-itrans-compute-signs-table
+ t '((?ஃ "H") ("ஸ்ரீ" "srii") (?ௐ "OM"))))
(defvar quail-tamil-itrans-various-signs-table
- (quail-tamil-itrans-compute-signs-table nil))
+ (quail-tamil-itrans-compute-signs-table
+ nil '((?ஃ "H") ("ஸ்ரீ" "srii") (?ௐ "OM"))))
(if nil
(quail-define-package "tamil-itrans" "Tamil" "TmlIT" t "Tamil ITRANS"))
@@ -347,6 +347,174 @@ quail-tamil-itrans-various-signs-table
Full key sequences are listed below:")
+;;;
+;;; Tamil phonetic input method
+;;;
+
+(defvaralias 'tamil-uyir-translation 'tamil-vowel-translation)
+(defcustom tamil-vowel-translation
+ '(("அ" "a") ("ஆ" "aa") ("இ" "i") ("ஈ" "ii")
+ ("உ" "u") ("ஊ" "uu") ("எ" "e") ("ஏ" "ee")
+ ("ஐ" "ai") ("ஒ" "o") ("ஓ" "oo") ("ஔ" "au" "ow"))
+ "List of input sequences to translate to Tamil vowels.
+Each element should be (VOWEL . TRANSLATIONS) where VOWEL is the
+Tamil vowel character (உயிரெழுத்து) and TRANSLATIONS is the
+list of input sequences to translate to that vowel."
+ :group 'leim
+ :version "29.1"
+ :type '(alist :key-type string :value-type (repeat string))
+ :options (delq nil
+ (mapcar (lambda (x) (and (consp x) (string (car x))))
+ (car indian-tml-base-table))))
+
+(defvaralias 'tamil-mei-translation 'tamil-consonant-translation)
+(defcustom tamil-consonant-translation
+ '(("க்" "k" "g") ("ங்" "ng") ("ச்" "ch" "s") ("ஞ்" "nj") ("ட்" "t" "d")
+ ("ண்" "N") ("த்" "th" "dh") ("ந்" "nh") ("ப்" "p" "b") ("ம்" "m")
+ ("ய்" "y") ("ர்" "r") ("ல்" "l") ("வ்" "v") ("ழ்" "z" "zh")
+ ("ள்" "L") ("ற்" "rh") ("ன்" "n")
+ ;; Sanskrit.
+ ("ஜ்" "j") ("ஸ்" "S") ("க்ஷ்" "ksH") ("ஷ்" "sh") ("ஹ்" "h")
+ ("க்ஷ்" "ksh") ("ஶ்" "Z"))
+ "List of input sequences to translate to Tamil consonants.
+Each element should be (CONSONANT . TRANSLATIONS) where CONSONANT is the
+Tamil consonant character (மெய் எழுத்து) and TRANSLATIONS is a list
+of input sequences to translate to that consonant."
+ :group 'leim
+ :version "29.1"
+ :type '(alist :key-type string :value-type (repeat string))
+ :options (delq nil
+ (mapcar (lambda (x) (if (stringp x)
+ (concat x "்")
+ ;; #x0BCD = pulli/virama.
+ (and x (string x #x0BCD))))
+ (cadr indian-tml-base-table))))
+
+(defcustom tamil-misc-translation
+ ;; ஃ is not a vowel or a consonant.
+ '(("ஃ" "F" "q")
+ ("ௐ" "OM"))
+ "List of input sequences to translate to various Tamil characters.
+Each element should be (CHARACTER . TRANSLATIONS) where
+CHARACTER may be any string and TRANSLATIONS is a list of input
+sequences to translate to that CHARACTER."
+ :group 'leim
+ :version "29.1"
+ :type '(alist :key-type string :value-type (repeat string)))
+
+(defcustom tamil-native-digits nil
+ "When non-nil, use Tamil native digits instead of Arabic ones."
+ :group 'leim
+ :version "29.1"
+ :type 'boolean)
+
+(defvar tamil--syllable-table nil)
+(defvar tamil--signs-table nil)
+(defvar tamil--hashtables
+ (cons (make-hash-table :test #'equal)
+ (make-hash-table :test #'equal)))
+(defvar tamil--vowel-signs
+ '(("அ" . nil) ("ஆ" . ?ா) ("இ" . ?ி) ("ஈ" . ?ீ)
+ ("உ" . ?ு) ("ஊ" . ?ூ) ("எ" . ?ெ) ("ஏ" . ?ே)
+ ("ஐ" . ?ை) ("ஒ" . ?ொ) ("ஓ" . ?ோ) ("ஔ" . ?ௌ)))
+
+(defun tamil--make-tables ()
+ "Return vowels, consonants, and their translation rules.
+The returned table is a list (VOWELS CONSONANTS VTRANS CTRANS)
+where VOWELS is a list of (VOWEL VOWEL-SIGN) where VOWEL is the
+Tamil vowel character, and VOWEL-SIGN is its sign, CONSONANTS is
+a list of consonants that have translation rules.
+VTRANS is a list of translation rules for vowels in VOWELS in the
+order they appear, CTRANS is a list of translation rules for
+consonants in CONSONANTS."
+ (let (c-table v-table v-trans c-trans)
+ (dolist (v tamil-vowel-translation)
+ (push (list (car v) (assoc-default (car v) tamil--vowel-signs)) v-table)
+ (push (cdr v) v-trans))
+ (dolist (c tamil-consonant-translation)
+ (push (cdr c) c-trans)
+ ;; Remove pulli/virama from consonant entry.
+ (push (substring (car c) 0 -1) c-table))
+ ;; FIXME: Remove when sorting is done properly!
+ (setq c-table (nreverse c-table)
+ v-table (nreverse v-table)
+ c-trans (nreverse c-trans)
+ v-trans (nreverse v-trans))
+ (list v-table c-table
+ v-trans c-trans)))
+
+(defun tamil--update-quail-rules ()
+ ;; This function does pretty much what `indian-make-hash' does
+ ;; except that we don't try to copy the structure of
+ ;; `indian-tml-base-table' which leads to less code hassle.
+ (let* ((quail-current-package (assoc-default "tamil" quail-package-alist))
+ (tables (tamil--make-tables))
+ (vowels (nth 0 tables))
+ (vowels-trans (nth 2 tables))
+ (consonants (nth 1 tables))
+ (consonants-trans (nth 3 tables))
+ (pulli (string #x0BCD)))
+ (clrhash (car tamil--hashtables))
+ (clrhash (cdr tamil--hashtables))
+ (indian--puthash-v vowels vowels-trans tamil--hashtables)
+ (indian--puthash-c consonants consonants-trans pulli tamil--hashtables)
+ (indian--puthash-cv consonants consonants-trans
+ vowels vowels-trans tamil--hashtables)
+ (indian--puthash-m (mapcar #'car tamil-misc-translation)
+ (mapcar #'cdr tamil-misc-translation)
+ tamil--hashtables)
+ (when tamil-native-digits
+ (indian--puthash-m (nth 3 indian-tml-base-digits-table)
+ '("0" "1" "2" "3" "4" "5" "6" "7" "8" "9")
+ tamil--hashtables))
+ ;; Now override the current translation rules.
+ ;; Empty quail map is '(list nil)'.
+ (setf (nth 2 quail-current-package) '(()))
+ (maphash (lambda (k v)
+ (quail-defrule k (if (length= v 1)
+ (string-to-char v)
+ (vector v))))
+ (cdr tamil--hashtables))
+ (setq tamil--syllable-table
+ (quail-tamil-itrans-compute-syllable-table
+ (mapcar (lambda (v) (append v (pop vowels-trans))) vowels)
+ (mapcar (lambda (c) (cons c (pop consonants-trans))) consonants))
+ tamil--signs-table
+ ;; FIXME: This should also show how to input ஸ்ரீ (default: Srii).
+ (quail-tamil-itrans-compute-signs-table
+ tamil-native-digits tamil-misc-translation))))
+
+(quail-define-package
+ "tamil" "Tamil" "ழ" t
+ "Customisable Tamil phonetic input method.
+To change the translation of vowels (உயிரெழுத்துக்கள்), customize `tamil-vowel-translation'.
+To change the translation of consonants (மெய் எழுத்துக்கள்), customize
+ `tamil-consonant-translation'.
+To input miscellaneous characters (including ஃ), customize
+ `tamil-misc-translation'.
+To use native Tamil digits, customize `tamil-native-digits'.
+
+To end the current translation process, say \\<quail-translation-keymap>\\[quail-select-current] (defined in
+`quail-translation-keymap'). This is useful when there's a
+ conflict between two possible translations.
+
+The current input scheme is:
+
+### Basic syllables (உயிர்மெய் எழுத்துக்கள்) ###
+\\=\\<tamil--syllable-table>
+
+### Miscellaneous ####
+\\=\\<tamil--signs-table>
+
+The following characters have NO input sequence associated with
+them by default. Their descriptions are included for easy
+reference.
+\\=\\<quail-tamil-itrans-numerics-and-symbols-table>
+
+Full key sequences are listed below:"
+ nil nil nil nil nil nil t)
+(tamil--update-quail-rules)
+
;;;
;;; Input by Inscript
;;;
--
2.35.1
[-- Attachment #3: Type: text/plain, Size: 22 bytes --]
Sorry for the noise.
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-01 12:59 ` bug#56323: 29.0.50; [v2] " Visuwesh
2022-07-01 13:01 ` Visuwesh
@ 2022-07-01 13:22 ` Eli Zaretskii
2022-07-01 13:47 ` Visuwesh
1 sibling, 1 reply; 42+ messages in thread
From: Eli Zaretskii @ 2022-07-01 13:22 UTC (permalink / raw)
To: Visuwesh; +Cc: 56323
> From: Visuwesh <visuweshm@gmail.com>
> Date: Fri, 01 Jul 2022 18:29:00 +0530
>
> Also, can someone guide me to write a sort function for
> quail-tamil-itrans-compute-syllable-table please? The ideal order of
> consonants should be the same as the one in the default value of
> tamil-consonant-translation, same for tamil-vowel-translation. I tried
> the following
>
> (sort (reverse (mapcar #'car tamil-consonant-translation))
> (lambda (x y) (let ((lx (length x))
> (ly (length y)))
> (if (= lx ly) (string-lessp x y) (< lx ly)))))
>
>
> but that definitely doesn't do what I want. The idea was to sort the
> list so that the basic consonants (க் ங் ச் etc.) come first, then the composite
> ones (க்ஷ் க்ஷ் etc.) but `string-lessp' does not even sort the basic
> consonants in the right order (the right order being the order in the
> default value of `tamil-consonant-translation').
Then you'll need to write your own comparison function and use it
instead of string-lessp.
> Can I use the min-width property in buffer text?
Why do you need that? Please tell more about what you want to
accomplish.
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-01 13:22 ` Eli Zaretskii
@ 2022-07-01 13:47 ` Visuwesh
2022-07-01 14:06 ` Eli Zaretskii
0 siblings, 1 reply; 42+ messages in thread
From: Visuwesh @ 2022-07-01 13:47 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 56323
[-- Attachment #1: Type: text/plain, Size: 3006 bytes --]
[வெள்ளி ஜூலை 01, 2022] Eli Zaretskii wrote:
>> From: Visuwesh <visuweshm@gmail.com>
>> Date: Fri, 01 Jul 2022 18:29:00 +0530
>>
>> Also, can someone guide me to write a sort function for
>> quail-tamil-itrans-compute-syllable-table please? The ideal order of
>> consonants should be the same as the one in the default value of
>> tamil-consonant-translation, same for tamil-vowel-translation. I tried
>> the following
>>
>> (sort (reverse (mapcar #'car tamil-consonant-translation))
>> (lambda (x y) (let ((lx (length x))
>> (ly (length y)))
>> (if (= lx ly) (string-lessp x y) (< lx ly)))))
>>
>>
>> but that definitely doesn't do what I want. The idea was to sort the
>> list so that the basic consonants (க் ங் ச் etc.) come first, then the composite
>> ones (க்ஷ் க்ஷ் etc.) but `string-lessp' does not even sort the basic
>> consonants in the right order (the right order being the order in the
>> default value of `tamil-consonant-translation').
>
> Then you'll need to write your own comparison function and use it
> instead of string-lessp.
>
I suppose so. How does the following look?
(sort
'("க்" "ங்" "ச்" "ஞ்" "ட்" "ண்" "ற்ற்" "ந்" "ப்" "ய்"
"ம்" "த்" "ர்" "ல்" "வ்" "ள்" "ற்" "ழ்" "ன்"
"ஸ்" "ஜ்" "க்ஷ்" "ஷ்" "ஹ்" "க்ஷ்" "ஶ்")
(lambda (x y)
(let* ((cp '(("க்" . 0) ("ங்" . 1) ("ச்" . 2) ("ஞ்" . 3) ("ட்" . 4) ("ண்" . 5)
("த்" . 6) ("ந்" . 7) ("ப்" . 8) ("ம்" . 9) ("ய்" . 10) ("ர்" . 11)
("ல்" . 12) ("வ்" . 13) ("ழ்" . 14) ("ள்" . 15) ("ற்" . 16) ("ன்" . 17)
("ஜ்" . 18) ("ஸ்" . 19) ("ஷ்" . 20) ("ஹ்" . 21) ("க்ஷ்" . 22)
("க்ஷ்" . 23) ("ஶ்" . 24)))
(xp (or (assoc-default x cp nil) 10000))
(yp (or (assoc-default y cp nil) 10000)))
(< xp yp))))
[ I won't have the unnecessary let in the final version. ]
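A minimal sketch of how the same idea could look with the reference order hoisted out of the comparison function. All `my-tamil-' names are hypothetical and not part of the patch; the order list here covers only the eighteen basic consonants, and anything not in it sorts last.

```elisp
(require 'seq)

(defvar my-tamil-consonant-order
  ;; Append the pulli (U+0BCD) to each base consonant, in the
  ;; traditional order used by `tamil-consonant-translation'.
  (mapcar (lambda (ch) (string ch #x0BCD)) "கஙசஞடணதநபமயரலவழளறன")
  "Reference order of the basic Tamil consonants (hypothetical).")

(defun my-tamil-consonant-rank (consonant)
  "Return the position of CONSONANT in the reference order.
Consonants that are not listed get the largest possible rank."
  (or (seq-position my-tamil-consonant-order consonant)
      most-positive-fixnum))

(defun my-tamil-consonant-lessp (x y)
  "Return non-nil if X comes before Y in the reference order."
  (< (my-tamil-consonant-rank x) (my-tamil-consonant-rank y)))

;; "க்" has rank 0, "ழ்" rank 14, "ன்" rank 17; "ற்ற்" is unknown.
(sort (list "ன்" "ழ்" "க்" "ற்ற்") #'my-tamil-consonant-lessp)
;; gives ("க்" "ழ்" "ன்" "ற்ற்")
```

The list lookup is linear, which should not matter for a few dozen consonants; a hash table of ranks would be the obvious replacement if it did.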
>> Can I use the min-width property in buffer text?
>
> Why do you need that? Please tell more about what you want to
> accomplish.
Currently we don't try too hard to ensure that text doesn't bump into
other text in the tables we calculate. If you are unlucky, then the table
will be incomprehensible, so I thought about putting a reasonable
min-width value on the text in the signs table at least. Of course, finding
a reasonable value is a headache in and of itself; the better solution would
probably be pulling in the vtable library, but I'm not too sure about
that.
I also attached a screenshot comparing my running Emacs session and
emacs -Q (yellow window is my current Emacs session) to get the point
across better.
[-- Attachment #2: screenshot_202207011914.png --]
[-- Type: image/png, Size: 61371 bytes --]
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-01 13:47 ` Visuwesh
@ 2022-07-01 14:06 ` Eli Zaretskii
2022-07-01 14:30 ` Visuwesh
0 siblings, 1 reply; 42+ messages in thread
From: Eli Zaretskii @ 2022-07-01 14:06 UTC (permalink / raw)
To: Visuwesh; +Cc: 56323
> From: Visuwesh <visuweshm@gmail.com>
> Cc: 56323@debbugs.gnu.org
> Date: Fri, 01 Jul 2022 19:17:18 +0530
>
> > Then you'll need to write your own comparison function and use it
> > instead string-lessp.
> >
>
> I suppose so. How does the following look?
>
> (sort
> '("க்" "ங்" "ச்" "ஞ்" "ட்" "ண்" "ற்ற்" "ந்" "ப்" "ய்"
> "ம்" "த்" "ர்" "ல்" "வ்" "ள்" "ற்" "ழ்" "ன்"
> "ஸ்" "ஜ்" "க்ஷ்" "ஷ்" "ஹ்" "க்ஷ்" "ஶ்")
> (lambda (x y)
> (let* ((cp '(("க்" . 0) ("ங்" . 1) ("ச்" . 2) ("ஞ்" . 3) ("ட்" . 4) ("ண்" . 5)
> ("த்" . 6) ("ந்" . 7) ("ப்" . 8) ("ம்" . 9) ("ய்" . 10) ("ர்" . 11)
> ("ல்" . 12) ("வ்" . 13) ("ழ்" . 14) ("ள்" . 15) ("ற்" . 16) ("ன்" . 17)
> ("ஜ்" . 18) ("ஸ்" . 19) ("ஷ்" . 20) ("ஹ்" . 21) ("க்ஷ்" . 22)
> ("க்ஷ்" . 23) ("ஶ்" . 24)))
> (xp (or (assoc-default x cp nil) 10000))
> (yp (or (assoc-default y cp nil) 10000)))
> (< xp yp))))
I don't think I understand what you want to achieve, and don't read
Tamil in the first place, to tell you whether this is correct or not,
sorry.
> >> Can I use the min-width property in buffer text?
> >
> > Why do you need that? Please tell more about what you want to
> > accomplish.
>
> Currently we don't try too hard to ensure that text don't bump into each
> other in the tables we calculate. If you are unlucky, then the table
> will be incomprehensible so I thought about putting a reasonable
> min-width value on the text in signs table at least. Of course, finding
> a reasonable value is a headache in of itself; the better solution would
> be probably pulling in the vtable library but I'm not too sure about
> that.
I think it would be better to be more accurate in alignment of table
cells. We do have string-width and string-pixel-width, let alone
window-text-pixel-size.
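As a rough illustration of the column-based variant, the following sketch pads each cell according to `string-width' instead of `length'. The function names are hypothetical, and a pixel-accurate version would presumably measure with string-pixel-width instead:

```elisp
;; Hypothetical helpers for illustration only.
(defun my-pad-cell (str width)
  "Pad STR with spaces on the right to WIDTH display columns."
  (concat str
          (make-string (max 0 (- width (string-width str))) ?\s)))

(defun my-format-row (cells width)
  "Concatenate CELLS, each padded to WIDTH display columns."
  (mapconcat (lambda (cell) (my-pad-cell cell width)) cells ""))

(my-format-row '("a" "bb") 3)
```

This only helps as far as `string-width' matches what the font actually renders, which is why the pixel-based functions are the safer choice for complex scripts.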
> I also attached a screenshot comparing my running Emacs session and
> emacs -Q (yellow window is my current Emacs session) to get the point
> across better.
Looks like simple misalignment to me, which should be cured by using
pixel-resolution alignment features.
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-01 14:06 ` Eli Zaretskii
@ 2022-07-01 14:30 ` Visuwesh
2022-07-01 16:09 ` Eli Zaretskii
0 siblings, 1 reply; 42+ messages in thread
From: Visuwesh @ 2022-07-01 14:30 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 56323
[வெள்ளி ஜூலை 01, 2022] Eli Zaretskii wrote:
>> From: Visuwesh <visuweshm@gmail.com>
>> Cc: 56323@debbugs.gnu.org
>> Date: Fri, 01 Jul 2022 19:17:18 +0530
>>
>> > Then you'll need to write your own comparison function and use it
>> > instead string-lessp.
>> >
>>
>> I suppose so. How does the following look?
>>
>> (sort
>> '("க்" "ங்" "ச்" "ஞ்" "ட்" "ண்" "ற்ற்" "ந்" "ப்" "ய்"
>> "ம்" "த்" "ர்" "ல்" "வ்" "ள்" "ற்" "ழ்" "ன்"
>> "ஸ்" "ஜ்" "க்ஷ்" "ஷ்" "ஹ்" "க்ஷ்" "ஶ்")
>> (lambda (x y)
>> (let* ((cp '(("க்" . 0) ("ங்" . 1) ("ச்" . 2) ("ஞ்" . 3) ("ட்" . 4) ("ண்" . 5)
>> ("த்" . 6) ("ந்" . 7) ("ப்" . 8) ("ம்" . 9) ("ய்" . 10) ("ர்" . 11)
>> ("ல்" . 12) ("வ்" . 13) ("ழ்" . 14) ("ள்" . 15) ("ற்" . 16) ("ன்" . 17)
>> ("ஜ்" . 18) ("ஸ்" . 19) ("ஷ்" . 20) ("ஹ்" . 21) ("க்ஷ்" . 22)
>> ("க்ஷ்" . 23) ("ஶ்" . 24)))
>> (xp (or (assoc-default x cp nil) 10000))
>> (yp (or (assoc-default y cp nil) 10000)))
>> (< xp yp))))
>
> I don't think I understand what you want to achieve, and don't read
> Tamil in the first place, to tell you whether this is correct or not,
> sorry.
>
I mostly meant to ask if the weighted approach was good but I wasn't
clear enough, sorry. Let me try to explain it better:
Let's suppose, for the sake of discussion, that string-lessp does not
work for English. The task is to sort a jumbled list of English
letters into alphabetical order. What I'm currently doing is creating
an alist where the key is the letter and the value is the letter's
position in the alphabet (so a will be 0, b will be 1, etc.). Then in
the sort function, I look up this position. If the string is not in
the alist, then I fall back to a large number.
So the code above would look like this if it were in English,
(sort '("b" "z" "c" "n" "a" "aa" "p")
      (lambda (x y)
        (let ((cp '(("a" . 0) ("b" . 1) ("c" . 2) ("d" . 3) ("e" . 4)
                    ("f" . 5) ("g" . 6) ("h" . 7) ("i" . 8) ("j" . 9)
                    ("k" . 10) ("l" . 11) ("m" . 12) ("n" . 13) ("o" . 14)
                    ("p" . 15) ("q" . 16) ("r" . 17) ("s" . 18) ("t" . 19)
                    ("u" . 20) ("v" . 21) ("w" . 22) ("x" . 23) ("y" . 24)
                    ("z" . 25))))
          (< (or (assoc-default x cp) 10000)
             (or (assoc-default y cp) 10000)))))
and the sorted list comes out as ("a" "b" "c" "n" "p" "z" "aa")
which is exactly what I desire. I hope this is clear enough.
Obviously, I don't have much programming experience, so I'm unsure if
there's a better way to sort.
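One variation on the same idea, sketched here with made-up names, derives the rank from an element's position in a reference list instead of numbering the alist by hand. `seq-position' returns nil for unknown strings, so the fallback to a large number still applies:

```elisp
(require 'seq)

;; Hypothetical names; the reference list stands in for whatever
;; defines the desired order.
(defvar my-reference-order
  (mapcar #'char-to-string (number-sequence ?a ?z))
  "Strings in the order they should sort in.")

(defun my-rank (item)
  "Return ITEM's index in `my-reference-order', or a large number."
  (or (seq-position my-reference-order item) most-positive-fixnum))

(defun my-reference-lessp (x y)
  "Return non-nil if X ranks before Y in `my-reference-order'."
  (< (my-rank x) (my-rank y)))

(sort (list "b" "z" "c" "n" "a" "aa" "p") #'my-reference-lessp)
```

The behaviour should match the alist version; the only difference is that the order is written down once, as a plain list.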
>> >> Can I use the min-width property in buffer text?
>> >
>> > Why do you need that? Please tell more about what you want to
>> > accomplish.
>>
>> Currently we don't try too hard to ensure that text don't bump into each
>> other in the tables we calculate. If you are unlucky, then the table
>> will be incomprehensible so I thought about putting a reasonable
>> min-width value on the text in signs table at least. Of course, finding
>> a reasonable value is a headache in of itself; the better solution would
>> be probably pulling in the vtable library but I'm not too sure about
>> that.
>
> I think it would be better to be more accurate in alignment of table
> cells. We do have string-width and string-pixel-width, let alone
> window-text-pixel-size.
>
>> I also attached a screenshot comparing my running Emacs session and
>> emacs -Q (yellow window is my current Emacs session) to get the point
>> across better.
>
> Looks like simple misalignment to me, which should be cured by using
> pixel-resolution alignment features.
Yep, it is misalignment. I could try to use those pixel-resolution
alignment features but I really don't think I can do a good enough job.
It is something I tried in the past but gave up since it was too complex
for me. The current code produces a Good Enough™ table and I think I
will just leave it unless Someone™ complains since after all, the
current situation is much better than what we have in Emacs 28 (the
docfix that happened as part of bug#50143 isn't in Emacs 28).
Maybe someday, I will be annoyed enough at the misalignment to come back
and fix it. But until that day, I will just leave the code as is.
BTW, do you have any other code/documentation review? And what about
the patch I posted in https://lists.gnu.org/archive/html/bug-gnu-emacs/2022-06/msg02256.html?
No rush but I would like to know if it can go in since it only addresses
fallouts from the previous bug in this area. Thanks.
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-01 14:30 ` Visuwesh
@ 2022-07-01 16:09 ` Eli Zaretskii
2022-07-01 16:37 ` Visuwesh
2022-07-02 12:15 ` Visuwesh
0 siblings, 2 replies; 42+ messages in thread
From: Eli Zaretskii @ 2022-07-01 16:09 UTC (permalink / raw)
To: Visuwesh; +Cc: 56323
> From: Visuwesh <visuweshm@gmail.com>
> Cc: 56323@debbugs.gnu.org
> Date: Fri, 01 Jul 2022 20:00:03 +0530
>
> > I don't think I understand what you want to achieve, and don't read
> > Tamil in the first place, to tell you whether this is correct or not,
> > sorry.
> >
>
> I mostly meant to ask if the weighted approach was good but I wasn't
> clear enough, sorry. Let me try to explain it better:
>
> Let's suppose that string-lessp does not work for English for the
> discussion here. The task is to sort a list of jumbled English
> alphabets in alphabetical order. What I'm currently doing is creating
> an alist where the key is the alphabet and the value is the alphabet's
> order (so a will be 1, b will be 2, etc.). Then in the sort function, I
> look for this order. If the alphabet is not in this list, then I fall
> back to a large number.
>
> So the code above would look like this if it were in English,
>
> (sort '("b" "z" "c" "n" "a" "aa" "p")
> (lambda (x y)
> (let ((cp '(("a" . 0) ("b" . 1) ("c" . 2) ("d" . 3) ("e" . 4)
> ("f" . 5) ("g" . 6) ("h" . 7) ("i" . 8) ("j" . 9)
> ("k" . 10) ("l" . 11) ("m" . 12) ("n" . 13) ("o" . 14)
> ("p" . 15) ("q" . 16) ("r" . 17) ("s" . 18) ("t" . 19)
> ("u" . 20) ("v" . 21) ("w" . 22) ("x" . 23) ("y" . 24)
> ("z" . 25))))
> (< (or (assoc-default x cp) 10000)
> (or (assoc-default y cp) 10000)))))
>
> and the sorted list comes out as ("a" "b" "c" "n" "p" "z" "aa")
> which is exactly what I desire. I hope this is clear enough.
The above just gives each letter its order in the alphabet. But if
that is what you wanted, string-lessp (or even just direct comparison
of characters) would have worked for you. So there's still something
important missing from your description, I think.
> > Looks like simple misalignment to me, which should be cured by using
> > pixel-resolution alignment features.
>
> Yep, it is misalignment. I could try to use those pixel-resolution
> alignment features but I really don't think I can do a good enough job.
> It is something I tried in the past but gave up since it was too complex
> for me. The current code produces a Good Enough™ table and I think I
> will just leave it unless Someone™ complains since after all, the
> current situation is much better than what we have in Emacs 28 (the
> docfix that happened as part of bug#50143 isn't in Emacs 28).
I thought vtable.el was about solving such problems?
> BTW, do you have any other code/documentation review? And what about
> the patch I posted in https://lists.gnu.org/archive/html/bug-gnu-emacs/2022-06/msg02256.html?
> No rush but I would like to know if it can go in since it only addresses
> fallouts from the previous bug in this area. Thanks.
It sounded to me like you are still working on the code, so I didn't
see a need to review it. If you have specific parts that you'd like
me to review nonetheless, please tell which parts are those.
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-01 16:09 ` Eli Zaretskii
@ 2022-07-01 16:37 ` Visuwesh
2022-07-01 18:16 ` Eli Zaretskii
2022-07-02 6:58 ` Eli Zaretskii
2022-07-02 12:15 ` Visuwesh
1 sibling, 2 replies; 42+ messages in thread
From: Visuwesh @ 2022-07-01 16:37 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 56323
[வெள்ளி ஜூலை 01, 2022] Eli Zaretskii wrote:
>> I mostly meant to ask if the weighted approach was good but I wasn't
>> clear enough, sorry. Let me try to explain it better:
>>
>> Let's suppose that string-lessp does not work for English for the
>> discussion here. The task is to sort a list of jumbled English
>> alphabets in alphabetical order. What I'm currently doing is creating
>> an alist where the key is the alphabet and the value is the alphabet's
>> order (so a will be 1, b will be 2, etc.). Then in the sort function, I
>> look for this order. If the alphabet is not in this list, then I fall
>> back to a large number.
>>
>> So the code above would look like this if it were in English,
>>
>> (sort '("b" "z" "c" "n" "a" "aa" "p")
>> (lambda (x y)
>> (let ((cp '(("a" . 0) ("b" . 1) ("c" . 2) ("d" . 3) ("e" . 4)
>> ("f" . 5) ("g" . 6) ("h" . 7) ("i" . 8) ("j" . 9)
>> ("k" . 10) ("l" . 11) ("m" . 12) ("n" . 13) ("o" . 14)
>> ("p" . 15) ("q" . 16) ("r" . 17) ("s" . 18) ("t" . 19)
>> ("u" . 20) ("v" . 21) ("w" . 22) ("x" . 23) ("y" . 24)
>> ("z" . 25))))
>> (< (or (assoc-default x cp) 10000)
>> (or (assoc-default y cp) 10000)))))
>>
>> and the sorted list comes out as ("a" "b" "c" "n" "p" "z" "aa")
>> which is exactly what I desire. I hope this is clear enough.
>
> The above just gives each letter its order in the alphabet. But if
> that is what you wanted, string-lessp (or even just direct comparison
> of characters) would have worked for you. So there's still something
> important missing from your description, I think.
>
Unfortunately, string-lessp does not do the job. (string-lessp "ஞ" "ஜ")
should return t but it returns nil probably because ஞ's codepoint is
2974 and ஜ's codepoint is 2972. But ஜ is not even part of the "core"
Tamil characters and hence should come last. This is why I went with
defining an alist with the _actual_ order of the characters. I hope
this is clear: to demonstrate this using English, it would be something
like...
c's codepoint is 29 and d's codepoint is 27. Clearly, c comes
before d but since string-lessp seems to rely on the Unicode
codepoint, when we do the sorting with string-lessp, we get
"... d c ..." in the list instead of the desired "... c d ...".
I hope this is clear.
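The actual Tamil case can be checked directly. This is only a small sketch; `my-codepoints' is a hypothetical helper, not something from the patch:

```elisp
;; Hypothetical helper: map each string to the codepoint of its
;; first character.
(defun my-codepoints (strings)
  "Return the codepoints of the first characters of STRINGS."
  (mapcar #'string-to-char strings))

(my-codepoints '("ஞ" "ஜ"))   ; => (2974 2972), i.e. U+0B9E and U+0B9C
(string-lessp "ஞ" "ஜ")       ; => nil, since 2974 > 2972
```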
>> Yep, it is misalignment. I could try to use those pixel-resolution
>> alignment features but I really don't think I can do a good enough job.
>> It is something I tried in the past but gave up since it was too complex
>> for me. The current code produces a Good Enough™ table and I think I
>> will just leave it unless Someone™ complains since after all, the
>> current situation is much better than what we have in Emacs 28 (the
>> docfix that happened as part of bug#50143 isn't in Emacs 28).
>
> I thought vtable.el was about solving such problems?
Okay then, I will use that. I was mostly unsure if using vtable would
be alright especially since it puts keymap properties and the entire
vtable object as a text property -- it seemed excessive for a
docstring. Maybe some of this can be addressed?
>> BTW, do you have any other code/documentation review? And what about
>> the patch I posted in https://lists.gnu.org/archive/html/bug-gnu-emacs/2022-06/msg02256.html?
>> No rush but I would like to know if it can go in since it only addresses
>> fallouts from the previous bug in this area. Thanks.
>
> It sounded to me like you are still working on the code, so I didn't
> see a need to review it. If you have specific parts that you'd like
> me to review nonetheless, please tell which parts are those.
Thanks. The patch I posted in
https://lists.gnu.org/archive/html/bug-gnu-emacs/2022-06/msg02256.html
is done, and can be pushed to master if you see no problems. All it
does is address a few fallouts that were accidentally left out when
fixing bug#50143. Specifically, it adds an entry for the TAMIL OM
character, and adds two more Sanskrit consonants to the Tamil itrans
table.
Also, I would like to know if there's a better way to write the :set
function for the defcustoms tamil-vowel-translation,
tamil-consonant-translation, tamil-misc-translation, tamil-native-digits
without the boundp check chain below,
(defun tamil--set-variable (sym val)
(set-default sym val)
(when (and (boundp 'tamil-vowel-translation)
(boundp 'tamil-consonant-translation)
(boundp 'tamil-misc-translation)
(boundp 'tamil-native-digits))
(tamil--update-quail-rules)))
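One possibility I can think of, sketched below with hypothetical my-tamil-demo- names rather than the real ones, is to pass `:initialize #'custom-initialize-default' so that the setter is not called while the defcustoms are being defined, and then to build the rules once after the last defcustom. I have not tried this against the actual patch, so treat it as an assumption about how it would fit:

```elisp
(defgroup my-tamil-demo nil
  "Hypothetical group used only by this sketch."
  :group 'i18n)

(defvar my-tamil-demo--rules nil
  "Stand-in for the Quail rules computed from the options.")

(defun my-tamil-demo--update-rules ()
  "Stand-in for the real rule update; it reads both options."
  (setq my-tamil-demo--rules
        (append my-tamil-demo-vowels my-tamil-demo-consonants)))

(defun my-tamil-demo--set (sym val)
  "Set SYM to VAL and rebuild the rules; no boundp checks needed."
  (set-default sym val)
  (my-tamil-demo--update-rules))

(defcustom my-tamil-demo-vowels '(("a" . "A"))
  "Hypothetical option."
  :type '(alist :key-type string :value-type string)
  ;; Initialize with a plain default binding instead of calling :set.
  :initialize #'custom-initialize-default
  :set #'my-tamil-demo--set
  :group 'my-tamil-demo)

(defcustom my-tamil-demo-consonants '(("k" . "K"))
  "Hypothetical option."
  :type '(alist :key-type string :value-type string)
  :initialize #'custom-initialize-default
  :set #'my-tamil-demo--set
  :group 'my-tamil-demo)

;; Every option is bound at this point, so build the rules once.
(my-tamil-demo--update-rules)
```

With this arrangement the setter only runs when the user changes an option later, by which time all of the variables are bound.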
I'm also doubtful about the current group being used for these
defcustoms. Should I go ahead and make a new 'tamil' group and make it
a subgroup of leim or i18n? And is the prefix tamil- okay or should I
change it to something else?
Finally, I'm unsure if "List of input sequences to translate to ..." is
clear. I think it sounds like a mouthful and there should be a better
way to put it. I think "translation rules" is quite nice but I'm afraid
that it is too Quail-specific and might not be well understood.
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-01 16:37 ` Visuwesh
@ 2022-07-01 18:16 ` Eli Zaretskii
2022-07-02 4:02 ` Visuwesh
2022-07-02 6:58 ` Eli Zaretskii
1 sibling, 1 reply; 42+ messages in thread
From: Eli Zaretskii @ 2022-07-01 18:16 UTC (permalink / raw)
To: Visuwesh; +Cc: 56323
> From: Visuwesh <visuweshm@gmail.com>
> Cc: 56323@debbugs.gnu.org
> Date: Fri, 01 Jul 2022 22:07:38 +0530
>
> >> (sort '("b" "z" "c" "n" "a" "aa" "p")
> >> (lambda (x y)
> >> (let ((cp '(("a" . 0) ("b" . 1) ("c" . 2) ("d" . 3) ("e" . 4)
> >> ("f" . 5) ("g" . 6) ("h" . 7) ("i" . 8) ("j" . 9)
> >> ("k" . 10) ("l" . 11) ("m" . 12) ("n" . 13) ("o" . 14)
> >> ("p" . 15) ("q" . 16) ("r" . 17) ("s" . 18) ("t" . 19)
> >> ("u" . 20) ("v" . 21) ("w" . 22) ("x" . 23) ("y" . 24)
> >> ("z" . 25))))
> >> (< (or (assoc-default x cp) 10000)
> >> (or (assoc-default y cp) 10000)))))
> >>
> >> and the sorted list comes out as ("a" "b" "c" "n" "p" "z" "aa")
> >> which is exactly what I desire. I hope this is clear enough.
> >
> > The above just gives each letter its order in the alphabet. But if
> > that is what you wanted, string-lessp (or even just direct comparison
> > of characters) would have worked for you. So there's still something
> > important missing from your description, I think.
> >
>
> Unfortunately, string-lessp does not do the job. (string-lessp "ஞ" "ஜ")
> should return t but it returns nil probably because ஞ's codepoint is
> 2974 and ஜ's codepoint is 2972. But ஜ is not even part of the "core"
> Tamil characters and hence should come at last. This is why I went with
> defining an alist with the _actual_ order of the characters.
Please tell what is the actual order of the characters. That is,
where is that order defined, and by what criteria?
I'll look into the other issues later.
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-01 18:16 ` Eli Zaretskii
@ 2022-07-02 4:02 ` Visuwesh
2022-07-02 6:35 ` Eli Zaretskii
0 siblings, 1 reply; 42+ messages in thread
From: Visuwesh @ 2022-07-02 4:02 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 56323
[வெள்ளி ஜூலை 01, 2022] Eli Zaretskii wrote:
>> Unfortunately, string-lessp does not do the job. (string-lessp "ஞ" "ஜ")
>> should return t but it returns nil probably because ஞ's codepoint is
>> 2974 and ஜ's codepoint is 2972. But ஜ is not even part of the "core"
>> Tamil characters and hence should come at last. This is why I went with
>> defining an alist with the _actual_ order of the characters.
>
> Please tell what is the actual order of the characters. That is,
> where is that order defined, and by what criteria?
I'm not sure what you mean by "where is that order defined"; I don't
think there is a definition per se, it just happens to be so.
There are two "classes" of consonants: those that are part of Tamil
(let's call them "core") and those borrowed from Sanskrit. When one
writes the consonants in order, the core consonants come first then the
Sanskrit ones. You can find the order of the core consonants in
Wikipedia, in the table titled "Tamil consonants":
https://en.wikipedia.org/wiki/Tamil_script#Letters
We need not worry too much about the order of Sanskrit consonants, we
just need to ensure that they come after the core consonants. You can
find these Sanskrit consonants in the table titled "Grantha consonants
in Tamil" in the same link.
I hope this is clear.
As for the criteria, it is simply "Tamil consonants then the Sanskrit
consonants."
> I'll look into the other issues later.
Thanks.
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-02 4:02 ` Visuwesh
@ 2022-07-02 6:35 ` Eli Zaretskii
2022-07-02 6:54 ` Visuwesh
0 siblings, 1 reply; 42+ messages in thread
From: Eli Zaretskii @ 2022-07-02 6:35 UTC (permalink / raw)
To: Visuwesh; +Cc: 56323
> From: Visuwesh <visuweshm@gmail.com>
> Cc: 56323@debbugs.gnu.org
> Date: Sat, 02 Jul 2022 09:32:34 +0530
>
> > Please tell what is the actual order of the characters. That is,
> > where is that order defined, and by what criteria?
>
> I'm not sure what you mean "where is that order defined," I don't think
> there is a definition per se, it just happens to be so.
>
> There are two "classes" of consonants: those that are part of Tamil
> (let's call them "core") and those borrowed from Sanskrit. When one
> writes the consonants in order, the core consonants come first then the
> Sanskrit ones. You can find the order of the core consonants in
> wikipedia here in the table titled "Tamil consonants":
> https://en.wikipedia.org/wiki/Tamil_script#Letters
>
> We need not worry too much about the order of Sanskrit consonants, we
> just need to ensure that they come after the core consonants. You can
> find these Sanskrit consonants in the table titled "Grantha consonants
> in Tamil" in the same link.
>
> I hope this is clear.
>
> As for the criteria, it is simply "Tamil consonants then the Sanskrit
> consonants."
Then your comparison function should first see whether a character is
in the former or the latter group, and use string-lessp or character
codepoint comparison with each group, right? But that's not what you
did, so I wonder whether my understanding is correct.
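To make the suggestion concrete, here is a rough sketch of such a two-level comparison. The names are hypothetical, and the list of borrowed consonants is an assumption based on the "Grantha consonants in Tamil" table mentioned above; whether codepoint order inside each group is the order actually wanted is a separate question:

```elisp
;; Assumption: these letters form the borrowed (Grantha) group;
;; anything else is treated as a core consonant.
(defvar my-grantha-consonants '("ஜ" "ஶ" "ஷ" "ஸ" "ஹ")
  "Hypothetical list of consonants that should sort last.")

(defun my-group (s)
  "Return 1 if S is in `my-grantha-consonants', else 0."
  (if (member s my-grantha-consonants) 1 0))

(defun my-grouped-lessp (x y)
  "Compare X and Y by group first, then by `string-lessp'."
  (let ((gx (my-group x))
        (gy (my-group y)))
    (if (/= gx gy)
        (< gx gy)
      (string-lessp x y))))

(sort (list "ஜ" "க" "ஸ" "ங") #'my-grouped-lessp)
;; => ("க" "ங" "ஜ" "ஸ")
```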
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-02 6:35 ` Eli Zaretskii
@ 2022-07-02 6:54 ` Visuwesh
2022-07-02 7:17 ` Eli Zaretskii
0 siblings, 1 reply; 42+ messages in thread
From: Visuwesh @ 2022-07-02 6:54 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 56323
[சனி ஜூலை 02, 2022] Eli Zaretskii wrote:
>> From: Visuwesh <visuweshm@gmail.com>
>> Cc: 56323@debbugs.gnu.org
>> Date: Sat, 02 Jul 2022 09:32:34 +0530
>>
>> > Please tell what is the actual order of the characters. That is,
>> > where is that order defined, and by what criteria?
>>
>> I'm not sure what you mean "where is that order defined," I don't think
>> there is a definition per se, it just happens to be so.
>>
>> There are two "classes" of consonants: those that are part of Tamil
>> (let's call them "core") and those borrowed from Sanskrit. When one
>> writes the consonants in order, the core consonants come first then the
>> Sanskrit ones. You can find the order of the core consonants in
>> wikipedia here in the table titled "Tamil consonants":
>> https://en.wikipedia.org/wiki/Tamil_script#Letters
>>
>> We need not worry too much about the order of Sanskrit consonants, we
>> just need to ensure that they come after the core consonants. You can
>> find these Sanskrit consonants in the table titled "Grantha consonants
>> in Tamil" in the same link.
>>
>> I hope this is clear.
>>
>> As for the criteria, it is simply "Tamil consonants then the Sanskrit
>> consonants."
>
> Then your comparison function should first see whether a character is
> in the former or the latter group, and use string-lessp or character
> codepoint comparison with each group, right? But that's not what you
> did, so I wonder whether my understanding is correct.
It didn't occur to me to do it this way, so I tried it out, but then I
noticed that string-lessp won't work even within a group. When you evaluate
the following sexp, you don't get a list of increasing numbers...
(let ((core-consonants '("க" "ங" "ச" "ஞ" "ட" "ண" "த"
"ந" "ப" "ம" "ய" "ர" "ல"
"வ" "ழ" "ள" "ற" "ன")))
(mapcar (lambda (c) (string-to-char c)) core-consonants))
;; => (2965 2969 2970 2974 2975 2979 2980 2984 2986 2990 2991 2992
;;     2994 2997 2996 2995 2993 2985)
and sure enough when you do (sort core-consonants #'string-lessp) the
list is jumbled up instead of retaining the order.
[ core-consonants, as declared, is in the right order but sort jumbles
it up. ]
But string-lessp works for vowels. It is the consonants that are the
problem.
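A quick way to see where the traditional order and the codepoint order part ways is a small predicate like the one below (`my-codepoint-ordered-p' is a made-up name for this sketch):

```elisp
;; Hypothetical helper: check whether a list is already in strictly
;; increasing codepoint order, judging by each string's first character.
(defun my-codepoint-ordered-p (strings)
  "Return non-nil if STRINGS have strictly increasing first codepoints."
  (apply #'< (mapcar #'string-to-char strings)))

;; The first thirteen core consonants agree with codepoint order...
(my-codepoint-ordered-p
 '("க" "ங" "ச" "ஞ" "ட" "ண" "த" "ந" "ப" "ம" "ய" "ர" "ல"))  ; => t

;; ...but the tail of the traditional order does not.
(my-codepoint-ordered-p '("ல" "வ" "ழ" "ள" "ற" "ன"))         ; => nil
```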
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-01 16:37 ` Visuwesh
2022-07-01 18:16 ` Eli Zaretskii
@ 2022-07-02 6:58 ` Eli Zaretskii
2022-07-02 7:58 ` Visuwesh
1 sibling, 1 reply; 42+ messages in thread
From: Eli Zaretskii @ 2022-07-02 6:58 UTC (permalink / raw)
To: Visuwesh; +Cc: 56323
> From: Visuwesh <visuweshm@gmail.com>
> Cc: 56323@debbugs.gnu.org
> Date: Fri, 01 Jul 2022 22:07:38 +0530
>
> >> BTW, do you have any other code/documentation review? And what about
> >> the patch I posted in https://lists.gnu.org/archive/html/bug-gnu-emacs/2022-06/msg02256.html?
> >> No rush but I would like to know if it can go in since it only addresses
> >> fallouts from the previous bug in this area. Thanks.
> >
> > It sounded to me like you are still working on the code, so I didn't
> > see a need to review it. If you have specific parts that you'd like
> > me to review nonetheless, please tell which parts are those.
>
> Thanks. The patch I posted in
> https://lists.gnu.org/archive/html/bug-gnu-emacs/2022-06/msg02256.html
> is done, and can be pushed to master if you see no problems.
I installed it, thanks.
> Also, I would like to know if there's a better to write the :set
> function for the defcustoms tamil-vowel-translation,
> tamil-consonant-translation, tamil-misc-translation, tamil-native-digits
> without the boundp check chain below,
>
> (defun tamil--set-variable (sym val)
> (set-default sym val)
> (when (and (boundp 'tamil-vowel-translation)
> (boundp 'tamil-consonant-translation)
> (boundp 'tamil-misc-translation)
> (boundp 'tamil-native-digits))
> (tamil--update-quail-rules)))
Why do you need a single function for all of them? Would a separate
setter function for each defcustom do the job?
I also don't understand the need for the boundp tests -- the function
will live in the same indian.el file as the defcustoms, so if the
function is defined, the defcustoms are also bound, no?
> I'm also doubtful about the current group being used for these
> defcustoms. Should I go ahead and make a new 'tamil' group and make it
> a subgroup of leim or i18n?
It's okay to have a separate group, but what would be the subject of
this group? If it's just about input methods, the name had better
reflect that, and just "tamil" is too general for that.
> And is the prefix tamil- okay or should I change it to something
> else?
I see no problem with 'tamil-'.
> Finally, I'm unsure if "List of input sequences to translate to ..." is
> clear. I think it sounds a mouthful and there should be a better way to
> put it. I think "translation rules" is quite nice but I'm afraid that
> it is too Quail specific and might not be well understood.
I have no problem with that wording, but I wonder whether we should
have these defcustoms in the first place. What are the chances that
some user will want to change the sequences, and why would they want
that?
P.S. Please in the future don't modify the Subject of the messages in
the same bug report: that makes it harder to find related messages at
least when using Rmail.
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-02 6:54 ` Visuwesh
@ 2022-07-02 7:17 ` Eli Zaretskii
2022-07-02 7:35 ` Eli Zaretskii
2022-07-02 8:11 ` Visuwesh
0 siblings, 2 replies; 42+ messages in thread
From: Eli Zaretskii @ 2022-07-02 7:17 UTC (permalink / raw)
To: Visuwesh; +Cc: 56323
> From: Visuwesh <visuweshm@gmail.com>
> Cc: 56323@debbugs.gnu.org
> Date: Sat, 02 Jul 2022 12:24:39 +0530
>
> [சனி ஜூலை 02, 2022] Eli Zaretskii wrote:
>
> >> There are two "classes" of consonants: those that are part of Tamil
> >> (let's call them "core") and those borrowed from Sanskrit. When one
> >> writes the consonants in order, the core consonants come first then the
> >> Sanskrit ones. You can find the order of the core consonants in
> >> wikipedia here in the table titled "Tamil consonants":
> >> https://en.wikipedia.org/wiki/Tamil_script#Letters
> >>
> >> We need not worry too much about the order of Sanskrit consonants, we
> >> just need to ensure that they come after the core consonants. You can
> >> find these Sanskrit consonants in the table titled "Grantha consonants
> >> in Tamil" in the same link.
> >>
> >> I hope this is clear.
> >>
> >> As for the criteria, it is simply "Tamil consonants then the Sanskrit
> >> consonants."
> >
> > Then your comparison function should first see whether a character is
> > in the former or the latter group, and use string-lessp or character
> > codepoint comparison with each group, right? But that's not what you
> > did, so I wonder whether my understanding is correct.
>
> It didn't occur to me to do it this way so I tried it out but then I
> noticed, string-lessp even within a group won't work. When you evaluate
> the following sexp, you don't get a list of increasing numbers...
>
> (let ((core-consonants '("க" "ங" "ச" "ஞ" "ட" "ண" "த"
> "ந" "ப" "ம" "ய" "ர" "ல"
> "வ" "ழ" "ள" "ற" "ன")))
> (mapcar (lambda (c) (string-to-char c)) core-consonants))
>
> ;; => (2965 2969 2970 2974 2975 2979 2980 2984 2986 2990 2991 2992
> 2994 2997 2996 2995 2993 2985)
>
> and sure enough when you do (sort core-consonants #'string-lessp) the
> list is jumbled up instead of retaining the order.
> [ core-consonants, as declared, is in the right order but sort jumbles
> it up. ]
>
> But string-lessp works for vowels. It is the consonants that is the
> problem.
Sorry, I don't understand what you are saying here. How is the above
code related to the issue at hand, which is how to sort characters in
the order you want them to be sorted? (And please keep in mind that I
don't even know which of those characters are consonants and which are
vowels -- if you want me to say something intelligent about that.)
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-02 7:17 ` Eli Zaretskii
@ 2022-07-02 7:35 ` Eli Zaretskii
2022-07-02 7:46 ` Eli Zaretskii
2022-07-02 8:11 ` Visuwesh
1 sibling, 1 reply; 42+ messages in thread
From: Eli Zaretskii @ 2022-07-02 7:35 UTC (permalink / raw)
To: visuweshm; +Cc: 56323
> Cc: 56323@debbugs.gnu.org
> Date: Sat, 02 Jul 2022 10:17:56 +0300
> From: Eli Zaretskii <eliz@gnu.org>
>
> > (let ((core-consonants '("க" "ங" "ச" "ஞ" "ட" "ண" "த"
> > "ந" "ப" "ம" "ய" "ர" "ல"
> > "வ" "ழ" "ள" "ற" "ன")))
> > (mapcar (lambda (c) (string-to-char c)) core-consonants))
> >
> > ;; => (2965 2969 2970 2974 2975 2979 2980 2984 2986 2990 2991 2992
> > 2994 2997 2996 2995 2993 2985)
> >
> > and sure enough when you do (sort core-consonants #'string-lessp) the
> > list is jumbled up instead of retaining the order.
> > [ core-consonants, as declared, is in the right order but sort jumbles
> > it up. ]
> >
> > But string-lessp works for vowels. It is the consonants that is the
> > problem.
>
> Sorry, I don't understand what you are saying here. How is the above
> code related to the issue at hand, which is how to sort characters in
> the order you want them to be sorted? (And please keep in mind that I
> don't even know which of those characters are consonants and which are
> vowels -- if you want me to say something intelligent about that.)
Or maybe my guess below will be lucky. You probably want this:
(defun sort-by-codepoint (c1 c2)
(< (string-to-char c1) (string-to-char c2)))
(let ((core-consonants '("க" "ங" "ச" "ஞ" "ட" "ண" "த"
"ந" "ப" "ம" "ய" "ர" "ல"
"வ" "ழ" "ள" "ற" "ன")))
(sort core-consonants 'sort-by-codepoint))
=> ("க" "ங" "ச" "ஞ" "ட" "ண" "த" "ந" "ன" "ப" "ம" "ய" "ர" "ற" "ல" "ள" "ழ" "வ")
(To understand why, read the doc string of 'sort' carefully, where it
explains what is expected from PREDICATE.)
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-02 7:35 ` Eli Zaretskii
@ 2022-07-02 7:46 ` Eli Zaretskii
0 siblings, 0 replies; 42+ messages in thread
From: Eli Zaretskii @ 2022-07-02 7:46 UTC (permalink / raw)
To: visuweshm; +Cc: 56323
> Cc: 56323@debbugs.gnu.org
> Date: Sat, 02 Jul 2022 10:35:18 +0300
> From: Eli Zaretskii <eliz@gnu.org>
>
> (defun sort-by-codepoint (c1 c2)
> (< (string-to-char c1) (string-to-char c2)))
>
> (let ((core-consonants '("க" "ங" "ச" "ஞ" "ட" "ண" "த"
> "ந" "ப" "ம" "ய" "ர" "ல"
> "வ" "ழ" "ள" "ற" "ன")))
>
> (sort core-consonants 'sort-by-codepoint))
> => ("க" "ங" "ச" "ஞ" "ட" "ண" "த" "ந" "ன" "ப" "ம" "ய" "ர" "ற" "ல" "ள" "ழ" "வ")
>
> (To understand why, read the doc string of 'sort' carefully, where it
> explains what is expected from PREDICATE.)
Hmm... but if I use string-lessp instead of sort-by-codepoint, I get
the same result, as I'd expect. Which probably means I'm still
missing something.
^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-02 6:58 ` Eli Zaretskii
@ 2022-07-02 7:58 ` Visuwesh
2022-07-02 8:39 ` Eli Zaretskii
0 siblings, 1 reply; 42+ messages in thread
From: Visuwesh @ 2022-07-02 7:58 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 56323
[Sat Jul 02, 2022] Eli Zaretskii wrote:
>> From: Visuwesh <visuweshm@gmail.com>
>> Cc: 56323@debbugs.gnu.org
>> Date: Fri, 01 Jul 2022 22:07:38 +0530
>>
>> >> BTW, do you have any other code/documentation review? And what about
>> >> the patch I posted in https://lists.gnu.org/archive/html/bug-gnu-emacs/2022-06/msg02256.html?
>> >> No rush but I would like to know if it can go in since it only addresses
>> >> fallouts from the previous bug in this area. Thanks.
>> >
>> > It sounded to me like you are still working on the code, so I didn't
>> > see a need to review it. If you have specific parts that you'd like
>> > me to review nonetheless, please tell which parts are those.
>>
>> Thanks. The patch I posted in
>> https://lists.gnu.org/archive/html/bug-gnu-emacs/2022-06/msg02256.html
>> is done, and can be pushed to master if you see no problems.
>
> I installed it, thanks.
>
Thanks.
>> Also, I would like to know if there's a better to write the :set
>> function for the defcustoms tamil-vowel-translation,
>> tamil-consonant-translation, tamil-misc-translation, tamil-native-digits
>> without the boundp check chain below,
>>
>> (defun tamil--set-variable (sym val)
>> (set-default sym val)
>> (when (and (boundp 'tamil-vowel-translation)
>> (boundp 'tamil-consonant-translation)
>> (boundp 'tamil-misc-translation)
>> (boundp 'tamil-native-digits))
>> (tamil--update-quail-rules)))
>
> Why do you need a single function for all of them? Would a separate
> setter function for each defcustom do the job?
>
Because it is harder to clear the old translation rules and add the new
translation rules than clearing ALL translation rules and starting over
again. When the user changes tamil-vowel-translation, then not only
does the translation rule for the vowels change, we also need to change
the translation rules for consonant+vowel pairs so that means we need to
check if the consonant var is bound. (The translation rules for
consonant+vowel pairs are auto-generated based on the rules for vowels
and consonants.)
Similarly, when the consonant defcustom changes, we need to change both
the consonant and the consonant+vowel pair translation rules. Moreover,
if the user decides to delete an extra consonant translation, then we
need to smartly detect that and delete it from the current quail map.
Compared with all this, clearing ALL the rules and starting over is much
simpler. And since this approach doesn't take too much time, I don't
think implementing the smarter approach would be worth it.
Besides, even if the smart approach were easy to implement, the quail-map
structure is just too hard to manipulate by hand...
> I also don't understand the need for the boundp tests -- the function
> will live on the same indian.el file as the defcustoms, so if the
> function is defined, the defcustoms are also bound, no?
>
IIUC, when we load indian.el, first, the vowel defcustom will be bound,
then the consonant defcustom and so on. So this boundp test is needed,
I think? See above for why the defcustoms have a "dependency" on each
other. When the vowel defcustom is loaded, then its job _sometimes_
depends on the consonant defcustom being bound as well.
I say sometimes because when we initially load the vowel defcustom,
having a separate setter should be fine but when we change it after
loading _all_ the other defcustoms (example in the Customize interface),
we also need to access the consonant translation values and update the
translation rules for consonant+vowel pairs. A big fat setter function
that does everything at the cost of boundp checks is simpler AFAIU.
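To make the load-order concern concrete, here is a minimal sketch of one
shared setter that rebuilds every derived rule from scratch. It is only
an illustration: every my-tamil-* name is hypothetical and not taken from
the patch, and the "rules" are a plain alist rather than a real Quail map.

```elisp
;; Hypothetical sketch: every my-tamil-* name is made up for illustration.
(defgroup my-tamil-input nil
  "Translation rules for a toy Tamil input method."
  :group 'leim)

(defvar my-tamil--rules nil
  "Alist of (KEYS . OUTPUT) for consonant+vowel pairs.")

(defun my-tamil--rebuild-rules ()
  "Throw away all derived rules and regenerate them from both options."
  (setq my-tamil--rules
        (mapcan (lambda (c)
                  (mapcar (lambda (v)
                            ;; Key sequences and outputs are simply joined.
                            (cons (concat (car c) (car v))
                                  (concat (cdr c) (cdr v))))
                          my-tamil-vowel-signs))
                my-tamil-consonants)))

(defun my-tamil--set-option (sym val)
  "Set SYM to VAL, then rebuild the rules once both options are bound."
  (set-default sym val)
  ;; While the file is loading, the first defcustom runs this setter
  ;; before the second option exists, hence the guard.
  (when (and (boundp 'my-tamil-vowel-signs)
             (boundp 'my-tamil-consonants))
    (my-tamil--rebuild-rules)))

(defcustom my-tamil-vowel-signs '(("a" . "") ("aa" . "\u0BBE"))
  "Alist of (KEYS . VOWEL-SIGN); the empty sign is the inherent vowel."
  :type '(alist :key-type string :value-type string)
  :set #'my-tamil--set-option
  :group 'my-tamil-input)

(defcustom my-tamil-consonants '(("k" . "\u0B95") ("l" . "\u0BB2"))
  "Alist of (KEYS . CONSONANT)."
  :type '(alist :key-type string :value-type string)
  :set #'my-tamil--set-option
  :group 'my-tamil-input)
```

Changing either option through Customize or customize-set-variable then
regenerates all pair rules; a plain setq would bypass the setter.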
>> I'm also doubtful about the current group being used for these
>> defcustoms. Should I go ahead and make a new 'tamil' group and make it
>> a subgroup of leim or i18n?
>
> It's okay to have a separate group, but what would be the subject of
> this group? If it's just about input methods, the name had better
> reflected that, and just "tamil" is too general for that.
>
I thought the subject could be "Translation rules for the Tamil input
method." If you think the group name is too general, then "tamil-im"
could work?
>> And is the prefix tamil- okay or should I change it to something
>> else?
>
> I see no problem with 'tamil-'.
>
Okay, thanks.
>> Finally, I'm unsure if "List of input sequences to translate to ..." is
>> clear. I think it sounds a mouthful and there should be a better way to
>> put it. I think "translation rules" is quite nice but I'm afraid that
>> it is too Quail specific and might not be well understood.
>
> I have no problem with that wording, but I wonder whether we should
> have these defcustoms in the first place. What are the chances that
> some user will want to change the sequences, and why would they want
> that?
I think the chances are quite high. As I tried to explain in the first
mail, there are too many ambiguities when transliterating Tamil and
sometimes there is no perfect transliteration for a character/consonant
family.
For example, the user in the wordpress article I linked chooses to
translate ல் as 'l' and ள் as 'll', and takes the penalty of having to type
C-SPC at the right time: to write ல்ல the sequence would be l C-SPC la,
since lla would translate to ள.
That user can take this penalty but I would rather translate ள் as L
instead and not worry about C-SPC at all.
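The trade-off between the two schemes can be illustrated with a toy
greedy longest-match translator. This is only a simplified model and not
how Quail actually works; the function name, the rule tables, and the use
of a space to stand in for C-SPC are all assumptions made for the sketch.

```elisp
;; Simplified model only; Quail's real matching logic is more involved.
(defvar my-rules-ll
  '(("l" . "ல்") ("ll" . "ள்") ("la" . "ல") ("lla" . "ள"))
  "Hypothetical scheme where ள் is typed as \"ll\".")

(defvar my-rules-capital
  '(("l" . "ல்") ("L" . "ள்") ("la" . "ல") ("La" . "ள"))
  "Hypothetical scheme where ள் is typed as \"L\".")

(defun my-translit (rules keys)
  "Translate KEYS greedily with RULES, an alist of (INPUT . OUTPUT).
At each position the longest matching INPUT wins.  A space in KEYS
stands in for C-SPC: it ends the current match and emits nothing."
  (let ((pos 0) (out ""))
    (while (< pos (length keys))
      (if (eq (aref keys pos) ?\s)
          (setq pos (1+ pos))
        (let ((best nil))
          (dolist (rule rules)
            (let ((end (+ pos (length (car rule)))))
              (when (and (> (length (car rule)) 0)
                         (<= end (length keys))
                         (string= (car rule) (substring keys pos end))
                         (or (null best)
                             (> (length (car rule)) (length (car best)))))
                (setq best rule))))
          (if best
              (setq out (concat out (cdr best))
                    pos (+ pos (length (car best))))
            ;; No rule matches: pass the key through unchanged.
            (setq out (concat out (substring keys pos (1+ pos)))
                  pos (1+ pos))))))
    out))
```

With my-rules-ll, the input "lla" yields ள and the break in "l la" is
needed to get ல்ல; with my-rules-capital, "lla" gives ல்ல directly.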
Bottom line, there is no one size fits all. These small annoyances can
be dealt with when one writes Tamil rarely but for frequent writing, the
flexibility this input method offers will be welcome IMO.
The users _can_ update the quail-map themselves by hand but that becomes
tricky and a REAL chore for a language like Tamil.
[ FWIW, I add new translations and modify existing translations for the
compose input method by setf-ing its quail map. That is hard enough
already, and I definitely wouldn't wish someone to do it for the Tamil
input method. Offering a defcustom is the least we can do to ease the
pain of tweaking the translation rules. ]
> P.S. Please in the future don't modify the Subject of the messages in
> the same bug report: that makes it harder to find related messages at
> least when using Rmail.
Oops, sorry about that. I thought it would be easier to track the
progress but I guess it misfired.
^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-02 7:17 ` Eli Zaretskii
2022-07-02 7:35 ` Eli Zaretskii
@ 2022-07-02 8:11 ` Visuwesh
2022-07-02 8:29 ` Eli Zaretskii
1 sibling, 1 reply; 42+ messages in thread
From: Visuwesh @ 2022-07-02 8:11 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 56323
[Sat Jul 02, 2022] Eli Zaretskii wrote:
>> From: Visuwesh <visuweshm@gmail.com>
>> Cc: 56323@debbugs.gnu.org
>> Date: Sat, 02 Jul 2022 12:24:39 +0530
>>
>> [Sat Jul 02, 2022] Eli Zaretskii wrote:
>>
>> >> There are two "classes" of consonants: those that are part of Tamil
>> >> (let's call them "core") and those borrowed from Sanskrit. When one
>> >> writes the consonants in order, the core consonants come first then the
>> >> Sanskrit ones. You can find the order of the core consonants in
>> >> wikipedia here in the table titled "Tamil consonants":
>> >> https://en.wikipedia.org/wiki/Tamil_script#Letters
>> >>
>> >> We need not worry too much about the order of Sanskrit consonants, we
>> >> just need to ensure that they come after the core consonants. You can
>> >> find these Sanskrit consonants in the table titled "Grantha consonants
>> >> in Tamil" in the same link.
>> >>
>> >> I hope this is clear.
>> >>
>> >> As for the criteria, it is simply "Tamil consonants then the Sanskrit
>> >> consonants."
>> >
>> > Then your comparison function should first see whether a character is
>> > in the former or the latter group, and use string-lessp or character
>> > codepoint comparison with each group, right? But that's not what you
>> > did, so I wonder whether my understanding is correct.
>>
>> It didn't occur to me to do it this way so I tried it out but then I
>> noticed, string-lessp even within a group won't work. When you evaluate
>> the following sexp, you don't get a list of increasing numbers...
>>
>> (let ((core-consonants '("க" "ங" "ச" "ஞ" "ட" "ண" "த"
>> "ந" "ப" "ம" "ய" "ர" "ல"
>> "வ" "ழ" "ள" "ற" "ன")))
>> (mapcar (lambda (c) (string-to-char c)) core-consonants))
>>
>> ;; => (2965 2969 2970 2974 2975 2979 2980 2984 2986 2990 2991 2992
>> 2994 2997 2996 2995 2993 2985)
>>
>> and sure enough when you do (sort core-consonants #'string-lessp) the
>> list is jumbled up instead of retaining the order.
>> [ core-consonants, as declared, is in the right order but sort jumbles
>> it up. ]
>>
>> But string-lessp works for vowels. It is the consonants that is the
>> problem.
>
> Sorry, I don't understand what you are saying here. How is the above
> code related to the issue at hand, which is how to sort characters in
> the order you want them to be sorted? (And please keep in mind that I
> don't even know which of those characters are consonants and which are
> vowels -- if you want me to say something intelligent about that.)
I'm trying to explain the behaviour of string-lessp, which seems to sort
the characters by their Unicode codepoints. But the order in which these
characters appear in Unicode and their actual order are not the same, so
string-lessp does not do the job we want it to.
[Sat Jul 02, 2022] Eli Zaretskii wrote:
>
> Or maybe my guess below will be lucky. You probably want this:
>
> (defun sort-by-codepoint (c1 c2)
> (< (string-to-char c1) (string-to-char c2)))
>
> (let ((core-consonants '("க" "ங" "ச" "ஞ" "ட" "ண" "த"
> "ந" "ப" "ம" "ய" "ர" "ல"
> "வ" "ழ" "ள" "ற" "ன")))
>
> (sort core-consonants 'sort-by-codepoint))
> => ("க" "ங" "ச" "ஞ" "ட" "ண" "த" "ந" "ன" "ப" "ம" "ய" "ர" "ற" "ல" "ள" "ழ" "வ")
>
> (To understand why, read the doc string of 'sort' carefully, where it
> explains what is expected from PREDICATE.)
Unfortunately not, since it jumbles up the list. The desired outcome is
the same list.
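One way to get "the same list" back is to compare positions in an
explicit reference list instead of codepoints. The following is only a
sketch under that assumption; the variable and function names are
hypothetical, and the reference order is the one quoted in this thread.

```elisp
(require 'seq)

;; Reference order copied from the list quoted in this thread.
(defvar my-tamil-consonant-order
  '("க" "ங" "ச" "ஞ" "ட" "ண" "த" "ந" "ப"
    "ம" "ய" "ர" "ல" "வ" "ழ" "ள" "ற" "ன")
  "Core Tamil consonants in their traditional order.")

(defun my-tamil-consonant-lessp (c1 c2)
  "Non-nil if C1 comes before C2 in `my-tamil-consonant-order'.
Strings missing from the reference list sort after all listed ones,
and among themselves by `string-lessp'."
  (let ((r1 (seq-position my-tamil-consonant-order c1))
        (r2 (seq-position my-tamil-consonant-order c2)))
    (cond ((and r1 r2) (< r1 r2))
          (r1 t)
          (r2 nil)
          (t (string-lessp c1 c2)))))

;; Sort a shuffled copy; `sort' is destructive, so never pass the
;; reference list itself.
(sort (reverse my-tamil-consonant-order) #'my-tamil-consonant-lessp)
```

The predicate is a lookup table in disguise, which fits an order that is
defined by convention rather than by codepoint.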
^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-02 8:11 ` Visuwesh
@ 2022-07-02 8:29 ` Eli Zaretskii
2022-07-02 8:40 ` Visuwesh
0 siblings, 1 reply; 42+ messages in thread
From: Eli Zaretskii @ 2022-07-02 8:29 UTC (permalink / raw)
To: Visuwesh; +Cc: 56323
> From: Visuwesh <visuweshm@gmail.com>
> Cc: 56323@debbugs.gnu.org
> Date: Sat, 02 Jul 2022 13:41:17 +0530
>
> > (defun sort-by-codepoint (c1 c2)
> > (< (string-to-char c1) (string-to-char c2)))
> >
> > (let ((core-consonants '("க" "ங" "ச" "ஞ" "ட" "ண" "த"
> > "ந" "ப" "ம" "ய" "ர" "ல"
> > "வ" "ழ" "ள" "ற" "ன")))
> >
> > (sort core-consonants 'sort-by-codepoint))
> > => ("க" "ங" "ச" "ஞ" "ட" "ண" "த" "ந" "ன" "ப" "ம" "ய" "ர" "ற" "ல" "ள" "ழ" "வ")
> >
> > (To understand why, read the doc string of 'sort' carefully, where it
> > explains what is expected from PREDICATE.)
>
> Unfortunately not, since it jumbles up the list. The desired outcome is
> the same list.
But we already established that you need to break the list in two, and
always sort any member of one of the two sub-lists before any member
of the other sub-list. I then suggested to use string-lessp _within_
each sub-list, but you said it still yielded a wrong order for some
reason.
So when you now return to the issue of splitting the list in two, and
show how sorting the full list doesn't work, you make a step back: we
already established the list cannot be sorted as a single list. The
only remaining issue, AFAIU, is why string-lessp is not good enough
for sorting within each sub-list.
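The two-sub-list idea can be sketched as a predicate that checks group
membership first and falls back to string-lessp within a group. The
names are hypothetical, and the Grantha set below is a partial list
assumed for the example.

```elisp
(defvar my-grantha-set '("ஜ" "ஷ" "ஸ" "ஹ")
  "Borrowed consonants that must sort after the core ones (partial list).")

(defun my-grouped-lessp (c1 c2)
  "Sort core consonants before Grantha ones; `string-lessp' within a group."
  (let ((g1 (member c1 my-grantha-set))
        (g2 (member c2 my-grantha-set)))
    (cond ((and g1 (not g2)) nil)   ; Grantha never precedes core
          ((and g2 (not g1)) t)     ; core always precedes Grantha
          (t (string-lessp c1 c2)))))

(sort (list "ஹ" "ன" "ஜ" "ப" "ந") #'my-grouped-lessp)
```

This keeps the borrowed letters at the end, but within the core group ன
still lands between ந and ப because string-lessp follows codepoints,
which is the remaining within-group question.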
^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-02 7:58 ` Visuwesh
@ 2022-07-02 8:39 ` Eli Zaretskii
2022-07-02 9:28 ` Visuwesh
0 siblings, 1 reply; 42+ messages in thread
From: Eli Zaretskii @ 2022-07-02 8:39 UTC (permalink / raw)
To: Visuwesh; +Cc: 56323
> From: Visuwesh <visuweshm@gmail.com>
> Cc: 56323@debbugs.gnu.org
> Date: Sat, 02 Jul 2022 13:28:29 +0530
>
> >> Also, I would like to know if there's a better to write the :set
> >> function for the defcustoms tamil-vowel-translation,
> >> tamil-consonant-translation, tamil-misc-translation, tamil-native-digits
> >> without the boundp check chain below,
> >>
> >> (defun tamil--set-variable (sym val)
> >> (set-default sym val)
> >> (when (and (boundp 'tamil-vowel-translation)
> >> (boundp 'tamil-consonant-translation)
> >> (boundp 'tamil-misc-translation)
> >> (boundp 'tamil-native-digits))
> >> (tamil--update-quail-rules)))
> >
> > Why do you need a single function for all of them? Would a separate
> > setter function for each defcustom do the job?
> >
>
> Because it is harder to clear the old translation rules and add the new
> translation rules than clearing ALL translation rules and starting over
> again. When the user changes tamil-vowel-translation, then not only
> does the translation rule for the vowels change, we also need to change
> the translation rules for consonant+vowel pairs so that means we need to
> check if the consonant var is bound. (The translation rules for
> consonant+vowel pairs are auto-generated based on the rules for vowels
> and consonants.)
If the rules are generated based on both defcustom's, then shouldn't
we have just one defcustom for both? IOW, what is the purpose of
having two separate defcustom's here?
> > I also don't understand the need for the boundp tests -- the function
> > will live on the same indian.el file as the defcustoms, so if the
> > function is defined, the defcustoms are also bound, no?
> >
>
> IIUC, when we load indian.el, first, the vowel defcustom will be bound,
> then the consonant defcustom and so on. So this boundp test is needed,
> I think?
Wouldn't that be fixed by having the setter function defined before
the defcustom's?
> See above for why the defcustoms have a "dependency" on each
> other. When the vowel defcustom is loaded, then its job _sometimes_
> depends on the consonant defcustom being bound as well.
Since the defcustom's have their default value, I don't think I see
the problem. Did you actually see any problems, and if so, in which
scenario, and what were the error messages?
> I thought the subject could be "Translation rules for the Tamil input
> method." If you think the group name is too general, then "tamil-im"
> could work?
tamil-input, perhaps?
^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-02 8:29 ` Eli Zaretskii
@ 2022-07-02 8:40 ` Visuwesh
2022-07-02 8:54 ` Eli Zaretskii
0 siblings, 1 reply; 42+ messages in thread
From: Visuwesh @ 2022-07-02 8:40 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 56323
[Sat Jul 02, 2022] Eli Zaretskii wrote:
>> From: Visuwesh <visuweshm@gmail.com>
>> Cc: 56323@debbugs.gnu.org
>> Date: Sat, 02 Jul 2022 13:41:17 +0530
>>
>> > (defun sort-by-codepoint (c1 c2)
>> > (< (string-to-char c1) (string-to-char c2)))
>> >
>> > (let ((core-consonants '("க" "ங" "ச" "ஞ" "ட" "ண" "த"
>> > "ந" "ப" "ம" "ய" "ர" "ல"
>> > "வ" "ழ" "ள" "ற" "ன")))
>> >
>> > (sort core-consonants 'sort-by-codepoint))
>> > => ("க" "ங" "ச" "ஞ" "ட" "ண" "த" "ந" "ன" "ப" "ம" "ய" "ர" "ற" "ல" "ள" "ழ" "வ")
>> >
>> > (To understand why, read the doc string of 'sort' carefully, where it
>> > explains what is expected from PREDICATE.)
>>
>> Unfortunately not, since it jumbles up the list. The desired outcome is
>> the same list.
>
> But we already established that you need to break the list in two, and
> always sort any member of one of the two sub-lists before any member
> of the other sub-list. I then suggested to use string-lessp _within_
> each sub-list, but you said it still yielded a wrong order for some
> reason.
>
Yes, I hope I made my point clear below.
> So when you now return to the issue of splitting the list in two, and
> show how sorting the full list doesn't work, you make a step back: we
> already established the list cannot be sorted as a single list.
I think I might not have made my point clear: the sort function above
sorts one of the sub-lists.
> The only remaining issue, AFAIU, is why string-lessp is not good
> enough for sorting within each sub-list.
It is not good enough for each sub-list for the same reason: the order
produced by string-lessp is not the same as the actual order.
I will try to explain the situation using the regular English alphabet
and the extra letter þ (which was used in place of "th" AFAIU).
The core English letters are a-z, and then we have some extra letters
like the þ above. When we have a list containing _both_ a-z and þ, the
order produced by string-lessp is wrong. To work around this issue, we
decided to break the list into two. I think we were on the same page
till here.
When I did as you suggested and broke the list into two -- a-z and þ --
and sorted the sub-list that only contained a-z with string-lessp, the
sorted sub-list was not in the right alphabetical order, i.e., instead of
"a b c d ..." it was "a c b d ..."
I hope the above makes the situation clear.
^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-02 8:40 ` Visuwesh
@ 2022-07-02 8:54 ` Eli Zaretskii
2022-07-02 9:33 ` Visuwesh
0 siblings, 1 reply; 42+ messages in thread
From: Eli Zaretskii @ 2022-07-02 8:54 UTC (permalink / raw)
To: Visuwesh; +Cc: 56323
> From: Visuwesh <visuweshm@gmail.com>
> Cc: 56323@debbugs.gnu.org
> Date: Sat, 02 Jul 2022 14:10:07 +0530
>
> > The only remaining issue, AFAIU, is why string-lessp is not good
> > enough for sorting within each sub-list.
>
> It is not good enough for each sub-list for the same reason: the order
> produced by string-lessp is not the same as the actual order.
So, then please explain what should be the "correct" order within each
sub-list. Is the correct order within each sub-list in the ascending
order of the codepoint? If not, what is the correct order?
> I will try to explain the situation using the regular English alphabets
> and the extra letter þ (which was used in place of "th" AFAIU).
>
> The core English alphabets are a-z then we have some extra alphabets
> like the þ above. When we have a list containing _both_ a-z and þ, the
> order produced by string-lessp is wrong.
>
> When I did as you suggested and broke the list into two -- a-z and þ --
> and sorted the sub-list that only contained a-z with string-lessp, the
> sorted sub-list was not in the right alphabetical order i.e., instead of
> "a b c d ..." it was "a c b d ..."
That's not what I see:
(let ((letters '("a" "b" "r" "x" "z")))
(sort letters 'string-lessp))
=> ("a" "b" "r" "x" "z")
Please show an example where characters a-z are sorted by string-lessp
in the wrong order.
^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-02 8:39 ` Eli Zaretskii
@ 2022-07-02 9:28 ` Visuwesh
2022-07-10 3:56 ` Visuwesh
0 siblings, 1 reply; 42+ messages in thread
From: Visuwesh @ 2022-07-02 9:28 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 56323
[Sat Jul 02, 2022] Eli Zaretskii wrote:
>> > Why do you need a single function for all of them? Would a separate
>> > setter function for each defcustom do the job?
>> >
>>
>> Because it is harder to clear the old translation rules and add the new
>> translation rules than clearing ALL translation rules and starting over
>> again. When the user changes tamil-vowel-translation, then not only
>> does the translation rule for the vowels change, we also need to change
>> the translation rules for consonant+vowel pairs so that means we need to
>> check if the consonant var is bound. (The translation rules for
>> consonant+vowel pairs are auto-generated based on the rules for vowels
>> and consonants.)
>
> If the rules are generated based on both defcustom's, then shouldn't
> we have just one defcustom for both? IOW, what is the purpose of
> having two separate defcustom's here?
>
It simply seemed natural to me to separate consonants and vowels. I
combined the three defcustoms (vowels, consonants and misc) as you
suggested, but the native digits defcustom is still a problem... hmm. I
can just leave it to the user to add the native digit translations to
the defcustom if they want.
>> > I also don't understand the need for the boundp tests -- the function
>> > will live on the same indian.el file as the defcustoms, so if the
>> > function is defined, the defcustoms are also bound, no?
>> >
>>
>> IIUC, when we load indian.el, first, the vowel defcustom will be bound,
>> then the consonant defcustom and so on. So this boundp test is needed,
>> I think?
>
> Wouldn't that be fixed by having the setter function defined before
> the defcustom's?
>
>> See above for why the defcustoms have a "dependency" on each
>> other. When the vowel defcustom is loaded, then its job _sometimes_
>> depends on the consonant defcustom being bound as well.
>
> Since the defcustom's have their default value, I don't think I see
> the problem. Did you actually see any problems, and if so, in which
> scenario, and what were the error messages?
>
I was mostly worried about the tamil-native-digits defcustom but that
can be easily avoided.
>> I thought the subject could be "Translation rules for the Tamil input
>> method." If you think the group name is too general, then "tamil-im"
>> could work?
>
> tamil-input, perhaps?
Okay, then. That looks better to me as well.
I will post an updated patch later, once I have cleaned up the comments
and docstrings. Thanks.
^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-02 8:54 ` Eli Zaretskii
@ 2022-07-02 9:33 ` Visuwesh
2022-07-02 9:38 ` Eli Zaretskii
0 siblings, 1 reply; 42+ messages in thread
From: Visuwesh @ 2022-07-02 9:33 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 56323
[Sat Jul 02, 2022] Eli Zaretskii wrote:
>> From: Visuwesh <visuweshm@gmail.com>
>> Cc: 56323@debbugs.gnu.org
>> Date: Sat, 02 Jul 2022 14:10:07 +0530
>>
>> > The only remaining issue, AFAIU, is why string-lessp is not good
>> > enough for sorting within each sub-list.
>>
>> It is not good enough for each sub-list for the same reason: the order
>> produced by string-lessp is not the same as the actual order.
>
> So, then please explain what should be the "correct" order within each
> sub-list. Is the correct order within each sub-list in the ascending
> order of the codepoint? If not, what is the correct order?
>
The correct order is not the ascending order of the codepoint, the
correct order is
க ங ச ஞ ட ண த ந ப ம ய ர ல வ ழ ள ற ன
and their respective codepoints are
2965 2969 2970 2974 2975 2979 2980 2984 2986 2990 2991 2992 2994 2997 2996 2995 2993 2985
>> I will try to explain the situation using the regular English alphabets
>> and the extra letter þ (which was used in place of "th" AFAIU).
>>
>> The core English alphabets are a-z then we have some extra alphabets
>> like the þ above. When we have a list containing _both_ a-z and þ, the
>> order produced by string-lessp is wrong.
>>
>> When I did as you suggested and broke the list into two -- a-z and þ --
>> and sorted the sub-list that only contained a-z with string-lessp, the
>> sorted sub-list was not in the right alphabetical order i.e., instead of
>> "a b c d ..." it was "a c b d ..."
>
> That's not what I see:
>
> (let ((letters '("a" "b" "r" "x" "z")))
> (sort letters 'string-lessp))
> => ("a" "b" "r" "x" "z")
>
> Please show an example where characters a-z are sorted by string-lessp
> in the wrong order.
I didn't mean literally that string-lessp produced the wrong list for
a-z; I tried to draw an analogy with a hypothetical scenario where a-z
sorting did not work with string-lessp. This hypothetical scenario is
the actual situation in the case of the Tamil consonants.
^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-02 9:33 ` Visuwesh
@ 2022-07-02 9:38 ` Eli Zaretskii
2022-07-02 10:31 ` Visuwesh
0 siblings, 1 reply; 42+ messages in thread
From: Eli Zaretskii @ 2022-07-02 9:38 UTC (permalink / raw)
To: Visuwesh; +Cc: 56323
> From: Visuwesh <visuweshm@gmail.com>
> Cc: 56323@debbugs.gnu.org
> Date: Sat, 02 Jul 2022 15:03:42 +0530
>
> > So, then please explain what should be the "correct" order within each
> > sub-list. Is the correct order within each sub-list in the ascending
> > order of the codepoint? If not, what is the correct order?
> >
>
> The correct order is not the ascending order of the codepoint, the
> correct order is
>
> க ங ச ஞ ட ண த ந ப ம ய ர ல வ ழ ள ற ன
>
> and their respective codepoints are
>
> 2965 2969 2970 2974 2975 2979 2980 2984 2986 2990 2991 2992 2994 2997 2996 2995 2993 2985
Why is this the correct order? Does it have any definition based on
some principles, not just on the above list?
^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-02 9:38 ` Eli Zaretskii
@ 2022-07-02 10:31 ` Visuwesh
2022-07-02 10:46 ` Eli Zaretskii
2022-07-02 11:05 ` समीर सिंह Sameer Singh
0 siblings, 2 replies; 42+ messages in thread
From: Visuwesh @ 2022-07-02 10:31 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 56323
[Sat Jul 02, 2022] Eli Zaretskii wrote:
>> > So, then please explain what should be the "correct" order within each
>> > sub-list. Is the correct order within each sub-list in the ascending
>> > order of the codepoint? If not, what is the correct order?
>> >
>>
>> The correct order is not the ascending order of the codepoint, the
>> correct order is
>>
>> க ங ச ஞ ட ண த ந ப ம ய ர ல வ ழ ள ற ன
>>
>> and their respective codepoints are
>>
>> 2965 2969 2970 2974 2975 2979 2980 2984 2986 2990 2991 2992 2994 2997 2996 2995 2993 2985
>
> Why is this the correct order? Does it have any definition based on
> some principles, not just on the above list?
I'm not sure if there is a principle behind it. Is there a principle
behind why a comes before b? Same thing, I suppose. But it does
raise my brow when I see them out of order, which is why I'm bothering to
sort them.
^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-02 10:31 ` Visuwesh
@ 2022-07-02 10:46 ` Eli Zaretskii
2022-07-02 12:08 ` Visuwesh
2022-07-02 11:05 ` समीर सिंह Sameer Singh
1 sibling, 1 reply; 42+ messages in thread
From: Eli Zaretskii @ 2022-07-02 10:46 UTC (permalink / raw)
To: Visuwesh; +Cc: 56323
> From: Visuwesh <visuweshm@gmail.com>
> Cc: 56323@debbugs.gnu.org
> Date: Sat, 02 Jul 2022 16:01:14 +0530
>
> [Sat Jul 02, 2022] Eli Zaretskii wrote:
>
> >> > So, then please explain what should be the "correct" order within each
> >> > sub-list. Is the correct order within each sub-list in the ascending
> >> > order of the codepoint? If not, what is the correct order?
> >> >
> >>
> >> The correct order is not the ascending order of the codepoint, the
> >> correct order is
> >>
> >> க ங ச ஞ ட ண த ந ப ம ய ர ல வ ழ ள ற ன
> >>
> >> and their respective codepoints are
> >>
> >> 2965 2969 2970 2974 2975 2979 2980 2984 2986 2990 2991 2992 2994 2997 2996 2995 2993 2985
> >
> > Why is this the correct order? Does it have any definition based on
> > some principles, not just on the above list?
>
> I'm not sure if there is a principle behind it. Is there a principle
> behind why a comes first after b?
Yes: the codepoint order. There's no question about ordering when
it's according to the codepoints. If you want some other order, then
you need to define the rules for the order you want.
Is the order in which you want to sort the characters for Tamil
accepted somewhere, or is it your own preference? If the former,
where can one read about that order?
There was also another part to your original question about sorting,
AFAIR: you wanted to sort syllables, not just single characters.
Assuming the sorting order of the single characters is established in
some way, what is left to determine how to order syllables?
^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-02 10:31 ` Visuwesh
2022-07-02 10:46 ` Eli Zaretskii
@ 2022-07-02 11:05 ` समीर सिंह Sameer Singh
2022-07-02 12:04 ` Visuwesh
2022-07-02 12:23 ` Eli Zaretskii
1 sibling, 2 replies; 42+ messages in thread
From: समीर सिंह Sameer Singh @ 2022-07-02 11:05 UTC (permalink / raw)
To: Visuwesh; +Cc: Eli Zaretskii, 56323
[-- Attachment #1.1: Type: text/plain, Size: 1775 bytes --]
There is indeed a principle behind the ordering of letters in Indian
languages taken from Sanskrit, and AFAICT Tamil also follows it.
க் ங்
ச் ஞ்
ட் ண்
த் ந்
ப் ம்
If we look at it rowwise, the first row is the velar consonants, then the
palatal, then retroflex, then dental, then labial. If you notice here, we are
gradually moving from the back of the mouth to the front!
If we look at it columnwise the first column consists of unvoiced/voiced
consonants and the second column consists of nasals.
Then come the semivowels
ய் ர் ல் வ் ழ் ள்
After that
ற் ன்
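The grouping described above can be written down as data, and
flattening it reproduces the traditional order. This is a sketch only:
the variable and function names and the class label "remaining" are
hypothetical, and the letters are given without the pulli (் ) so that
they match the flat list quoted elsewhere in this thread.

```elisp
(defvar my-consonant-classes
  '((velar     "க" "ங")
    (palatal   "ச" "ஞ")
    (retroflex "ட" "ண")
    (dental    "த" "ந")
    (labial    "ப" "ம")
    (semivowel "ய" "ர" "ல" "வ" "ழ" "ள")
    (remaining "ற" "ன"))
  "Core consonants grouped by class, each class in traditional order.")

(defun my-flatten-classes (classes)
  "Return the consonants of CLASSES as one fresh flat list, keeping order."
  (mapcan (lambda (class) (copy-sequence (cdr class))) classes))

(my-flatten-classes my-consonant-classes)
```

Keeping the classes explicit documents why the order is what it is,
while the flattened list is what a sorting predicate would consult.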
On Sat, 2 Jul 2022, 4:02 pm, Visuwesh <visuweshm@gmail.com> wrote:
> [Sat Jul 02, 2022] Eli Zaretskii wrote:
>
> >> > So, then please explain what should be the "correct" order within each
> >> > sub-list. Is the correct order within each sub-list in the ascending
> >> > order of the codepoint? If not, what is the correct order?
> >> >
> >>
> >> The correct order is not the ascending order of the codepoint, the
> >> correct order is
> >>
> >> க ங ச ஞ ட ண த ந ப ம ய ர ல வ ழ ள ற ன
> >>
> >> and their respective codepoints are
> >>
> >> 2965 2969 2970 2974 2975 2979 2980 2984 2986 2990 2991 2992 2994 2997 2996 2995 2993 2985
> >
> > Why is this the correct order? Does it have any definition based on
> > some principles, not just on the above list?
>
> I'm not sure if there is a principle behind it. Is there a principle
> behind why a comes first after b? Same thing, I suppose. But it does
> raise my brow when I see them out of order which is why I'm bothering to
> sort them.
>
>
>
>
[-- Attachment #1.2: Type: text/html, Size: 2731 bytes --]
[-- Attachment #2: Screenshot_20220702-163431_Twitter.png --]
[-- Type: image/png, Size: 299713 bytes --]
^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-02 11:05 ` समीर सिंह Sameer Singh
@ 2022-07-02 12:04 ` Visuwesh
2022-07-02 12:23 ` Eli Zaretskii
1 sibling, 0 replies; 42+ messages in thread
From: Visuwesh @ 2022-07-02 12:04 UTC (permalink / raw)
To: समीर सिंह Sameer Singh
Cc: Eli Zaretskii, 56323
[Sat Jul 02, 2022] समीर सिंह Sameer Singh wrote:
> There is indeed a principle behind the ordering of letters in Indian
> languages taken from Sanskrit, and AFAICT Tamil also follows it.
>
> க் ங்
> ச் ஞ்
> ட் ண்
> த் ந்
> ப் ம்
>
> If we look at it rowwise, the first row is the velar consonants, then the
> palatal then retroflex then dental then labial. If you notice here, we are
> gradually moving from the back of the mouth to the front!
>
Aha! I never noticed this, thanks for this interesting info. It was
just an order for me just like A B C D ... etc.
> If we look at it columnwise the first column consists of unvoiced/voiced
> consonants and the second column consists of nasals.
>
> Then come the semivowels
> ய் ர் ல் வ் ழ் ள்
>
> After that
> ற் ன்
>
^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-02 10:46 ` Eli Zaretskii
@ 2022-07-02 12:08 ` Visuwesh
0 siblings, 0 replies; 42+ messages in thread
From: Visuwesh @ 2022-07-02 12:08 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 56323
[Sat Jul 02, 2022] Eli Zaretskii wrote:
>> I'm not sure if there is a principle behind it. Is there a principle
>> behind why a comes before b?
>
> Yes: the codepoint order.
I meant the order in the English language, not the codepoints.
> There's no question about ordering when it's according to the
> codepoints. If you want some other order, then you need to define the
> rules for the order you want.
>
> Is the order in which you want to sort the characters for Tamil
> accepted somewhere, or is it your own preference? If the former, where
> can one read about that order?
>
It is the order followed by everyone. See the table titled "Tamil
consonants" in this Wikipedia article:
https://en.wikipedia.org/wiki/Tamil_script#Letters. If you want details
about the order, they have probably not been translated into English. I
also skimmed through the Tamil Wikipedia and found nothing there.
> There was also another part to your original question about sorting,
> AFAIR: you wanted to sort syllables, not just single characters.
> Assuming the sorting order of the single characters is established in
> some way, what is left to determine how to order syllables?
The order of the syllables falls into place once we sort the consonants
and the vowels. Vowels can be sorted using string-lessp, so once we sort
the consonants, it is a simple matter of concatenation to produce the
table. (See my other email also.)
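To make the concatenation step concrete, here is a minimal sketch. It is
not the code from the patch: the function name `my-tamil-syllables' is
hypothetical, and it assumes the consonants are already in the desired
order and that each vowel sign is given as a (possibly empty) string.

```elisp
;; Hypothetical helper, not part of the patch.
(defun my-tamil-syllables (consonants vowel-signs)
  "Return a list of rows, one row per element of CONSONANTS.
Each row lists the strings obtained by appending every element of
VOWEL-SIGNS (a string, possibly empty) to the consonant."
  (mapcar (lambda (c)
            (mapcar (lambda (sign) (concat c sign)) vowel-signs))
          consonants))

;; Independent vowels sort correctly by codepoint, so `string-lessp'
;; is enough for them.
(defun my-tamil-sort-vowels (vowels)
  "Return a sorted copy of the list of strings VOWELS."
  (sort (copy-sequence vowels) #'string-lessp))
```

For example, calling `my-tamil-syllables' with the consonant க and the
signs "" and ா (U+0BBE) yields a single row holding க and கா.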
^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-01 16:09 ` Eli Zaretskii
2022-07-01 16:37 ` Visuwesh
@ 2022-07-02 12:15 ` Visuwesh
2022-07-03 3:57 ` Visuwesh
1 sibling, 1 reply; 42+ messages in thread
From: Visuwesh @ 2022-07-02 12:15 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 56323
[-- Attachment #1: Type: text/plain, Size: 1081 bytes --]
[Fri Jul 01, 2022] Eli Zaretskii wrote:
>> > Looks like simple misalignment to me, which should be cured by using
>> > pixel-resolution alignment features.
>>
>> Yep, it is misalignment. I could try to use those pixel-resolution
>> alignment features but I really don't think I can do a good enough job.
>> It is something I tried in the past but gave up since it was too complex
>> for me. The current code produces a Good Enough™ table and I think I
>> will just leave it unless Someone™ complains since after all, the
>> current situation is much better than what we have in Emacs 28 (the
>> docfix that happened as part of bug#50143 isn't in Emacs 28).
>
> I thought vtable.el was about solving such problems?
I tried to use vtable.el to produce the syllable table. There are two
problems:
. all the calculation done by vtable is slow (perhaps to no one's
surprise).
. the buffer becomes noticeably slow to scroll after the table is
inserted.
I've attached an elisp file of my current progress.
[-- Attachment #2: table.el --]
[-- Type: application/emacs-lisp, Size: 1552 bytes --]
[-- Attachment #3: Type: text/plain, Size: 149 bytes --]
When I commented out the make-vtable call and benchmarked it, it was
fast, so it is not the creation of the table data structure that is the
bottleneck.
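For reference, a comparison like this can be sketched with the
`benchmark-elapse' macro from benchmark.el. The function name
`my-compare-timings' and its two thunk arguments are hypothetical
stand-ins, not the contents of the attached table.el.

```elisp
(require 'benchmark)

;; Hypothetical helper: PREPARE and RENDER are zero-argument functions
;; standing in for "build the data" and "insert the table".
(defun my-compare-timings (prepare render)
  "Return (PREPARE-SECONDS . RENDER-SECONDS) for the two thunks."
  (cons (benchmark-elapse (funcall prepare))
        (benchmark-elapse (funcall render))))
```

If the first number is small and the second is large, the time goes
into rendering rather than into preparing the data.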
^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-02 11:05 ` समीर सिंह Sameer Singh
2022-07-02 12:04 ` Visuwesh
@ 2022-07-02 12:23 ` Eli Zaretskii
1 sibling, 0 replies; 42+ messages in thread
From: Eli Zaretskii @ 2022-07-02 12:23 UTC (permalink / raw)
To: समीर सिंह Sameer Singh
Cc: 56323, visuweshm
> From: समीर सिंह Sameer Singh <lumarzeli30@gmail.com>
> Date: Sat, 2 Jul 2022 16:35:51 +0530
> Cc: Eli Zaretskii <eliz@gnu.org>, 56323@debbugs.gnu.org
>
> There is indeed a principle behind the ordering of letters in Indian languages taken from Sanskrit, and AFAICT
> Tamil also follows it.
>
> க் ங்
> ச் ஞ்
> ட் ண்
> த் ந்
> ப் ம்
>
> If we look at it rowwise, the first row is the velar consonants, then the palatal then retroflex then dental then
> labial. If you notice here, we are gradually moving from the back of the mouth to the front!
>
> If we look at it columnwise the first column consists of unvoiced/voiced consonants and the second column
> consists of nasals.
>
> Then come the semivowels
> ய் ர் ல் வ் ழ் ள்
>
> After that
> ற் ன்
Thanks. If there's no existing property of characters that we could
use to produce this order, I guess we will need an alist of characters
and their ordinal numbers, and use that. Or, if the codepoints of
these characters are contiguous, we could have just the ordinal
numbers in the order of the codepoints, and use that in the function
passed as the PREDICATE argument to 'sort'.
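A minimal sketch of the alist approach, for illustration only. The names
`my-tamil-consonant-rank', `my-tamil-consonant-lessp' and
`my-tamil-sort-consonants' are hypothetical, and the alist covers just
the 18 consonants listed earlier in this thread.

```elisp
;; Hypothetical names; only `sort', `alist-get' and `copy-sequence'
;; are standard Emacs Lisp.
(defvar my-tamil-consonant-rank
  '((?க . 0) (?ங . 1) (?ச . 2) (?ஞ . 3) (?ட . 4) (?ண . 5)
    (?த . 6) (?ந . 7) (?ப . 8) (?ம . 9) (?ய . 10) (?ர . 11)
    (?ல . 12) (?வ . 13) (?ழ . 14) (?ள . 15) (?ற . 16) (?ன . 17))
  "Alist mapping each Tamil consonant to its ordinal number.")

(defun my-tamil-consonant-lessp (a b)
  "Return non-nil if consonant A sorts before consonant B.
Characters missing from the alist sort last."
  (< (alist-get a my-tamil-consonant-rank most-positive-fixnum)
     (alist-get b my-tamil-consonant-rank most-positive-fixnum)))

(defun my-tamil-sort-consonants (chars)
  "Return a copy of the list CHARS sorted in traditional Tamil order."
  (sort (copy-sequence chars) #'my-tamil-consonant-lessp))
```

With this predicate, ன (U+0BA9) sorts after ழ (U+0BB4) and வ (U+0BB5),
which is the opposite of what plain codepoint order gives.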
^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-02 12:15 ` Visuwesh
@ 2022-07-03 3:57 ` Visuwesh
0 siblings, 0 replies; 42+ messages in thread
From: Visuwesh @ 2022-07-03 3:57 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 56323
[Sat Jul 02, 2022] Visuwesh wrote:
> [Fri Jul 01, 2022] Eli Zaretskii wrote:
>
>>> > Looks like simple misalignment to me, which should be cured by using
>>> > pixel-resolution alignment features.
>>>
>>> Yep, it is misalignment. I could try to use those pixel-resolution
>>> alignment features but I really don't think I can do a good enough job.
>>> It is something I tried in the past but gave up since it was too complex
>>> for me. The current code produces a Good Enough™ table and I think I
>>> will just leave it unless Someone™ complains since after all, the
>>> current situation is much better than what we have in Emacs 28 (the
>>> docfix that happened as part of bug#50143 isn't in Emacs 28).
>>
>> I thought vtable.el was about solving such problems?
>
> I tried to use vtable.el to produce the syllable table. There are two
> problems:
>
> . all the calculation done by vtable is slow (perhaps to no one's
> surprise).
> . the buffer becomes noticeably slow to scroll after the table is
> inserted.
Stripping the text-properties keymap, vtable, vtable-column and
vtable-object from the buffer text improved the performance of scrolling
substantially but it is still kind of sluggish.
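Roughly, the stripping can be sketched as below. This is an illustrative
reconstruction, not the exact code I ran: the function name
`my-strip-vtable-properties' is hypothetical, and it uses the standard
`remove-list-of-text-properties'.

```elisp
;; Hypothetical helper; the property names are the ones listed above.
(defun my-strip-vtable-properties (&optional buffer)
  "Remove the vtable-related text properties from BUFFER.
BUFFER defaults to the current buffer.  Other properties, such as
`face' or `display', are left untouched."
  (with-current-buffer (or buffer (current-buffer))
    (let ((inhibit-read-only t))
      (remove-list-of-text-properties
       (point-min) (point-max)
       '(keymap vtable vtable-column vtable-object)))))
```

Note that after this the table is plain text, so the vtable commands
bound in its keymap no longer apply to it.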
> I've attached an elisp file of my current progress.
>
> When I commented out the make-vtable call and benchmarked it, it was
> fast so it is not the table data structure that is the bottleneck.
^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-02 9:28 ` Visuwesh
@ 2022-07-10 3:56 ` Visuwesh
2022-07-10 5:34 ` Eli Zaretskii
0 siblings, 1 reply; 42+ messages in thread
From: Visuwesh @ 2022-07-10 3:56 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 56323
[-- Attachment #1: Type: text/plain, Size: 179 bytes --]
[Sat Jul 02, 2022] Visuwesh wrote:
> I will post an updated patch later when I clean up the comments, and
> docstrings. Thanks.
Here's an updated patch.
[-- Attachment #2: 0001-Add-new-customisable-phonetic-Tamil-input-method.patch --]
[-- Type: text/x-diff, Size: 17818 bytes --]
From 85a06306c0e2ca27365cf213b26ed526ae7a0b76 Mon Sep 17 00:00:00 2001
From: Visuwesh <visuweshm@gmail.com>
Date: Sun, 10 Jul 2022 08:59:40 +0530
Subject: [PATCH] Add new customizable phonetic Tamil input method
* lisp/language/indian.el ("Tamil"): Change the default input method
of the Tamil language environment to the new input method.
* lisp/leim/quail/indian.el
(quail-tamil-itrans-compute-syllable-table): New function extracted
from...
(quail-tamil-itrans-syllable-table): ... here. Use the above
function.
(quail-tamil-itrans--consonant-order): Auxiliary variable for the
above function.
(quail-tamil-itrans-compute-signs-table): Add new VARIOUS argument.
(quail-tamil-itrans-various-signs-and-digits-table)
(quail-tamil-itrans-various-signs-table): Adjust call to the above
function.
("tamil"): Add new input method.
(tamil-input): New group for the input method.
(tamil-translation-rules): New defcustom for the input method to
change the translation rules.
(tamil--syllable-table, tamil--signs-table, tamil--hashtables)
(tamil--vowel-signs): Internal variables used by the input method.
(tamil--setter, tamil--make-tables)
(tamil--update-quail-rules): Internal functions for the input method.
(bug#56323)
* etc/NEWS: Announce the new input method.
---
etc/NEWS | 7 +
lisp/language/indian.el | 2 +-
lisp/leim/quail/indian.el | 305 +++++++++++++++++++++++++++++---------
3 files changed, 246 insertions(+), 68 deletions(-)
diff --git a/etc/NEWS b/etc/NEWS
index 02fe67129d..608b4d1110 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -1043,6 +1043,13 @@ supported.
Type 'C-u C-h t' to select it in case your language setup does not do
so automatically.
+---
+*** New default phonetic input method for the Tamil language environment.
+The default input method for the Tamil language environment is now
+"tamil" which is a customizable phonetic input method. To change the
+input method's translation rules, customize the user option
+'tamil-translation-rules'.
+
\f
* Changes in Specialized Modes and Packages in Emacs 29.1
diff --git a/lisp/language/indian.el b/lisp/language/indian.el
index 2887d410ad..91ad818533 100644
--- a/lisp/language/indian.el
+++ b/lisp/language/indian.el
@@ -109,7 +109,7 @@ 'devanagari
"Tamil" '((charset unicode)
(coding-system utf-8)
(coding-priority utf-8)
- (input-method . "tamil-itrans")
+ (input-method . "tamil")
(sample-text . "Tamil (தமிழ்) வணக்கம்")
(documentation . "\
South Indian Language Tamil is supported in this language environment."))
diff --git a/lisp/leim/quail/indian.el b/lisp/leim/quail/indian.el
index 04e95b0737..336bd4e55a 100644
--- a/lisp/leim/quail/indian.el
+++ b/lisp/leim/quail/indian.el
@@ -127,47 +127,30 @@ "\\''"
indian-mlm-itrans-v5-hash "malayalam-itrans" "Malayalam" "MlmIT"
"Malayalam transliteration by ITRANS method.")
-(defvar quail-tamil-itrans-syllable-table
- (let ((vowels
- '(("அ" nil "a")
- ("ஆ" "ா" "A")
- ("இ" "ி" "i")
- ("ஈ" "ீ" "I")
- ("உ" "ு" "u")
- ("ஊ" "ூ" "U")
- ("எ" "ெ" "e")
- ("ஏ" "ே" "E")
- ("ஐ" "ை" "ai")
- ("ஒ" "ொ" "o")
- ("ஓ" "ோ" "O")
- ("ஔ" "ௌ" "au")))
- (consonants
- '(("க" "k") ; U+0B95
- ("ங" "N^") ; U+0B99
- ("ச" "ch") ; U+0B9A
- ("ஞ" "JN") ; U+0B9E
- ("ட" "T") ; U+0B9F
- ("ண" "N") ; U+0BA3
- ("த" "t") ; U+0BA4
- ("ந" "n") ; U+0BA8
- ("ப" "p") ; U+0BAA
- ("ம" "m") ; U+0BAE
- ("ய" "y") ; U+0BAF
- ("ர" "r") ; U+0BB0
- ("ல" "l") ; U+0BB2
- ("வ" "v") ; U+0BB5
- ("ழ" "z") ; U+0BB4
- ("ள" "L") ; U+0BB3
- ("ற" "rh") ; U+0BB1
- ("ன" "nh") ; U+0BA9
- ("ஜ" "j") ; U+0B9C
- ("ஶ" nil) ; U+0BB6
- ("ஷ" "Sh") ; U+0BB7
- ("ஸ" "s") ; U+0BB8
- ("ஹ" "h") ; U+0BB9
- ("க்ஷ" "x" ) ; U+0B95
- ))
- (virama #x0BCD)
+;; This is needed since the Unicode codepoint order does not reflect
+;; the actual order in the Tamil language.
+(defvar quail-tamil-itrans--consonant-order
+ '(("க" . 0) ("ங" . 1) ("ச" . 2) ("ஞ" . 3) ("ட" . 4) ("ண" . 5)
+ ("த" . 6) ("ந" . 7) ("ப" . 8) ("ம" . 9) ("ய" . 10) ("ர" . 11)
+ ("ல" . 12) ("வ" . 13) ("ழ" . 14) ("ள" . 15) ("ற" . 16) ("ன" . 17)
+ ("ஜ" . 18) ("ஸ" . 19) ("ஷ" . 20) ("ஹ" . 21) ("க்ஷ" . 22)
+ ("க்ஷ" . 23) ("ஶ" . 24)))
+
+(defun quail-tamil-itrans-compute-syllable-table (vowels consonants)
+ "Return the syllable table for the input method as a string.
+VOWELS is a list of (VOWEL SIGN TRANS) where VOWEL is a string or
+character representing the Tamil vowel character, SIGN is the
+vowel sign corresponding to VOWEL or nil for none, and TRANS is
+the input sequence to insert VOWEL.
+CONSONANTS is a list of (CONSONANT TRANS...) where CONSONANT is
+the Tamil consonant character, and TRANS is one or more strings
+that describe how to insert CONSONANT."
+ (setq vowels (sort vowels (lambda (x y) (string-lessp (car x) (car y))))
+ consonants (sort consonants
+ (lambda (x y)
+ (< (or (assoc-default (car x) quail-tamil-itrans--consonant-order) 10000)
+ (or (assoc-default (car y) quail-tamil-itrans--consonant-order) 10000)))))
+ (let ((virama #x0BCD)
clm)
(with-temp-buffer
(insert "\n")
@@ -197,21 +180,45 @@ quail-tamil-itrans-syllable-table
(insert (propertize "\t" 'display (list 'space :align-to clm))
(car c) (or (nth 1 v) ""))
(setq clm (+ clm 6)))
- (insert "\n" (or (nth 1 c) "")
- (propertize "\t" 'display '(space :align-to 4))
- "|")
- (setq clm 6)
-
- (dolist (v vowels)
- (apply #'insert (propertize "\t" 'display (list 'space :align-to clm))
- (if (nth 1 c) (list (nth 1 c) (nth 2 v)) (list "")))
- (setq clm (+ clm 6))))
+ (dolist (ct (cdr c))
+ (insert "\n" (or ct "")
+ (propertize "\t" 'display '(space :align-to 4))
+ "|")
+ (setq clm 6)
+ (dolist (v vowels)
+ (apply #'insert (propertize "\t" 'display (list 'space :align-to clm))
+ (if ct (list ct (nth 2 v)) (list "")))
+ (setq clm (+ clm 6)))))
(insert "\n")
(insert "----+")
(insert-char ?- 74)
(insert "\n")
(buffer-string))))
+(defvar quail-tamil-itrans-syllable-table
+ (quail-tamil-itrans-compute-syllable-table
+ (let ((vowels (car indian-tml-base-table))
+ trans v ret)
+ (dotimes (i (length vowels))
+ (when (setq v (nth i vowels))
+ (when (characterp (car v))
+ (setcar v (string (car v))))
+ (setq trans (nth i (car indian-itrans-v5-table-for-tamil)))
+ (push (append v (list (if (listp trans) (car trans) trans)))
+ ret)))
+ ret)
+ (let ((consonants (cadr indian-tml-base-table))
+ trans c ret)
+ (dotimes (i (length consonants))
+ (when (setq c (nth i consonants))
+ (when (characterp c)
+ (setq c (string c)))
+ (setq trans (nth i (cadr indian-itrans-v5-table-for-tamil)))
+ (push (cons c (if (listp trans) trans (list trans)))
+ ret)))
+ (setq ret (nreverse ret))
+ ret)))
+
(defvar quail-tamil-itrans-numerics-and-symbols-table
(let ((numerics '((?௰ "பத்து") (?௱ "நூறு") (?௲ "ஆயிரம்")))
(symbols '((?௳ "நாள்") (?௴ "மாதம்") (?௵ "வருடம்")
@@ -244,25 +251,28 @@ quail-tamil-itrans-numerics-and-symbols-table
(insert "\n")
(buffer-string))))
-(defun quail-tamil-itrans-compute-signs-table (digitp)
+(defun quail-tamil-itrans-compute-signs-table (digitp various)
"Compute the signs table for the tamil-itrans input method.
-If DIGITP is non-nil, include the digits translation as well."
- (let ((various '((?ஃ . "H") ("ஸ்ரீ" . "srii") (?ௐ)))
- (digits "௦௧௨௩௪௫௬௭௮௯")
+If DIGITP is non-nil, include the digits translation as well.
+If VARIOUS is non-nil, then it should be a list of (CHAR TRANS)
+where CHAR is the character/string to translate and TRANS is
+CHAR's translation."
+ (let ((digits "௦௧௨௩௪௫௬௭௮௯")
(width 6) clm)
(with-temp-buffer
- (insert "\n" (make-string 18 ?-) "+")
- (when digitp (insert (make-string 60 ?-)))
+ (insert "\n" (make-string 18 ?-))
+ (when digitp
+ (insert "+" (make-string 60 ?-)))
(insert "\n")
(insert
(propertize "\t" 'display '(space :align-to 5)) "various"
- (propertize "\t" 'display '(space :align-to 18)) "|")
+ (propertize "\t" 'display '(space :align-to 18)))
(when digitp
(insert
- (propertize "\t" 'display '(space :align-to 45)) "digits"))
- (insert "\n" (make-string 18 ?-) "+")
+ "|" (propertize "\t" 'display '(space :align-to 45)) "digits"))
+ (insert "\n" (make-string 18 ?-))
(when digitp
- (insert (make-string 60 ?-)))
+ (insert "+" (make-string 60 ?-)))
(insert "\n")
(setq clm 0)
@@ -270,7 +280,8 @@ quail-tamil-itrans-compute-signs-table
(insert (propertize "\t" 'display (list 'space :align-to clm))
(car (nth i various)))
(setq clm (+ clm width)))
- (insert (propertize "\t" 'display '(space :align-to 18)) "|")
+ (when digitp
+ (insert (propertize "\t" 'display '(space :align-to 18)) "|"))
(setq clm 20)
(when digitp
(dotimes (i 10)
@@ -281,25 +292,28 @@ quail-tamil-itrans-compute-signs-table
(setq clm 0)
(dotimes (i (length various))
(insert (propertize "\t" 'display (list 'space :align-to clm))
- (or (cdr (nth i various)) ""))
+ (or (cadr (nth i various)) ""))
(setq clm (+ clm width)))
- (insert (propertize "\t" 'display '(space :align-to 18)) "|")
+ (when digitp
+ (insert (propertize "\t" 'display '(space :align-to 18)) "|"))
(setq clm 20)
(when digitp
(dotimes (i 10)
(insert (propertize "\t" 'display (list 'space :align-to clm))
(format "%d" i))
(setq clm (+ clm width))))
- (insert "\n" (make-string 18 ?-) "+")
+ (insert "\n" (make-string 18 ?-))
(when digitp
- (insert (make-string 60 ?-) "\n"))
+ (insert "+" (make-string 60 ?-) "\n"))
(buffer-string))))
(defvar quail-tamil-itrans-various-signs-and-digits-table
- (quail-tamil-itrans-compute-signs-table t))
+ (quail-tamil-itrans-compute-signs-table
+ t '((?ஃ "H") ("ஸ்ரீ" "srii") (?ௐ "OM"))))
(defvar quail-tamil-itrans-various-signs-table
- (quail-tamil-itrans-compute-signs-table nil))
+ (quail-tamil-itrans-compute-signs-table
+ nil '((?ஃ "H") ("ஸ்ரீ" "srii") (?ௐ "OM"))))
(if nil
(quail-define-package "tamil-itrans" "Tamil" "TmlIT" t "Tamil ITRANS"))
@@ -347,6 +361,163 @@ quail-tamil-itrans-various-signs-table
Full key sequences are listed below:")
+;;;
+;;; Tamil phonetic input method
+;;;
+
+;; Define the input method straightaway.
+(quail-define-package "tamil" "Tamil" "ழ" t
+ "Customisable Tamil phonetic input method.
+To change the translation rules of the input method, customize
+`tamil-translation-rules'.
+
+To use native Tamil digits, customize `tamil-translation-rules'
+accordingly.
+
+To end the current translation process, say \\<quail-translation-keymap>\\[quail-select-current] (defined in
+`quail-translation-keymap'). This is useful when there's a
+conflict between two possible translations.
+
+The current input scheme is:
+
+### Basic syllables (உயிர்மெய் எழுத்துக்கள்) ###
+\\=\\<tamil--syllable-table>
+
+### Miscellaneous ####
+\\=\\<tamil--signs-table>
+
+The following characters have NO input sequence associated with
+them by default. Their descriptions are included for easy
+reference.
+\\=\\<quail-tamil-itrans-numerics-and-symbols-table>
+
+Full key sequences are listed below:"
+ nil nil nil nil nil nil t)
+
+(defvar tamil--syllable-table nil)
+(defvar tamil--signs-table nil)
+(defvar tamil--hashtables
+ (cons (make-hash-table :test #'equal)
+ (make-hash-table :test #'equal)))
+(defvar tamil--vowel-signs
+ '(("அ" . t) ("ஆ" . ?ா) ("இ" . ?ி) ("ஈ" . ?ீ)
+ ("உ" . ?ு) ("ஊ" . ?ூ) ("எ" . ?ெ) ("ஏ" . ?ே)
+ ("ஐ" . ?ை) ("ஒ" . ?ொ) ("ஓ" . ?ோ) ("ஔ" . ?ௌ)))
+
+(defun tamil--setter (sym val)
+ (set-default sym val)
+ (tamil--update-quail-rules val))
+
+(defun tamil--make-tables (rules)
+ (let (v v-table v-trans
+ c-table c-trans
+ m-table m-trans)
+ (dolist (ch rules)
+ (cond
+ ;; Vowel.
+ ((setq v (assoc-default (car ch) tamil--vowel-signs))
+ (push (list (car ch) (and (characterp v) v)) v-table)
+ (push (cdr ch) v-trans))
+ ;; Consonant. It needs to end with pulli.
+ ((string-suffix-p "்" (car ch))
+ ;; Strip the pulli now.
+ (push (substring (car ch) 0 -1) c-table)
+ (push (cdr ch) c-trans))
+ ;; If nothing else, then consider it a misc character.
+ (t (push (car ch) m-table)
+ (push (cdr ch) m-trans))))
+ (list v-table v-trans c-table c-trans m-table m-trans)))
+
+(defun tamil--update-quail-rules (rules &optional name)
+ ;; This function does pretty much what `indian-make-hash' does
+ ;; except that we don't try to copy the structure of
+ ;; `indian-tml-base-table' which leads to less code hassle.
+ (let* ((quail-current-package (assoc (or name "tamil") quail-package-alist))
+ (tables (tamil--make-tables rules))
+ (v (nth 0 tables))
+ (v-trans (nth 1 tables))
+ (c (nth 2 tables))
+ (c-trans (nth 3 tables))
+ (m (nth 4 tables))
+ (m-trans (nth 5 tables))
+ (pulli (string #x0BCD)))
+ (clrhash (car tamil--hashtables))
+ (clrhash (cdr tamil--hashtables))
+ (indian--puthash-v v v-trans tamil--hashtables)
+ (indian--puthash-c c c-trans pulli tamil--hashtables)
+ (indian--puthash-cv c c-trans v v-trans tamil--hashtables)
+ (indian--puthash-m m m-trans tamil--hashtables)
+ ;; Now override the current translation rules.
+ ;; Empty quail map is '(list nil)'.
+ (setf (nth 2 quail-current-package) '(nil))
+ (maphash (lambda (k v)
+ (quail-defrule k (if (length= v 1)
+ (string-to-char v)
+ (vector v))))
+ (cdr tamil--hashtables))
+ (setq tamil--syllable-table
+ (quail-tamil-itrans-compute-syllable-table
+ (mapcar (lambda (ch) (append ch (pop v-trans))) v)
+ (mapcar (lambda (ch) (cons ch (pop c-trans))) c))
+ tamil--signs-table
+ (quail-tamil-itrans-compute-signs-table
+ nil
+ (append (mapcar (lambda (ch) (cons ch (pop m-trans))) m)
+ (and (gethash "ஸ்" (car tamil--hashtables))
+ `(("ஸ்ரீ" ,(concat (gethash "ஸ்" (car tamil--hashtables))
+ (gethash "ரீ" (car tamil--hashtables)))))))))))
+
+(defgroup tamil-input nil
+ "Translation rules for the Tamil input method."
+ :prefix "tamil-"
+ :group 'leim)
+
+(defcustom tamil-translation-rules
+ ;; Vowels.
+ '(("அ" "a") ("ஆ" "aa") ("இ" "i") ("ஈ" "ii")
+ ("உ" "u") ("ஊ" "uu") ("எ" "e") ("ஏ" "ee")
+ ("ஐ" "ai") ("ஒ" "o") ("ஓ" "oo") ("ஔ" "au" "ow")
+
+ ;; Consonants.
+ ("க்" "k" "g") ("ங்" "ng") ("ச்" "ch" "s") ("ஞ்" "nj") ("ட்" "t" "d")
+ ("ண்" "N") ("த்" "th" "dh") ("ந்" "nh") ("ப்" "p" "b") ("ம்" "m")
+ ("ய்" "y") ("ர்" "r") ("ல்" "l") ("வ்" "v") ("ழ்" "z" "zh")
+ ("ள்" "L") ("ற்" "rh") ("ன்" "n")
+ ;; Sanskrit.
+ ("ஜ்" "j") ("ஸ்" "S") ("ஷ்" "sh") ("ஹ்" "h")
+ ("க்ஷ்" "ksh") ("க்ஷ்" "ksH") ("ஶ்" "Z")
+
+ ;; Misc. ஃ is neither a consonant nor a vowel.
+ ("ஃ" "F" "q")
+ ("ௐ" "OM"))
+ "List of input sequences to translate to Tamil characters.
+Each element should be (CHARACTER . TRANSLATIONS) where CHARACTER
+is the Tamil character, and TRANSLATIONS is a list of input
+sequences to translate to that character.
+
+CHARACTER is considered as a consonant (மெய் எழுத்து) if it ends
+with a pulli.
+CHARACTER is that is neither a vowel nor a consonant are
+considered as \"miscellaneous\" characters and are inserted as
+is.
+
+The input sequence for consonant+vowel pairs (உயிர்மெய் எழுத்துக்கள்)
+is the input sequence for the consonant followed by the
+corresponding vowel."
+ :group 'tamil-input
+ :type '(alist :key-type string :value-type (repeat string))
+ :set #'tamil--setter
+ :options
+ (delq nil
+ (append (mapcar #'car tamil--vowel-signs)
+ (mapcar (lambda (x) (if (characterp x)
+ (string x #x0BCD)
+ (and x (concat x "்"))))
+ (nth 1 indian-tml-base-table))
+ '("ஃ" "ௐ")
+ ;; Digits.
+ (mapcar #'string (nth 3 indian-tml-base-digits-table)))))
+
;;;
;;; Input by Inscript
;;;
--
2.35.1
[-- Attachment #3: Type: text/plain, Size: 196 bytes --]
I don't use vtable since it is too slow. :(
[ Also, I don't see the customization group until I load
lisp/leim/quail/indian.el? But AFAICT, that's not the case for other
custom groups. ]
^ permalink raw reply related [flat|nested] 42+ messages in thread
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-10 3:56 ` Visuwesh
@ 2022-07-10 5:34 ` Eli Zaretskii
2022-07-10 6:42 ` Visuwesh
0 siblings, 1 reply; 42+ messages in thread
From: Eli Zaretskii @ 2022-07-10 5:34 UTC (permalink / raw)
To: Visuwesh; +Cc: 56323
> From: Visuwesh <visuweshm@gmail.com>
> Cc: 56323@debbugs.gnu.org
> Date: Sun, 10 Jul 2022 09:26:39 +0530
>
> > I will post an updated patch later when I clean up the comments, and
> > docstrings. Thanks.
>
> Here's an updated patch.
Thanks.
> +---
> +*** New default phonetic input method for the Tamil language environment.
> +The default input method for the Tamil language environment is now
> +"tamil" which is a customizable phonetic input method. To change the
> +input method's translation rules, customize the user option
> +'tamil-translation-rules'.
> +
>
> * Changes in Specialized Modes and Packages in Emacs 29.1
>
> diff --git a/lisp/language/indian.el b/lisp/language/indian.el
> index 2887d410ad..91ad818533 100644
> --- a/lisp/language/indian.el
> +++ b/lisp/language/indian.el
> @@ -109,7 +109,7 @@ 'devanagari
> "Tamil" '((charset unicode)
> (coding-system utf-8)
> (coding-priority utf-8)
> - (input-method . "tamil-itrans")
> + (input-method . "tamil")
> (sample-text . "Tamil (தமிழ்) வணக்கம்")
> (documentation . "\
Please name the new input method "tamil-phonetic", not just "tamil",
so that users who type "C-u C-\ tamil TAB" could have some means of
making the decision which one to choose.
> +;; This is needed since the Unicode codepoint order does not reflect
> +;; the actual order in the Tamil language.
> +(defvar quail-tamil-itrans--consonant-order
> + '(("க" . 0) ("ங" . 1) ("ச" . 2) ("ஞ" . 3) ("ட" . 4) ("ண" . 5)
> + ("த" . 6) ("ந" . 7) ("ப" . 8) ("ம" . 9) ("ய" . 10) ("ர" . 11)
> + ("ல" . 12) ("வ" . 13) ("ழ" . 14) ("ள" . 15) ("ற" . 16) ("ன" . 17)
> + ("ஜ" . 18) ("ஸ" . 19) ("ஷ" . 20) ("ஹ" . 21) ("க்ஷ" . 22)
> + ("க்ஷ" . 23) ("ஶ" . 24)))
Since the characters are ordered in the correct order, I wonder why we
need the explicit ordinal numbers here: they are determined by the
index of the character in the list.
> +(defun quail-tamil-itrans-compute-syllable-table (vowels consonants)
> + "Return the syllable table for the input method as a string.
> +VOWELS is a list of (VOWEL SIGN TRANS) where VOWEL is a string or
> +character representing the Tamil vowel character, SIGN is the
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
What does it mean "character representing ... character"? Can you
clarify this confusing part of the doc string?
> +vowel sign corresponding to VOWEL or nil for none,
Likewise here: "vowel corresponding to VOWEL"?
> and TRANS is
> +the input sequence to insert VOWEL.
The input sequence is generally a sequence of ASCII characters, is
that right? If so, I think telling that would make the documentation
more clear. Also, TRANS is a peculiar name for something described as
"input sequence", so maybe rename it to INPUT-SEQ?
> +CONSONANTS is a list of (CONSONANT TRANS...) where CONSONANT is
> +the Tamil consonant character, and TRANS is one or more strings
> +that describe how to insert CONSONANT."
Same here regarding TRANS and its description.
> + (setq vowels (sort vowels (lambda (x y) (string-lessp (car x) (car y))))
> + consonants (sort consonants
> + (lambda (x y)
> + (< (or (assoc-default (car x) quail-tamil-itrans--consonant-order) 10000)
> + (or (assoc-default (car y) quail-tamil-itrans--consonant-order) 10000)))))
Can you wrap these long lines, so that they would be easier to read?
> + (let ((digits "௦௧௨௩௪௫௬௭௮௯")
> (width 6) clm)
> (with-temp-buffer
> - (insert "\n" (make-string 18 ?-) "+")
> - (when digitp (insert (make-string 60 ?-)))
> + (insert "\n" (make-string 18 ?-))
> + (when digitp
> + (insert "+" (make-string 60 ?-)))
> (insert "\n")
> (insert
> (propertize "\t" 'display '(space :align-to 5)) "various"
> - (propertize "\t" 'display '(space :align-to 18)) "|")
> + (propertize "\t" 'display '(space :align-to 18)))
> (when digitp
> (insert
> - (propertize "\t" 'display '(space :align-to 45)) "digits"))
> - (insert "\n" (make-string 18 ?-) "+")
> + "|" (propertize "\t" 'display '(space :align-to 45)) "digits"))
> + (insert "\n" (make-string 18 ?-))
Did you test those :align-to specs when display-line-numbers is in
use?
> +;;;
> +;;; Tamil phonetic input method
> +;;;
> +
> +;; Define the input method straightaway.
> +(quail-define-package "tamil" "Tamil" "ழ" t
> + "Customisable Tamil phonetic input method.
See above regarding the name of the input method.
> + ;; Consonants.
> + ("க்" "k" "g") ("ங்" "ng") ("ச்" "ch" "s") ("ஞ்" "nj") ("ட்" "t" "d")
> + ("ண்" "N") ("த்" "th" "dh") ("ந்" "nh") ("ப்" "p" "b") ("ம்" "m")
> + ("ய்" "y") ("ர்" "r") ("ல்" "l") ("வ்" "v") ("ழ்" "z" "zh")
> + ("ள்" "L") ("ற்" "rh") ("ன்" "n")
> + ;; Sanskrit.
> + ("ஜ்" "j") ("ஸ்" "S") ("ஷ்" "sh") ("ஹ்" "h")
> + ("க்ஷ்" "ksh") ("க்ஷ்" "ksH") ("ஶ்" "Z")
> +
> + ;; Misc. ஃ is neither a consonant nor a vowel.
> + ("ஃ" "F" "q")
> + ("ௐ" "OM"))
> + "List of input sequences to translate to Tamil characters.
> +Each element should be (CHARACTER . TRANSLATIONS) where CHARACTER
The (CHARACTER . TRANSLATIONS) form seems to imply the elements are
cons cells, but the value itself uses lists. Suggest to say instead
Each element should be (CHARACTER TRANSLATIONS...)
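To illustrate the point about notation: in Lisp, a cons whose cdr is a
list reads the same as a flat list, so the two spellings denote the same
object, but the flat form matches how the value is actually written. The
accessor names below are hypothetical and exist only for this sketch.

```elisp
;; Hypothetical accessors for one element of the rules list.
(defun my-rule-character (rule)
  "Return the Tamil character string of RULE."
  (car rule))

(defun my-rule-input-sequences (rule)
  "Return the list of input sequences of RULE."
  (cdr rule))

;; Both notations read as the same list.
(equal '("க்" . ("k" "g")) '("க்" "k" "g"))
```

So `my-rule-input-sequences' applied to ("க்" "k" "g") gives the list
("k" "g"), with no extra level of nesting.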
> +is the Tamil character, and TRANSLATIONS is a list of input
> +sequences to translate to that character.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
"sequences which produce that character" is better. And I suggest to
use INPUT-SEQUENCES here, not TRANSLATIONS, for the reason explained
above.
> +CHARACTER is considered as a consonant (மெய் எழுத்து) if it ends
> +with a pulli.
What is a "pulli"? It is not a character name AFAICT.
> +CHARACTER is that is neither a vowel nor a consonant are
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Typo and/or redundant words here.
> +considered as \"miscellaneous\" characters and are inserted as
> +is.
Not sure what this wants to say: the fact that characters are inserted
in some way seems to be unrelated to the description of the value.
What is this about?
> +The input sequence for consonant+vowel pairs (உயிர்மெய் எழுத்துக்கள்)
> +is the input sequence for the consonant followed by the
> +corresponding vowel."
Isn't that obvious? If not, the non-obvious part(s) should be
mentioned explicitly.
> + :group 'tamil-input
> + :type '(alist :key-type string :value-type (repeat string))
> + :set #'tamil--setter
> + :options
This defcustom lacks the :version tag.
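For illustration, a sketch of what adding the tag looks like. The option
and group names here are hypothetical, the default value is abbreviated,
and "29.1" is assumed from the NEWS section the patch touches.

```elisp
;; Hypothetical option, used only to show the :version keyword.
(defgroup my-tamil-demo nil
  "Demonstration group for a Tamil input method option."
  :group 'leim)

(defcustom my-tamil-demo-rules
  '(("அ" "a") ("ஆ" "aa"))
  "List of (CHARACTER INPUT-SEQUENCES...) used for demonstration."
  :group 'my-tamil-demo
  :type '(alist :key-type string :value-type (repeat string))
  :version "29.1")
```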
> [ Also, I don't see the customization group until I load
> lisp/leim/quail/indian.el? But AFAICT, that's not the case for other
> custom groups. ]
There are no defcustoms in leim/quail/ files. How about moving the
defcustom to lisp/language/indian.el?
^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-10 5:34 ` Eli Zaretskii
@ 2022-07-10 6:42 ` Visuwesh
2022-07-10 7:32 ` Visuwesh
0 siblings, 1 reply; 42+ messages in thread
From: Visuwesh @ 2022-07-10 6:42 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 56323
[-- Attachment #1: Type: text/plain, Size: 7192 bytes --]
[Sun Jul 10, 2022] Eli Zaretskii wrote:
> Please name the new input method "tamil-phonetic", not just "tamil",
> so that users who type "C-u C-\ tamil TAB" could have some means of
> making the decision which one to choose.
Done.
>> +;; This is needed since the Unicode codepoint order does not reflect
>> +;; the actual order in the Tamil language.
>> +(defvar quail-tamil-itrans--consonant-order
>> + '(("க" . 0) ("ங" . 1) ("ச" . 2) ("ஞ" . 3) ("ட" . 4) ("ண" . 5)
>> + ("த" . 6) ("ந" . 7) ("ப" . 8) ("ம" . 9) ("ய" . 10) ("ர" . 11)
>> + ("ல" . 12) ("வ" . 13) ("ழ" . 14) ("ள" . 15) ("ற" . 16) ("ன" . 17)
>> + ("ஜ" . 18) ("ஸ" . 19) ("ஷ" . 20) ("ஹ" . 21) ("க்ஷ" . 22)
>> + ("க்ஷ" . 23) ("ஶ" . 24)))
>
> Since the characters are ordered in the correct order, I wonder why we
> need the explicit ordinal numbers here: they are determined by the
> index of the character in the list.
Ah yes, we could use seq-position, I forgot about that. Now done.
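The idea, as a minimal sketch rather than the exact code in the updated
patch (the names `my-tamil-order-list' and `my-tamil-entry-lessp' are
hypothetical, and the list is limited to the 18 native consonants):

```elisp
(require 'seq)

;; Hypothetical names; the position in this list is the sort key.
(defvar my-tamil-order-list
  '("க" "ங" "ச" "ஞ" "ட" "ண" "த" "ந" "ப" "ம"
    "ய" "ர" "ல" "வ" "ழ" "ள" "ற" "ன")
  "Tamil consonants in their traditional order.")

(defun my-tamil-entry-lessp (x y)
  "Return non-nil if entry X sorts before entry Y.
X and Y are lists whose car is a consonant string.  Consonants
missing from `my-tamil-order-list' sort last."
  (< (or (seq-position my-tamil-order-list (car x)) 10000)
     (or (seq-position my-tamil-order-list (car y)) 10000)))
```

It can be passed directly as the PREDICATE argument of `sort', for
example to order entries such as ("ன" "n") and ("க" "k" "g").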
>> +(defun quail-tamil-itrans-compute-syllable-table (vowels consonants)
>> + "Return the syllable table for the input method as a string.
>> +VOWELS is a list of (VOWEL SIGN TRANS) where VOWEL is a string or
>> +character representing the Tamil vowel character, SIGN is the
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> What does it mean "character representing ... character"? Can you
> clarify this confusing part of the doc string?
I meant that VOWEL can be either a string or a character. I have now
cut that part out, since I say no such thing for CONSONANT either.
>> +vowel sign corresponding to VOWEL or nil for none,
>
> Likewise here: "vowel corresponding to VOWEL"?
It should be vowel sign corresponding to VOWEL. I'm not sure how to
phrase it better, I borrowed the term "vowel sign" from the Unicode name
(e.g., name of ு a.k.a. #x0bc1).
>> and TRANS is
>> +the input sequence to insert VOWEL.
>
> The input sequence is generally a sequence of ASCII characters, is
> that right? If so, I think telling that would make the documentation
> more clear. Also, TRANS is a peculiar name for something described as
> "input sequence", so maybe rename it to INPUT-SEQ?
>
>> +CONSONANTS is a list of (CONSONANT TRANS...) where CONSONANT is
>> +the Tamil consonant character, and TRANS is one or more strings
>> +that describe how to insert CONSONANT."
>
> Same here regarding TRANS and its description.
Now done.
>> + (setq vowels (sort vowels (lambda (x y) (string-lessp (car x) (car y))))
>> + consonants (sort consonants
>> + (lambda (x y)
>> + (< (or (assoc-default (car x) quail-tamil-itrans--consonant-order) 10000)
>> + (or (assoc-default (car y) quail-tamil-itrans--consonant-order) 10000)))))
>
> Can you wrap these long lines, so that they would be easier to read?
I hope it is better now.
>> + (let ((digits "௦௧௨௩௪௫௬௭௮௯")
>> (width 6) clm)
>> (with-temp-buffer
>> - (insert "\n" (make-string 18 ?-) "+")
>> - (when digitp (insert (make-string 60 ?-)))
>> + (insert "\n" (make-string 18 ?-))
>> + (when digitp
>> + (insert "+" (make-string 60 ?-)))
>> (insert "\n")
>> (insert
>> (propertize "\t" 'display '(space :align-to 5)) "various"
>> - (propertize "\t" 'display '(space :align-to 18)) "|")
>> + (propertize "\t" 'display '(space :align-to 18)))
>> (when digitp
>> (insert
>> - (propertize "\t" 'display '(space :align-to 45)) "digits"))
>> - (insert "\n" (make-string 18 ?-) "+")
>> + "|" (propertize "\t" 'display '(space :align-to 45)) "digits"))
>> + (insert "\n" (make-string 18 ?-))
>
> Did you test those :align-to specs when display-line-numbers is in
> use?
Seems to work fine from a short test on my side.
>> +;;;
>> +;;; Tamil phonetic input method
>> +;;;
>> +
>> +;; Define the input method straightaway.
>> +(quail-define-package "tamil" "Tamil" "ழ" t
>> + "Customisable Tamil phonetic input method.
>
> See above regarding the name of the input method.
Done.
>> + ;; Consonants.
>> + ("க்" "k" "g") ("ங்" "ng") ("ச்" "ch" "s") ("ஞ்" "nj") ("ட்" "t" "d")
>> + ("ண்" "N") ("த்" "th" "dh") ("ந்" "nh") ("ப்" "p" "b") ("ம்" "m")
>> + ("ய்" "y") ("ர்" "r") ("ல்" "l") ("வ்" "v") ("ழ்" "z" "zh")
>> + ("ள்" "L") ("ற்" "rh") ("ன்" "n")
>> + ;; Sanskrit.
>> + ("ஜ்" "j") ("ஸ்" "S") ("ஷ்" "sh") ("ஹ்" "h")
>> + ("க்ஷ்" "ksh") ("க்ஷ்" "ksH") ("ஶ்" "Z")
>> +
>> + ;; Misc. ஃ is neither a consonant nor a vowel.
>> + ("ஃ" "F" "q")
>> + ("ௐ" "OM"))
>> + "List of input sequences to translate to Tamil characters.
>> +Each element should be (CHARACTER . TRANSLATIONS) where CHARACTER
>
> The (CHARACTER . TRANSLATIONS) form seems to imply the elements are
> cons cells, but the value itself uses lists. Suggest to say instead
>
> Each element should be (CHARACTER TRANSLATIONS...)
>
Done.
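To spell out the distinction, a small sketch (my-rule is a
hypothetical name): each rule is a proper list whose car is the
character and whose cdr is the list of input sequences. The dotted
form with a list in the cdr reads as the same object, which is why the
(CHARACTER INPUT-SEQUENCES...) notation describes the value more
directly.

```elisp
;; One rule as it appears in the defcustom value: a proper list.
(defvar my-rule '("க்" "k" "g"))

(car my-rule)   ; → "க்"       the CHARACTER
(cdr my-rule)   ; → ("k" "g")  the input sequences

;; Dotted notation with a list in the cdr is the very same structure.
(equal my-rule '("க்" . ("k" "g")))   ; → t
```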
>> +is the Tamil character, and TRANSLATIONS is a list of input
>> +sequences to translate to that character.
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> "sequences which produce that character" is better. And I suggest to
> use INPUT-SEQUENCES here, not TRANSLATIONS, for the reason explained
> above.
>
Done.
>> +CHARACTER is considered as a consonant (மெய் எழுத்து) if it ends
>> +with a pulli.
>
> What is a "pulli"? It is not a character name AFAICT.
>
It is the Tamil name for virama. I use pulli rather than virama since I
don't think any Tamil reader would know the term virama. But I have now
put virama in brackets for future maintainers.
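For future maintainers, a minimal sketch of the check this implies.
The helper names below are hypothetical; the patch itself does the
equivalent inline with string-suffix-p and substring.

```elisp
(defconst my-pulli (string #x0BCD)
  "U+0BCD TAMIL SIGN VIRAMA, called pulli in Tamil.")

(defun my-consonant-p (str)
  "Return non-nil if STR ends with a pulli."
  (string-suffix-p my-pulli str))

(defun my-strip-pulli (str)
  "Return STR without its final pulli, or STR unchanged if it has none."
  (if (my-consonant-p str)
      (substring str 0 -1)
    str))

(my-consonant-p "க்")   ; → t
(my-consonant-p "அ")    ; → nil
(my-strip-pulli "க்")   ; → "க"
```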
>> +CHARACTER is that is neither a vowel nor a consonant are
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> Typo and/or redundant words here.
>
Fixed, thanks.
>> +considered as \"miscellaneous\" characters and are inserted as
>> +is.
>
> Not sure what this wants to say: the fact that characters are inserted
> in some way seems to be unrelated to the description of the value.
> What is this about?
I tried to allude to the miscellaneous section in the docstring but I
don't think it is really necessary. Now removed.
>> +The input sequence for consonant+vowel pairs (உயிர்மெய் எழுத்துக்கள்)
>> +is the input sequence for the consonant followed by the
>> +corresponding vowel."
>
> Isn't that obvious? If not, the non-obvious part(s) should be
> mentioned explicitly.
Thinking twice, yes, it should be obvious. I removed this part.
>> + :group 'tamil-input
>> + :type '(alist :key-type string :value-type (repeat string))
>> + :set #'tamil--setter
>> + :options
>
> This defcustom lacks the :version tag.
>
Oops, now fixed.
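For reference, the tag in question looks like this on a throwaway
option (my-tamil-rules is a hypothetical name used only for this
sketch):

```elisp
(defcustom my-tamil-rules '(("அ" "a"))
  "Hypothetical user option, for illustration only."
  :type '(alist :key-type string :value-type (repeat string))
  :group 'leim
  ;; Records the Emacs release in which the option was introduced.
  :version "29.1")

(get 'my-tamil-rules 'custom-version)   ; → "29.1"
```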
Updated patch attached.
[-- Attachment #2: 0001-Add-new-customisable-phonetic-Tamil-input-method.patch --]
[-- Type: text/x-diff, Size: 17467 bytes --]
From bb9105e32b9ea992ade4ce4b15db89a5b56ba630 Mon Sep 17 00:00:00 2001
From: Visuwesh <visuweshm@gmail.com>
Date: Sun, 10 Jul 2022 08:59:40 +0530
Subject: [PATCH] Add new customizable phonetic Tamil input method
* lisp/language/indian.el ("Tamil"): Change the default input method
of the Tamil language environment to the new input method.
* lisp/leim/quail/indian.el
(quail-tamil-itrans-compute-syllable-table): New function extracted
from...
(quail-tamil-itrans-syllable-table): ... here. Use the above
function.
(quail-tamil-itrans--consonant-order): Auxiliary variable for the
above function.
(quail-tamil-itrans-compute-signs-table): Add new VARIOUS argument.
(quail-tamil-itrans-various-signs-and-digits-table)
(quail-tamil-itrans-various-signs-table): Adjust call to the above
function.
("tamil-phonetic"): Add new input method.
(tamil-input): New group for the input method.
(tamil-translation-rules): New defcustom for the input method to
change the translation rules.
(tamil--syllable-table, tamil--signs-table, tamil--hashtables)
(tamil--vowel-signs): Internal variables used by the input method.
(tamil--setter, tamil--make-tables)
(tamil--update-quail-rules): Internal functions for the input method.
(bug#56323)
* etc/NEWS: Announce the new input method.
---
etc/NEWS | 7 +
lisp/language/indian.el | 2 +-
lisp/leim/quail/indian.el | 306 +++++++++++++++++++++++++++++---------
3 files changed, 247 insertions(+), 68 deletions(-)
diff --git a/etc/NEWS b/etc/NEWS
index 02fe67129d..33a489e18a 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -1043,6 +1043,13 @@ supported.
Type 'C-u C-h t' to select it in case your language setup does not do
so automatically.
+---
+*** New default phonetic input method for the Tamil language environment.
+The default input method for the Tamil language environment is now
+"tamil-phonetic" which is a customizable phonetic input method. To
+change the input method's translation rules, customize the user option
+'tamil-translation-rules'.
+
\f
* Changes in Specialized Modes and Packages in Emacs 29.1
diff --git a/lisp/language/indian.el b/lisp/language/indian.el
index 2887d410ad..407173827f 100644
--- a/lisp/language/indian.el
+++ b/lisp/language/indian.el
@@ -109,7 +109,7 @@ 'devanagari
"Tamil" '((charset unicode)
(coding-system utf-8)
(coding-priority utf-8)
- (input-method . "tamil-itrans")
+ (input-method . "tamil-phonetic")
(sample-text . "Tamil (தமிழ்) வணக்கம்")
(documentation . "\
South Indian Language Tamil is supported in this language environment."))
diff --git a/lisp/leim/quail/indian.el b/lisp/leim/quail/indian.el
index 04e95b0737..b2f31d7469 100644
--- a/lisp/leim/quail/indian.el
+++ b/lisp/leim/quail/indian.el
@@ -127,47 +127,34 @@ "\\''"
indian-mlm-itrans-v5-hash "malayalam-itrans" "Malayalam" "MlmIT"
"Malayalam transliteration by ITRANS method.")
-(defvar quail-tamil-itrans-syllable-table
- (let ((vowels
- '(("அ" nil "a")
- ("ஆ" "ா" "A")
- ("இ" "ி" "i")
- ("ஈ" "ீ" "I")
- ("உ" "ு" "u")
- ("ஊ" "ூ" "U")
- ("எ" "ெ" "e")
- ("ஏ" "ே" "E")
- ("ஐ" "ை" "ai")
- ("ஒ" "ொ" "o")
- ("ஓ" "ோ" "O")
- ("ஔ" "ௌ" "au")))
- (consonants
- '(("க" "k") ; U+0B95
- ("ங" "N^") ; U+0B99
- ("ச" "ch") ; U+0B9A
- ("ஞ" "JN") ; U+0B9E
- ("ட" "T") ; U+0B9F
- ("ண" "N") ; U+0BA3
- ("த" "t") ; U+0BA4
- ("ந" "n") ; U+0BA8
- ("ப" "p") ; U+0BAA
- ("ம" "m") ; U+0BAE
- ("ய" "y") ; U+0BAF
- ("ர" "r") ; U+0BB0
- ("ல" "l") ; U+0BB2
- ("வ" "v") ; U+0BB5
- ("ழ" "z") ; U+0BB4
- ("ள" "L") ; U+0BB3
- ("ற" "rh") ; U+0BB1
- ("ன" "nh") ; U+0BA9
- ("ஜ" "j") ; U+0B9C
- ("ஶ" nil) ; U+0BB6
- ("ஷ" "Sh") ; U+0BB7
- ("ஸ" "s") ; U+0BB8
- ("ஹ" "h") ; U+0BB9
- ("க்ஷ" "x" ) ; U+0B95
- ))
- (virama #x0BCD)
+;; This is needed since the Unicode codepoint order does not reflect
+;; the actual order in the Tamil language.
+(defvar quail-tamil-itrans--consonant-order
+ '("க" "ங" "ச" "ஞ" "ட" "ண"
+ "த" "ந" "ப" "ம" "ய" "ர"
+ "ல" "வ" "ழ" "ள" "ற" "ன"
+ "ஜ" "ஸ" "ஷ" "ஹ" "க்ஷ"
+ "க்ஷ" "ஶ"))
+
+(defun quail-tamil-itrans-compute-syllable-table (vowels consonants)
+ "Return the syllable table for the input method as a string.
+VOWELS is a list of (VOWEL SIGN INPUT-SEQ) where VOWEL is the
+Tamil vowel character, SIGN is the vowel sign corresponding to
+that vowel character or nil for none, and INPUT-SEQ is the input
+sequence to insert VOWEL.
+
+CONSONANTS is a list of (CONSONANT INPUT-SEQ...) where CONSONANT
+is the Tamil consonant character, and INPUT-SEQ is one or more
+strings that describe how to insert CONSONANT."
+ (setq vowels (sort vowels
+ (lambda (x y)
+ (string-lessp (car x) (car y)))))
+ (setq consonants
+ (sort consonants
+ (lambda (x y)
+ (< (or (seq-position quail-tamil-itrans--consonant-order (car x)) 1000)
+ (or (seq-position quail-tamil-itrans--consonant-order (car y)) 1000)))))
+ (let ((virama #x0BCD)
clm)
(with-temp-buffer
(insert "\n")
@@ -197,21 +184,45 @@ quail-tamil-itrans-syllable-table
(insert (propertize "\t" 'display (list 'space :align-to clm))
(car c) (or (nth 1 v) ""))
(setq clm (+ clm 6)))
- (insert "\n" (or (nth 1 c) "")
- (propertize "\t" 'display '(space :align-to 4))
- "|")
- (setq clm 6)
-
- (dolist (v vowels)
- (apply #'insert (propertize "\t" 'display (list 'space :align-to clm))
- (if (nth 1 c) (list (nth 1 c) (nth 2 v)) (list "")))
- (setq clm (+ clm 6))))
+ (dolist (ct (cdr c))
+ (insert "\n" (or ct "")
+ (propertize "\t" 'display '(space :align-to 4))
+ "|")
+ (setq clm 6)
+ (dolist (v vowels)
+ (apply #'insert (propertize "\t" 'display (list 'space :align-to clm))
+ (if ct (list ct (nth 2 v)) (list "")))
+ (setq clm (+ clm 6)))))
(insert "\n")
(insert "----+")
(insert-char ?- 74)
(insert "\n")
(buffer-string))))
+(defvar quail-tamil-itrans-syllable-table
+ (quail-tamil-itrans-compute-syllable-table
+ (let ((vowels (car indian-tml-base-table))
+ trans v ret)
+ (dotimes (i (length vowels))
+ (when (setq v (nth i vowels))
+ (when (characterp (car v))
+ (setcar v (string (car v))))
+ (setq trans (nth i (car indian-itrans-v5-table-for-tamil)))
+ (push (append v (list (if (listp trans) (car trans) trans)))
+ ret)))
+ ret)
+ (let ((consonants (cadr indian-tml-base-table))
+ trans c ret)
+ (dotimes (i (length consonants))
+ (when (setq c (nth i consonants))
+ (when (characterp c)
+ (setq c (string c)))
+ (setq trans (nth i (cadr indian-itrans-v5-table-for-tamil)))
+ (push (cons c (if (listp trans) trans (list trans)))
+ ret)))
+ (setq ret (nreverse ret))
+ ret)))
+
(defvar quail-tamil-itrans-numerics-and-symbols-table
(let ((numerics '((?௰ "பத்து") (?௱ "நூறு") (?௲ "ஆயிரம்")))
(symbols '((?௳ "நாள்") (?௴ "மாதம்") (?௵ "வருடம்")
@@ -244,25 +255,28 @@ quail-tamil-itrans-numerics-and-symbols-table
(insert "\n")
(buffer-string))))
-(defun quail-tamil-itrans-compute-signs-table (digitp)
+(defun quail-tamil-itrans-compute-signs-table (digitp various)
"Compute the signs table for the tamil-itrans input method.
-If DIGITP is non-nil, include the digits translation as well."
- (let ((various '((?ஃ . "H") ("ஸ்ரீ" . "srii") (?ௐ)))
- (digits "௦௧௨௩௪௫௬௭௮௯")
+If DIGITP is non-nil, include the digits translation as well.
+If VARIOUS is non-nil, then it should be a list of (CHAR TRANS)
+where CHAR is the character/string to translate and TRANS is
+CHAR's translation."
+ (let ((digits "௦௧௨௩௪௫௬௭௮௯")
(width 6) clm)
(with-temp-buffer
- (insert "\n" (make-string 18 ?-) "+")
- (when digitp (insert (make-string 60 ?-)))
+ (insert "\n" (make-string 18 ?-))
+ (when digitp
+ (insert "+" (make-string 60 ?-)))
(insert "\n")
(insert
(propertize "\t" 'display '(space :align-to 5)) "various"
- (propertize "\t" 'display '(space :align-to 18)) "|")
+ (propertize "\t" 'display '(space :align-to 18)))
(when digitp
(insert
- (propertize "\t" 'display '(space :align-to 45)) "digits"))
- (insert "\n" (make-string 18 ?-) "+")
+ "|" (propertize "\t" 'display '(space :align-to 45)) "digits"))
+ (insert "\n" (make-string 18 ?-))
(when digitp
- (insert (make-string 60 ?-)))
+ (insert "+" (make-string 60 ?-)))
(insert "\n")
(setq clm 0)
@@ -270,7 +284,8 @@ quail-tamil-itrans-compute-signs-table
(insert (propertize "\t" 'display (list 'space :align-to clm))
(car (nth i various)))
(setq clm (+ clm width)))
- (insert (propertize "\t" 'display '(space :align-to 18)) "|")
+ (when digitp
+ (insert (propertize "\t" 'display '(space :align-to 18)) "|"))
(setq clm 20)
(when digitp
(dotimes (i 10)
@@ -281,25 +296,28 @@ quail-tamil-itrans-compute-signs-table
(setq clm 0)
(dotimes (i (length various))
(insert (propertize "\t" 'display (list 'space :align-to clm))
- (or (cdr (nth i various)) ""))
+ (or (cadr (nth i various)) ""))
(setq clm (+ clm width)))
- (insert (propertize "\t" 'display '(space :align-to 18)) "|")
+ (when digitp
+ (insert (propertize "\t" 'display '(space :align-to 18)) "|"))
(setq clm 20)
(when digitp
(dotimes (i 10)
(insert (propertize "\t" 'display (list 'space :align-to clm))
(format "%d" i))
(setq clm (+ clm width))))
- (insert "\n" (make-string 18 ?-) "+")
+ (insert "\n" (make-string 18 ?-))
(when digitp
- (insert (make-string 60 ?-) "\n"))
+ (insert "+" (make-string 60 ?-) "\n"))
(buffer-string))))
(defvar quail-tamil-itrans-various-signs-and-digits-table
- (quail-tamil-itrans-compute-signs-table t))
+ (quail-tamil-itrans-compute-signs-table
+ t '((?ஃ "H") ("ஸ்ரீ" "srii") (?ௐ "OM"))))
(defvar quail-tamil-itrans-various-signs-table
- (quail-tamil-itrans-compute-signs-table nil))
+ (quail-tamil-itrans-compute-signs-table
+ nil '((?ஃ "H") ("ஸ்ரீ" "srii") (?ௐ "OM"))))
(if nil
(quail-define-package "tamil-itrans" "Tamil" "TmlIT" t "Tamil ITRANS"))
@@ -347,6 +365,160 @@ quail-tamil-itrans-various-signs-table
Full key sequences are listed below:")
+;;;
+;;; Tamil phonetic input method
+;;;
+
+;; Define the input method straightaway.
+(quail-define-package "tamil-phonetic" "Tamil" "ழ" t
+ "Customisable Tamil phonetic input method.
+To change the translation rules of the input method, customize
+`tamil-translation-rules'.
+
+To use native Tamil digits, customize `tamil-translation-rules'
+accordingly.
+
+To end the current translation process, say \\<quail-translation-keymap>\\[quail-select-current] (defined in
+`quail-translation-keymap'). This is useful when there's a
+conflict between two possible translations.
+
+The current input scheme is:
+
+### Basic syllables (உயிர்மெய் எழுத்துக்கள்) ###
+\\=\\<tamil--syllable-table>
+
+### Miscellaneous ###
+\\=\\<tamil--signs-table>
+
+The following characters have NO input sequence associated with
+them by default. Their descriptions are included for easy
+reference.
+\\=\\<quail-tamil-itrans-numerics-and-symbols-table>
+
+Full key sequences are listed below:"
+ nil nil nil nil nil nil t)
+
+(defvar tamil--syllable-table nil)
+(defvar tamil--signs-table nil)
+(defvar tamil--hashtables
+ (cons (make-hash-table :test #'equal)
+ (make-hash-table :test #'equal)))
+(defvar tamil--vowel-signs
+ '(("அ" . t) ("ஆ" . ?ா) ("இ" . ?ி) ("ஈ" . ?ீ)
+ ("உ" . ?ு) ("ஊ" . ?ூ) ("எ" . ?ெ) ("ஏ" . ?ே)
+ ("ஐ" . ?ை) ("ஒ" . ?ொ) ("ஓ" . ?ோ) ("ஔ" . ?ௌ)))
+
+(defun tamil--setter (sym val)
+ (set-default sym val)
+ (tamil--update-quail-rules val))
+
+(defun tamil--make-tables (rules)
+ (let (v v-table v-trans
+ c-table c-trans
+ m-table m-trans)
+ (dolist (ch rules)
+ (cond
+ ;; Vowel.
+ ((setq v (assoc-default (car ch) tamil--vowel-signs))
+ (push (list (car ch) (and (characterp v) v)) v-table)
+ (push (cdr ch) v-trans))
+ ;; Consonant. It needs to end with pulli.
+ ((string-suffix-p "்" (car ch))
+ ;; Strip the pulli now.
+ (push (substring (car ch) 0 -1) c-table)
+ (push (cdr ch) c-trans))
+ ;; If nothing else, then consider it a misc character.
+ (t (push (car ch) m-table)
+ (push (cdr ch) m-trans))))
+ (list v-table v-trans c-table c-trans m-table m-trans)))
+
+(defun tamil--update-quail-rules (rules &optional name)
+ ;; This function does pretty much what `indian-make-hash' does
+ ;; except that we don't try to copy the structure of
+ ;; `indian-tml-base-table' which leads to less code hassle.
+ (let* ((quail-current-package (assoc (or name "tamil-phonetic") quail-package-alist))
+ (tables (tamil--make-tables rules))
+ (v (nth 0 tables))
+ (v-trans (nth 1 tables))
+ (c (nth 2 tables))
+ (c-trans (nth 3 tables))
+ (m (nth 4 tables))
+ (m-trans (nth 5 tables))
+ (pulli (string #x0BCD)))
+ (clrhash (car tamil--hashtables))
+ (clrhash (cdr tamil--hashtables))
+ (indian--puthash-v v v-trans tamil--hashtables)
+ (indian--puthash-c c c-trans pulli tamil--hashtables)
+ (indian--puthash-cv c c-trans v v-trans tamil--hashtables)
+ (indian--puthash-m m m-trans tamil--hashtables)
+ ;; Now override the current translation rules.
+ ;; Empty quail map is '(list nil)'.
+ (setf (nth 2 quail-current-package) '(nil))
+ (maphash (lambda (k v)
+ (quail-defrule k (if (length= v 1)
+ (string-to-char v)
+ (vector v))))
+ (cdr tamil--hashtables))
+ (setq tamil--syllable-table
+ (quail-tamil-itrans-compute-syllable-table
+ (mapcar (lambda (ch) (append ch (pop v-trans))) v)
+ (mapcar (lambda (ch) (cons ch (pop c-trans))) c))
+ tamil--signs-table
+ (quail-tamil-itrans-compute-signs-table
+ nil
+ (append (mapcar (lambda (ch) (cons ch (pop m-trans))) m)
+ (and (gethash "ஸ்" (car tamil--hashtables))
+ `(("ஸ்ரீ" ,(concat (gethash "ஸ்" (car tamil--hashtables))
+ (gethash "ரீ" (car tamil--hashtables)))))))))))
+
+(defgroup tamil-input nil
+ "Translation rules for the Tamil input method."
+ :prefix "tamil-"
+ :group 'leim)
+
+(defcustom tamil-translation-rules
+ ;; Vowels.
+ '(("அ" "a") ("ஆ" "aa") ("இ" "i") ("ஈ" "ii")
+ ("உ" "u") ("ஊ" "uu") ("எ" "e") ("ஏ" "ee")
+ ("ஐ" "ai") ("ஒ" "o") ("ஓ" "oo") ("ஔ" "au" "ow")
+
+ ;; Consonants.
+ ("க்" "k" "g") ("ங்" "ng") ("ச்" "ch" "s") ("ஞ்" "nj") ("ட்" "t" "d")
+ ("ண்" "N") ("த்" "th" "dh") ("ந்" "nh") ("ப்" "p" "b") ("ம்" "m")
+ ("ய்" "y") ("ர்" "r") ("ல்" "l") ("வ்" "v") ("ழ்" "z" "zh")
+ ("ள்" "L") ("ற்" "rh") ("ன்" "n")
+ ;; Sanskrit.
+ ("ஜ்" "j") ("ஸ்" "S") ("ஷ்" "sh") ("ஹ்" "h")
+ ("க்ஷ்" "ksh") ("க்ஷ்" "ksH") ("ஶ்" "Z")
+
+ ;; Misc. ஃ is neither a consonant nor a vowel.
+ ("ஃ" "F" "q")
+ ("ௐ" "OM"))
+ "List of input sequences to translate to Tamil characters.
+Each element should be (CHARACTER INPUT-SEQ...) where CHARACTER
+is the Tamil character, and INPUT-SEQ is one or more input
+sequences which produce that character.
+
+CHARACTER is considered as a consonant (மெய் எழுத்து) if it ends
+with a pulli (virama).
+
+A CHARACTER that is neither a vowel nor a consonant is inserted as
+is."
+ :group 'tamil-input
+ :type '(alist :key-type string :value-type (repeat string))
+ :set #'tamil--setter
+ :version "29.1"
+ :options
+ (delq nil
+ (append (mapcar #'car tamil--vowel-signs)
+ (mapcar (lambda (x) (if (characterp x)
+ (string x #x0BCD)
+ (and x (concat x "்"))))
+ (nth 1 indian-tml-base-table))
+ '("ஃ" "ௐ")
+ ;; Digits.
+ (mapcar #'string (nth 3 indian-tml-base-digits-table)))))
+
;;;
;;; Input by Inscript
;;;
--
2.35.1
[-- Attachment #3: Type: text/plain, Size: 2602 bytes --]
>> [ Also, I don't see the customization group until I load
>> lisp/leim/quail/indian.el? But AFAICT, that's not the case for other
>> custom groups. ]
>
> There are no defcustoms in leim/quail/ files. How about moving the
> defcustom to lisp/language/indian.el?
Hmm, moving it to lisp/language/indian.el brings in warnings about
undefined vars and functions, and an error when dumping.
In toplevel form:
language/indian.el:147:31: Warning: reference to free variable ‘tamil--vowel-signs’
language/indian.el:151:32: Warning: reference to free variable ‘indian-tml-base-table’
language/indian.el:154:41: Warning: reference to free variable ‘indian-tml-base-digits-table’
In end of data:
language/indian.el:143:10: Warning: the function ‘tamil--setter’ is not known to be defined.
rm -f emacs && cp -f temacs emacs
LC_ALL=C ./temacs -batch -l loadup --temacs=pdump \
--bin-dest /usr/local/bin/ --eln-dest /usr/local/lib/emacs/29.0.50/
Loading loadup.el (source)...
Dump mode: pdump
Using load-path (/home/viz/lib/ports/emacs/lisp)
Loading emacs-lisp/debug-early...
Loading emacs-lisp/byte-run...
Loading emacs-lisp/backquote...
Loading subr...
Loading keymap...
Loading version...
Loading widget...
Loading custom...
Loading emacs-lisp/map-ynp...
Loading international/mule...
Loading international/mule-conf...
Loading env...
Loading format...
Loading bindings...
Loading window...
Loading files...
Loading emacs-lisp/macroexp...
Loading cus-face...
Loading faces...
Loading loaddefs.el (source)...
Loading button...
Loading emacs-lisp/cl-preloaded...
Loading emacs-lisp/oclosure...
Loading obarray...
Loading abbrev...
Loading help...
Loading jka-cmpr-hook...
Loading epa-hook...
Loading international/mule-cmds...
Loading case-table...
Loading international/charprop.el (source)...
Loading international/characters...
Loading international/charscript...
Loading international/emoji-zwj...
Loading composite...
Loading language/chinese...
Loading language/cyrillic...
Loading language/indian...
Error: void-variable (tamil--vowel-signs)
(require cl-print) while preparing to dump
make[1]: *** [Makefile:639: emacs.pdmp] Error 255
make[1]: Leaving directory '/home/viz/lib/ports/emacs/src'
make: *** [Makefile:469: src] Error 2
Should I stick in defvar's and declare-function's?
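For context, a sketch of what such declarations would look like (the
my-* names and the file name are placeholders). As far as I can tell
they would only silence the byte-compiler warnings: a defvar without a
value does not bind the variable, so the void-variable error while
loading language/indian.el during the dump would remain.

```elisp
;; Tell the byte compiler that the variable is defined elsewhere.
;; Without a value, this does not bind the variable.
(defvar my-vowel-signs)

;; Tell the byte compiler which file defines the function.
;; This expands to nothing at run time.
(declare-function my-setter "quail/indian" (sym val))

(boundp 'my-vowel-signs)   ; → nil
```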
^ permalink raw reply related [flat|nested] 42+ messages in thread
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-10 6:42 ` Visuwesh
@ 2022-07-10 7:32 ` Visuwesh
2022-07-14 6:34 ` Eli Zaretskii
0 siblings, 1 reply; 42+ messages in thread
From: Visuwesh @ 2022-07-10 7:32 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 56323
[-- Attachment #1: Type: text/plain, Size: 245 bytes --]
[Sunday, July 10, 2022] Visuwesh wrote:
> [Sunday, July 10, 2022] Eli Zaretskii wrote:
>
> Updated patch attached.
>
I managed to miss a comment, sorry about that. Now fixed in attached
patch.
[-- Attachment #2: 0001-Add-new-customisable-phonetic-Tamil-input-method.patch --]
[-- Type: text/x-diff, Size: 17477 bytes --]
From 9fe05b79cadd69d9d2d507c4bb491e1e8b3755d9 Mon Sep 17 00:00:00 2001
From: Visuwesh <visuweshm@gmail.com>
Date: Sun, 10 Jul 2022 08:59:40 +0530
Subject: [PATCH] Add new customizable phonetic Tamil input method
* lisp/language/indian.el ("Tamil"): Change the default input method
of the Tamil language environment to the new input method.
* lisp/leim/quail/indian.el
(quail-tamil-itrans-compute-syllable-table): New function extracted
from...
(quail-tamil-itrans-syllable-table): ... here. Use the above
function.
(quail-tamil-itrans--consonant-order): Auxiliary variable for the
above function.
(quail-tamil-itrans-compute-signs-table): Add new VARIOUS argument.
(quail-tamil-itrans-various-signs-and-digits-table)
(quail-tamil-itrans-various-signs-table): Adjust call to the above
function.
("tamil-phonetic"): Add new input method.
(tamil-input): New group for the input method.
(tamil-translation-rules): New defcustom for the input method to
change the translation rules.
(tamil--syllable-table, tamil--signs-table, tamil--hashtables)
(tamil--vowel-signs): Internal variables used by the input method.
(tamil--setter, tamil--make-tables)
(tamil--update-quail-rules): Internal functions for the input method.
(bug#56323)
* etc/NEWS: Announce the new input method.
---
etc/NEWS | 7 +
lisp/language/indian.el | 2 +-
lisp/leim/quail/indian.el | 306 +++++++++++++++++++++++++++++---------
3 files changed, 247 insertions(+), 68 deletions(-)
diff --git a/etc/NEWS b/etc/NEWS
index 02fe67129d..33a489e18a 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -1043,6 +1043,13 @@ supported.
Type 'C-u C-h t' to select it in case your language setup does not do
so automatically.
+---
+*** New default phonetic input method for the Tamil language environment.
+The default input method for the Tamil language environment is now
+"tamil-phonetic" which is a customizable phonetic input method. To
+change the input method's translation rules, customize the user option
+'tamil-translation-rules'.
+
\f
* Changes in Specialized Modes and Packages in Emacs 29.1
diff --git a/lisp/language/indian.el b/lisp/language/indian.el
index 2887d410ad..407173827f 100644
--- a/lisp/language/indian.el
+++ b/lisp/language/indian.el
@@ -109,7 +109,7 @@ 'devanagari
"Tamil" '((charset unicode)
(coding-system utf-8)
(coding-priority utf-8)
- (input-method . "tamil-itrans")
+ (input-method . "tamil-phonetic")
(sample-text . "Tamil (தமிழ்) வணக்கம்")
(documentation . "\
South Indian Language Tamil is supported in this language environment."))
diff --git a/lisp/leim/quail/indian.el b/lisp/leim/quail/indian.el
index 04e95b0737..62836f3131 100644
--- a/lisp/leim/quail/indian.el
+++ b/lisp/leim/quail/indian.el
@@ -127,47 +127,34 @@ "\\''"
indian-mlm-itrans-v5-hash "malayalam-itrans" "Malayalam" "MlmIT"
"Malayalam transliteration by ITRANS method.")
-(defvar quail-tamil-itrans-syllable-table
- (let ((vowels
- '(("அ" nil "a")
- ("ஆ" "ா" "A")
- ("இ" "ி" "i")
- ("ஈ" "ீ" "I")
- ("உ" "ு" "u")
- ("ஊ" "ூ" "U")
- ("எ" "ெ" "e")
- ("ஏ" "ே" "E")
- ("ஐ" "ை" "ai")
- ("ஒ" "ொ" "o")
- ("ஓ" "ோ" "O")
- ("ஔ" "ௌ" "au")))
- (consonants
- '(("க" "k") ; U+0B95
- ("ங" "N^") ; U+0B99
- ("ச" "ch") ; U+0B9A
- ("ஞ" "JN") ; U+0B9E
- ("ட" "T") ; U+0B9F
- ("ண" "N") ; U+0BA3
- ("த" "t") ; U+0BA4
- ("ந" "n") ; U+0BA8
- ("ப" "p") ; U+0BAA
- ("ம" "m") ; U+0BAE
- ("ய" "y") ; U+0BAF
- ("ர" "r") ; U+0BB0
- ("ல" "l") ; U+0BB2
- ("வ" "v") ; U+0BB5
- ("ழ" "z") ; U+0BB4
- ("ள" "L") ; U+0BB3
- ("ற" "rh") ; U+0BB1
- ("ன" "nh") ; U+0BA9
- ("ஜ" "j") ; U+0B9C
- ("ஶ" nil) ; U+0BB6
- ("ஷ" "Sh") ; U+0BB7
- ("ஸ" "s") ; U+0BB8
- ("ஹ" "h") ; U+0BB9
- ("க்ஷ" "x" ) ; U+0B95
- ))
- (virama #x0BCD)
+;; This is needed since the Unicode codepoint order does not reflect
+;; the actual order in the Tamil language.
+(defvar quail-tamil-itrans--consonant-order
+ '("க" "ங" "ச" "ஞ" "ட" "ண"
+ "த" "ந" "ப" "ம" "ய" "ர"
+ "ல" "வ" "ழ" "ள" "ற" "ன"
+ "ஜ" "ஸ" "ஷ" "ஹ" "க்ஷ"
+ "க்ஷ" "ஶ"))
+
+(defun quail-tamil-itrans-compute-syllable-table (vowels consonants)
+ "Return the syllable table for the input method as a string.
+VOWELS is a list of (VOWEL SIGN INPUT-SEQ) where VOWEL is the
+Tamil vowel character, SIGN is the vowel sign corresponding to
+that vowel character or nil for none, and INPUT-SEQ is the input
+sequence to insert VOWEL.
+
+CONSONANTS is a list of (CONSONANT INPUT-SEQ...) where CONSONANT
+is the Tamil consonant character, and INPUT-SEQ is one or more
+strings that describe how to insert CONSONANT."
+ (setq vowels (sort vowels
+ (lambda (x y)
+ (string-lessp (car x) (car y)))))
+ (setq consonants
+ (sort consonants
+ (lambda (x y)
+ (< (or (seq-position quail-tamil-itrans--consonant-order (car x)) 1000)
+ (or (seq-position quail-tamil-itrans--consonant-order (car y)) 1000)))))
+ (let ((virama #x0BCD)
clm)
(with-temp-buffer
(insert "\n")
@@ -197,21 +184,45 @@ quail-tamil-itrans-syllable-table
(insert (propertize "\t" 'display (list 'space :align-to clm))
(car c) (or (nth 1 v) ""))
(setq clm (+ clm 6)))
- (insert "\n" (or (nth 1 c) "")
- (propertize "\t" 'display '(space :align-to 4))
- "|")
- (setq clm 6)
-
- (dolist (v vowels)
- (apply #'insert (propertize "\t" 'display (list 'space :align-to clm))
- (if (nth 1 c) (list (nth 1 c) (nth 2 v)) (list "")))
- (setq clm (+ clm 6))))
+ (dolist (ct (cdr c))
+ (insert "\n" (or ct "")
+ (propertize "\t" 'display '(space :align-to 4))
+ "|")
+ (setq clm 6)
+ (dolist (v vowels)
+ (apply #'insert (propertize "\t" 'display (list 'space :align-to clm))
+ (if ct (list ct (nth 2 v)) (list "")))
+ (setq clm (+ clm 6)))))
(insert "\n")
(insert "----+")
(insert-char ?- 74)
(insert "\n")
(buffer-string))))
+(defvar quail-tamil-itrans-syllable-table
+ (quail-tamil-itrans-compute-syllable-table
+ (let ((vowels (car indian-tml-base-table))
+ trans v ret)
+ (dotimes (i (length vowels))
+ (when (setq v (nth i vowels))
+ (when (characterp (car v))
+ (setcar v (string (car v))))
+ (setq trans (nth i (car indian-itrans-v5-table-for-tamil)))
+ (push (append v (list (if (listp trans) (car trans) trans)))
+ ret)))
+ ret)
+ (let ((consonants (cadr indian-tml-base-table))
+ trans c ret)
+ (dotimes (i (length consonants))
+ (when (setq c (nth i consonants))
+ (when (characterp c)
+ (setq c (string c)))
+ (setq trans (nth i (cadr indian-itrans-v5-table-for-tamil)))
+ (push (cons c (if (listp trans) trans (list trans)))
+ ret)))
+ (setq ret (nreverse ret))
+ ret)))
+
(defvar quail-tamil-itrans-numerics-and-symbols-table
(let ((numerics '((?௰ "பத்து") (?௱ "நூறு") (?௲ "ஆயிரம்")))
(symbols '((?௳ "நாள்") (?௴ "மாதம்") (?௵ "வருடம்")
@@ -244,25 +255,28 @@ quail-tamil-itrans-numerics-and-symbols-table
(insert "\n")
(buffer-string))))
-(defun quail-tamil-itrans-compute-signs-table (digitp)
+(defun quail-tamil-itrans-compute-signs-table (digitp various)
"Compute the signs table for the tamil-itrans input method.
-If DIGITP is non-nil, include the digits translation as well."
- (let ((various '((?ஃ . "H") ("ஸ்ரீ" . "srii") (?ௐ)))
- (digits "௦௧௨௩௪௫௬௭௮௯")
+If DIGITP is non-nil, include the digits translation as well.
+If VARIOUS is non-nil, then it should be a list of (CHAR TRANS)
+where CHAR is the character/string to translate and TRANS is
+CHAR's translation."
+ (let ((digits "௦௧௨௩௪௫௬௭௮௯")
(width 6) clm)
(with-temp-buffer
- (insert "\n" (make-string 18 ?-) "+")
- (when digitp (insert (make-string 60 ?-)))
+ (insert "\n" (make-string 18 ?-))
+ (when digitp
+ (insert "+" (make-string 60 ?-)))
(insert "\n")
(insert
(propertize "\t" 'display '(space :align-to 5)) "various"
- (propertize "\t" 'display '(space :align-to 18)) "|")
+ (propertize "\t" 'display '(space :align-to 18)))
(when digitp
(insert
- (propertize "\t" 'display '(space :align-to 45)) "digits"))
- (insert "\n" (make-string 18 ?-) "+")
+ "|" (propertize "\t" 'display '(space :align-to 45)) "digits"))
+ (insert "\n" (make-string 18 ?-))
(when digitp
- (insert (make-string 60 ?-)))
+ (insert "+" (make-string 60 ?-)))
(insert "\n")
(setq clm 0)
@@ -270,7 +284,8 @@ quail-tamil-itrans-compute-signs-table
(insert (propertize "\t" 'display (list 'space :align-to clm))
(car (nth i various)))
(setq clm (+ clm width)))
- (insert (propertize "\t" 'display '(space :align-to 18)) "|")
+ (when digitp
+ (insert (propertize "\t" 'display '(space :align-to 18)) "|"))
(setq clm 20)
(when digitp
(dotimes (i 10)
@@ -281,25 +296,28 @@ quail-tamil-itrans-compute-signs-table
(setq clm 0)
(dotimes (i (length various))
(insert (propertize "\t" 'display (list 'space :align-to clm))
- (or (cdr (nth i various)) ""))
+ (or (cadr (nth i various)) ""))
(setq clm (+ clm width)))
- (insert (propertize "\t" 'display '(space :align-to 18)) "|")
+ (when digitp
+ (insert (propertize "\t" 'display '(space :align-to 18)) "|"))
(setq clm 20)
(when digitp
(dotimes (i 10)
(insert (propertize "\t" 'display (list 'space :align-to clm))
(format "%d" i))
(setq clm (+ clm width))))
- (insert "\n" (make-string 18 ?-) "+")
+ (insert "\n" (make-string 18 ?-))
(when digitp
- (insert (make-string 60 ?-) "\n"))
+ (insert "+" (make-string 60 ?-) "\n"))
(buffer-string))))
(defvar quail-tamil-itrans-various-signs-and-digits-table
- (quail-tamil-itrans-compute-signs-table t))
+ (quail-tamil-itrans-compute-signs-table
+ t '((?ஃ "H") ("ஸ்ரீ" "srii") (?ௐ "OM"))))
(defvar quail-tamil-itrans-various-signs-table
- (quail-tamil-itrans-compute-signs-table nil))
+ (quail-tamil-itrans-compute-signs-table
+ nil '((?ஃ "H") ("ஸ்ரீ" "srii") (?ௐ "OM"))))
(if nil
(quail-define-package "tamil-itrans" "Tamil" "TmlIT" t "Tamil ITRANS"))
@@ -347,6 +365,160 @@ quail-tamil-itrans-various-signs-table
Full key sequences are listed below:")
+;;;
+;;; Tamil phonetic input method
+;;;
+
+;; Define the input method straightaway.
+(quail-define-package "tamil-phonetic" "Tamil" "ழ" t
+ "Customisable Tamil phonetic input method.
+To change the translation rules of the input method, customize
+`tamil-translation-rules'.
+
+To use native Tamil digits, customize `tamil-translation-rules'
+accordingly.
+
+To end the current translation process, say \\<quail-translation-keymap>\\[quail-select-current] (defined in
+`quail-translation-keymap'). This is useful when there's a
+conflict between two possible translations.
+
+The current input scheme is:
+
+### Basic syllables (உயிர்மெய் எழுத்துக்கள்) ###
+\\=\\<tamil--syllable-table>
+
+### Miscellaneous ###
+\\=\\<tamil--signs-table>
+
+The following characters have NO input sequence associated with
+them by default. Their descriptions are included for easy
+reference.
+\\=\\<quail-tamil-itrans-numerics-and-symbols-table>
+
+Full key sequences are listed below:"
+ nil nil nil nil nil nil t)
+
+(defvar tamil--syllable-table nil)
+(defvar tamil--signs-table nil)
+(defvar tamil--hashtables
+ (cons (make-hash-table :test #'equal)
+ (make-hash-table :test #'equal)))
+(defvar tamil--vowel-signs
+ '(("அ" . t) ("ஆ" . ?ா) ("இ" . ?ி) ("ஈ" . ?ீ)
+ ("உ" . ?ு) ("ஊ" . ?ூ) ("எ" . ?ெ) ("ஏ" . ?ே)
+ ("ஐ" . ?ை) ("ஒ" . ?ொ) ("ஓ" . ?ோ) ("ஔ" . ?ௌ)))
+
+(defun tamil--setter (sym val)
+ (set-default sym val)
+ (tamil--update-quail-rules val))
+
+(defun tamil--make-tables (rules)
+ (let (v v-table v-trans
+ c-table c-trans
+ m-table m-trans)
+ (dolist (ch rules)
+ (cond
+ ;; Vowel.
+ ((setq v (assoc-default (car ch) tamil--vowel-signs))
+ (push (list (car ch) (and (characterp v) v)) v-table)
+ (push (cdr ch) v-trans))
+ ;; Consonant. It needs to end with pulli.
+ ((string-suffix-p "்" (car ch))
+ ;; Strip the pulli now.
+ (push (substring (car ch) 0 -1) c-table)
+ (push (cdr ch) c-trans))
+ ;; If nothing else, then consider it a misc character.
+ (t (push (car ch) m-table)
+ (push (cdr ch) m-trans))))
+ (list v-table v-trans c-table c-trans m-table m-trans)))
+
+(defun tamil--update-quail-rules (rules &optional name)
+ ;; This function does pretty much what `indian-make-hash' does
+ ;; except that we don't try to copy the structure of
+ ;; `indian-tml-base-table' which leads to less code hassle.
+ (let* ((quail-current-package (assoc (or name "tamil-phonetic") quail-package-alist))
+ (tables (tamil--make-tables rules))
+ (v (nth 0 tables))
+ (v-trans (nth 1 tables))
+ (c (nth 2 tables))
+ (c-trans (nth 3 tables))
+ (m (nth 4 tables))
+ (m-trans (nth 5 tables))
+ (pulli (string #x0BCD)))
+ (clrhash (car tamil--hashtables))
+ (clrhash (cdr tamil--hashtables))
+ (indian--puthash-v v v-trans tamil--hashtables)
+ (indian--puthash-c c c-trans pulli tamil--hashtables)
+ (indian--puthash-cv c c-trans v v-trans tamil--hashtables)
+ (indian--puthash-m m m-trans tamil--hashtables)
+ ;; Now override the current translation rules.
+ ;; Empty quail map is '(list nil)'.
+ (setf (nth 2 quail-current-package) '(nil))
+ (maphash (lambda (k v)
+ (quail-defrule k (if (length= v 1)
+ (string-to-char v)
+ (vector v))))
+ (cdr tamil--hashtables))
+ (setq tamil--syllable-table
+ (quail-tamil-itrans-compute-syllable-table
+ (mapcar (lambda (ch) (append ch (pop v-trans))) v)
+ (mapcar (lambda (ch) (cons ch (pop c-trans))) c))
+ tamil--signs-table
+ (quail-tamil-itrans-compute-signs-table
+ nil
+ (append (mapcar (lambda (ch) (cons ch (pop m-trans))) m)
+ (and (gethash "ஸ்" (car tamil--hashtables))
+ `(("ஸ்ரீ" ,(concat (gethash "ஸ்" (car tamil--hashtables))
+ (gethash "ரீ" (car tamil--hashtables)))))))))))
+
+(defgroup tamil-input nil
+ "Translation rules for the Tamil input method."
+ :prefix "tamil-"
+ :group 'leim)
+
+(defcustom tamil-translation-rules
+ ;; Vowels.
+ '(("அ" "a") ("ஆ" "aa") ("இ" "i") ("ஈ" "ii")
+ ("உ" "u") ("ஊ" "uu") ("எ" "e") ("ஏ" "ee")
+ ("ஐ" "ai") ("ஒ" "o") ("ஓ" "oo") ("ஔ" "au" "ow")
+
+ ;; Consonants.
+ ("க்" "k" "g") ("ங்" "ng") ("ச்" "ch" "s") ("ஞ்" "nj") ("ட்" "t" "d")
+ ("ண்" "N") ("த்" "th" "dh") ("ந்" "nh") ("ப்" "p" "b") ("ம்" "m")
+ ("ய்" "y") ("ர்" "r") ("ல்" "l") ("வ்" "v") ("ழ்" "z" "zh")
+ ("ள்" "L") ("ற்" "rh") ("ன்" "n")
+ ;; Sanskrit.
+ ("ஜ்" "j") ("ஸ்" "S") ("ஷ்" "sh") ("ஹ்" "h")
+ ("க்ஷ்" "ksh") ("க்ஷ்" "ksH") ("ஶ்" "Z")
+
+ ;; Misc. ஃ is neither a consonant nor a vowel.
+ ("ஃ" "F" "q")
+ ("ௐ" "OM"))
+ "List of input sequences to translate to Tamil characters.
+Each element should be (CHARACTER INPUT-SEQUENCES...) where
+CHARACTER is the Tamil character, and INPUT-SEQUENCES is a list
+of input sequences which produce that character.
+
+CHARACTER is considered as a consonant (மெய் எழுத்து) if it ends
+with a pulli (virama).
+
+A CHARACTER that is neither a vowel nor a consonant is inserted as
+is."
+ :group 'tamil-input
+ :type '(alist :key-type string :value-type (repeat string))
+ :set #'tamil--setter
+ :version "29.1"
+ :options
+ (delq nil
+ (append (mapcar #'car tamil--vowel-signs)
+ (mapcar (lambda (x) (if (characterp x)
+ (string x #x0BCD)
+ (and x (concat x "்"))))
+ (nth 1 indian-tml-base-table))
+ '("ஃ" "ௐ")
+ ;; Digits.
+ (mapcar #'string (nth 3 indian-tml-base-digits-table)))))
+
;;;
;;; Input by Inscript
;;;
--
2.35.1
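
For readers following the patch, here is a minimal, self-contained sketch
of the three-way classification that `tamil--make-tables' applies to each
rule: vowel if the character is in the vowel table, consonant if it ends
with a pulli, miscellaneous otherwise.  The names `my-tamil-vowel-signs'
and `my-tamil-classify' are hypothetical and exist only for illustration;
they are not part of the patch, and the vowel table is truncated.

```elisp
;; Hypothetical, truncated copy of the vowel table used in the patch.
(defvar my-tamil-vowel-signs
  '(("அ" . t) ("ஆ" . ?ா) ("இ" . ?ி)))

;; Hypothetical helper: mirrors the `cond' in `tamil--make-tables',
;; but returns a symbol instead of filling the tables.
(defun my-tamil-classify (rule)
  "Return `vowel', `consonant' or `misc' for RULE.
RULE has the form (CHARACTER INPUT-SEQUENCES...)."
  (let ((ch (car rule)))
    (cond
     ;; Vowel: CHARACTER has an entry in the vowel table.
     ((assoc-default ch my-tamil-vowel-signs) 'vowel)
     ;; Consonant: CHARACTER ends with pulli (U+0BCD).
     ((string-suffix-p (string #x0BCD) ch) 'consonant)
     ;; Anything else is a miscellaneous character.
     (t 'misc))))

(defvar my-tamil-sample-rules
  (list '("அ" "a")
        ;; KA (U+0B95) followed by pulli (U+0BCD).
        (list (string #x0B95 #x0BCD) "k" "g")
        '("ஃ" "F" "q")))

(mapcar #'my-tamil-classify my-tamil-sample-rules)
;; => (vowel consonant misc)
```

The order of the tests matters in the same way as in the patch: the vowel
lookup runs first, so only characters that are not vowels are checked for
a trailing pulli.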
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-10 7:32 ` Visuwesh
@ 2022-07-14 6:34 ` Eli Zaretskii
2022-07-14 7:11 ` Visuwesh
0 siblings, 1 reply; 42+ messages in thread
From: Eli Zaretskii @ 2022-07-14 6:34 UTC (permalink / raw)
To: Visuwesh; +Cc: 56323-done
> From: Visuwesh <visuweshm@gmail.com>
> Cc: 56323@debbugs.gnu.org
> Date: Sun, 10 Jul 2022 13:02:11 +0530
>
> > Updated patch attached.
> >
>
> I managed to miss a comment, sorry about that. Now fixed in attached
> patch.
Thanks, installed.
* bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
2022-07-14 6:34 ` Eli Zaretskii
@ 2022-07-14 7:11 ` Visuwesh
0 siblings, 0 replies; 42+ messages in thread
From: Visuwesh @ 2022-07-14 7:11 UTC (permalink / raw)
To: 56323; +Cc: eliz
[Thursday July 14, 2022] Eli Zaretskii wrote:
>> From: Visuwesh <visuweshm@gmail.com>
>> Cc: 56323@debbugs.gnu.org
>> Date: Sun, 10 Jul 2022 13:02:11 +0530
>>
>> > Updated patch attached.
>> >
>>
>> I managed to miss a comment, sorry about that. Now fixed in attached
>> patch.
>
> Thanks, installed.
Thanks!
Thread overview: 42+ messages
2022-06-30 12:13 bug#56323: 29.0.50; Add new customisable phonetic Tamil input method Visuwesh
2022-06-30 14:08 ` Visuwesh
2022-06-30 15:53 ` Visuwesh
2022-07-01 12:59 ` bug#56323: 29.0.50; [v2] " Visuwesh
2022-07-01 13:01 ` Visuwesh
2022-07-01 13:22 ` Eli Zaretskii
2022-07-01 13:47 ` Visuwesh
2022-07-01 14:06 ` Eli Zaretskii
2022-07-01 14:30 ` Visuwesh
2022-07-01 16:09 ` Eli Zaretskii
2022-07-01 16:37 ` Visuwesh
2022-07-01 18:16 ` Eli Zaretskii
2022-07-02 4:02 ` Visuwesh
2022-07-02 6:35 ` Eli Zaretskii
2022-07-02 6:54 ` Visuwesh
2022-07-02 7:17 ` Eli Zaretskii
2022-07-02 7:35 ` Eli Zaretskii
2022-07-02 7:46 ` Eli Zaretskii
2022-07-02 8:11 ` Visuwesh
2022-07-02 8:29 ` Eli Zaretskii
2022-07-02 8:40 ` Visuwesh
2022-07-02 8:54 ` Eli Zaretskii
2022-07-02 9:33 ` Visuwesh
2022-07-02 9:38 ` Eli Zaretskii
2022-07-02 10:31 ` Visuwesh
2022-07-02 10:46 ` Eli Zaretskii
2022-07-02 12:08 ` Visuwesh
2022-07-02 11:05 ` समीर सिंह Sameer Singh
2022-07-02 12:04 ` Visuwesh
2022-07-02 12:23 ` Eli Zaretskii
2022-07-02 6:58 ` Eli Zaretskii
2022-07-02 7:58 ` Visuwesh
2022-07-02 8:39 ` Eli Zaretskii
2022-07-02 9:28 ` Visuwesh
2022-07-10 3:56 ` Visuwesh
2022-07-10 5:34 ` Eli Zaretskii
2022-07-10 6:42 ` Visuwesh
2022-07-10 7:32 ` Visuwesh
2022-07-14 6:34 ` Eli Zaretskii
2022-07-14 7:11 ` Visuwesh
2022-07-02 12:15 ` Visuwesh
2022-07-03 3:57 ` Visuwesh