From: James Thomas
Newsgroups: gmane.emacs.devel
Subject: Re: [PATCH] Improve Malayalam language transliteration
Date: Mon, 27 Apr 2020 07:03:12 +0530
Message-ID: <87lfmhiug7.fsf@gmx.net>
References: <87d07ul5m1.fsf@gmx.net> <83r1wa5k1w.fsf@gnu.org> <87tv161aml.fsf@gmx.net> <83eesa5fxi.fsf@gnu.org>
In-Reply-To: <83eesa5fxi.fsf@gnu.org>
To: Eli Zaretskii
Cc: emacs-devel@gnu.org
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="

--=-=-=
Content-Type: text/plain

Eli Zaretskii writes:

>> From: James Thomas
>> Date: Sun, 26 Apr 2020 21:51:06 +0530
>>
> That's okay, but we don't like deleting existing capabilities without
> some period during which it's deprecated.  So maybe say in NEWS that
> ITRANS is deprecated, and add a comment in quail/indian.el that it
> should be removed in some future release.
>
>> Sorry I missed the reference in lisp/language/indian.el. It seems to be
>> just defining the default input method for the language. Can be changed
>> easily if decided.
>
> If you think this new method is much better, it's okay to make it the
> default, I think.  Just mention that in NEWS.
>

Here's a modified patch without ITRANS removal. I have refrained from
making the new Mozhi method the default.

-- 
Jim

--=-=-=
Content-Type: text/x-diff; charset=utf-8
Content-Disposition: attachment;
 filename=0002-Improve-Malayalam-language-transliteration.patch
Content-Transfer-Encoding: 8bit

From 3e9bf54134bea2ccf30d1e174696a971d9f7de5a Mon Sep 17 00:00:00 2001
From: James Thomas
Date: Mon, 27 Apr 2020 06:53:50 +0530
Subject: [PATCH] Improve Malayalam language transliteration

The existing ITRANS scheme does not support some characters and
language quirks like 'chillu's. The Inscript method has errors.

* lisp/language/ind-util.el (indian-mlm-base-table): + archaic chars,
Mozhi combos; cleanup.
(indian-mlm-mozhi-table): For new scheme Mozhi.
* lisp/leim/quail/indian.el (inscript-mlm-keytable): Correct errors.
Add Inscript chillus & zero-width chars, Mozhi scheme.
* etc/NEWS: Mention change.

Add a sufficient implementation of the Mozhi scheme and complete
Inscript scheme.
---
 etc/NEWS                  |   7 +++
 lisp/language/ind-util.el |  40 ++++++++++++---
 lisp/leim/quail/indian.el | 101 ++++++++++++++++++++++++++++++++++----
 3 files changed, 132 insertions(+), 16 deletions(-)

diff --git a/etc/NEWS b/etc/NEWS
index 025d5c14a7..aa551177d1 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -288,6 +288,13 @@ prefix on the Subject line in various languages.
 These new navigation commands are bound to 'n' and 'p' in
 'apropos-mode'.
 
+** Quail
+
+---
+*** Improved Malayalam language transliteration
+Added new Mozhi scheme. The inapplicable ITRANS scheme is now
+deprecated. Errors in Inscript method corrected.
+
 ^L
 * New Modes and Packages in Emacs 28.1
 
diff --git a/lisp/language/ind-util.el b/lisp/language/ind-util.el
index 4319e5537e..0c1f09c9b6 100644
--- a/lisp/language/ind-util.el
+++ b/lisp/language/ind-util.el
@@ -232,8 +232,8 @@ indian-mlm-base-table
   '(
     (;; VOWELS
      (?അ nil) (?ആ ?ാ) (?ഇ ?ി) (?ഈ ?ീ) (?ഉ ?ു) (?ഊ ?ൂ)
-     (?ഋ ?ൃ) (?ഌ nil) nil (?ഏ ?േ) (?എ ?െ) (?ഐ ?ൈ)
-     nil (?ഓ ?ോ) (?ഒ ?ൊ) (?ഔ ?ൌ) nil nil)
+     (?ഋ ?ൃ) (?ഌ ?ൢ) (?ൡ ?ൣ) (?ഏ ?േ) (?എ ?െ) (?ഐ ?ൈ)
+     nil (?ഒ ?ൊ) (?ഓ ?ോ) (?ഔ ?ൗ) (?് ?്) (?ൠ ?ൄ))
     (;; CONSONANTS
      ?ക ?ഖ ?ഗ ?ഘ ?ങ                  ;; GUTTRULS
      ?ച ?ഛ ?ജ ?ഝ ?ഞ                  ;; PALATALS
@@ -243,13 +243,14 @@ indian-mlm-base-table
      ?യ ?ര ?റ ?ല ?ള ?ഴ ?വ          ;; SEMIVOWELS
      ?ശ ?ഷ ?സ ?ഹ                    ;; SIBILANTS
      nil nil nil nil nil nil nil nil ;; NUKTAS
-     "ജ്ഞ" "ക്ഷ")
+     "ജ്ഞ" "ക്ഷ"
+     "റ്റ" "ന്റ" "ത്ത" "ത്ഥ" "ഞ്ഞ" "ങ്ങ" "ന്ന"
+     "ഞ്ച" "ന്ക" "ങ്ക" "ച്ച" "ച്ഛ" "ക്ക"
+     "ബ്ബ" "ക്ക" "ഗ്ഗ" "ജ്ജ" "മ്മ" "പ്പ" "വ്വ" "ക്സ" "ശ്ശ")
     (;; Misc Symbols
      nil ?ം ?ഃ nil ?് nil nil)
     (;; Digits
-     ?൦ ?൧ ?൨ ?൩ ?൪ ?൫ ?൬ ?൭ ?൮ ?൯)
-    (;; Inscript-extra (4)  (#, $, ^, *, ])
-     "്ര" "ര്" "ത്ര" "ശ്ര" nil)))
+     ?൦ ?൧ ?൨ ?൩ ?൪ ?൫ ?൬ ?൭ ?൮ ?൯)))
 
 (defvar indian-tml-base-table
   '(
@@ -323,6 +324,29 @@ indian-itrans-v5-table-for-tamil
     (;; misc -- 7
      ".N" (".n" "M") "H" ".a" ".h" ("AUM" "OM") "..")))
 
+(defvar indian-mlm-mozhi-table
+  '(;; for encode/decode
+    (;; vowels -- 18
+     "a" ("aa" "A") "i" ("ii" "I") "u" ("uu" "U")
+     "R" "Ll" "Lll" ("E" "ae") "e" "ai"
+     nil "o" "O" "au" "~" "RR")
+    (;; consonants -- 40
+     ("k" "c") "kh" "g" "gh" "ng"
+     "ch" ("Ch" "chh") "j" "jh" "nj"
+     "T" "Th" "D" "Dh" "N"
+     "th" "thh" "d" "dh" "n" nil
+     "p" ("ph" "f") "b" "bh" "m"
+     "y" "r" "rr" "l" "L" "zh" ("v" "w")
+     ("S" "z") "sh" "s" "h"
+     nil nil nil nil nil nil nil nil
+     nil "X"
+     ;; some of these are extra to Mozhi
+     ("t" "tt") "nt" "tth" "tthh" "nnj" "nng" "nn"
+     "nch" "nc" "nk" "cch" "cchh" "cc"
+     "B" ("C" "K" "q") "G" "J" "M" "P" "V" "x" "Z")
+    (;; misc -- 7
+     nil nil "H")))
+
 (defvar indian-kyoto-harvard-table
   '(;; for encode/decode
     (;; vowel
@@ -524,6 +548,10 @@ indian-mlm-itrans-v5-hash
   (indian-make-hash indian-mlm-base-table
                     indian-itrans-v5-table))
 
+(defvar indian-mlm-mozhi-hash
+  (indian-make-hash indian-mlm-base-table
+                    indian-mlm-mozhi-table))
+
 (defvar indian-tml-itrans-v5-hash
   (indian-make-hash indian-tml-base-table
                     indian-itrans-v5-table-for-tamil))
diff --git a/lisp/leim/quail/indian.el b/lisp/leim/quail/indian.el
index 2681eab0e5..9724d2d4a6 100644
--- a/lisp/leim/quail/indian.el
+++ b/lisp/leim/quail/indian.el
@@ -117,6 +117,7 @@ "\\''" indian-knd-itrans-v5-hash
  "kannada-itrans" "Kannada" "KndIT"
  "Kannada transliteration by ITRANS method.")
 
+;; ITRANS not applicable to Malayalam & could be removed eventually
 (if nil
     (quail-define-package "malayalam-itrans" "Malayalam" "MlmIT" t "Malayalam ITRANS"))
 (quail-define-indian-trans-package
@@ -358,24 +359,21 @@ inscript-mlm-keytable
   '(
     (;; VOWELS  (18)
      (?D nil) (?E ?e) (?F ?f) (?R ?r) (?G ?g) (?T ?t)
-     (?+ ?=) ("F]" "f]") (?! ?@) (?S ?s) (?Z ?z) (?W ?w)
-     (?| ?\\) (?~ ?`) (?A ?a) (?Q ?q) ("+]" "=]") ("R]" "r]"))
+     (?= ?+) nil nil (?S ?s) (?Z ?z) (?W ?w)
+     nil (?~ ?`) (?A ?a) (?Q ?q))
     (;; CONSONANTS (42)
      ?k ?K ?i ?I ?U                ;; GRUTTALS
      ?\; ?: ?p ?P ?}               ;; PALATALS
      ?' ?\" ?\[ ?{ ?C              ;; CEREBRALS
-     ?l ?L ?o ?O ?v ?V             ;; DENTALS
+     ?l ?L ?o ?O ?v nil            ;; DENTALS
      ?h ?H ?y ?Y ?c                ;; LABIALS
-     ?/ ?j ?J ?n ?N "N]" ?b        ;; SEMIVOWELS
+     ?/ ?j ?J ?n ?N ?B ?b          ;; SEMIVOWELS
      ?M ?< ?m ?u                   ;; SIBILANTS
-     "k]" "K]" "i]" "p]" "[]" "{]" "H]" "/]" ;; NUKTAS
-     ?% ?&)
+     nil nil nil nil nil nil nil nil nil) ;; NUKTAS
     (;; Misc Symbols (7)
-     ?X ?x ?_ ">]" ?d "X]" ?>)
+     nil ?x ?_ nil ?d)
     (;; Digits
-     ?0 ?1 ?2 ?3 ?4 ?5 ?6 ?7 ?8 ?9)
-    (;; Inscripts
-     ?# ?$ ?^ ?* ?\])))
+     ?0 ?1 ?2 ?3 ?4 ?5 ?6 ?7 ?8 ?9)))
 
 (defvar inscript-tml-keytable
   '(
@@ -463,6 +461,21 @@ inscript-tml-keytable
  "malayalam-inscript" "Malayalam" "MlmIS"
  "Malayalam keyboard Inscript.")
 
+;; Chillus
+(quail-defrule "Cd" ["ണ്"])
+(quail-defrule "Cd]" ?ൺ)
+(quail-defrule "vd" ["ന്"])
+(quail-defrule "vd]" ?ൻ)
+(quail-defrule "jd" ["ര്"])
+(quail-defrule "jd]" ?ർ)
+(quail-defrule "nd" ["ല്"])
+(quail-defrule "nd]" ?ൽ)
+(quail-defrule "Nd" ["ള്"])
+(quail-defrule "Nd]" ?ൾ)
+
+(quail-defrule "\\" ?\u200c)
+(quail-defrule "X" ?\u200b)
+
 (if nil
     (quail-define-package "tamil-inscript" "Tamil" "TmlIS" t "Tamil keyboard Inscript"))
 (quail-define-inscript-package
@@ -571,4 +584,72 @@ inscript-tml-keytable
  ("?" ?\?)
  ("/" ?্))
 
+(defun indian-mlm-mozhi-update-translation (control-flag)
+  (let ((len (length quail-current-key)) chillu
+        (vowels '(?a ?e ?i ?o ?u ?A ?E ?I ?O ?U ?R)))
+    (cond ((numberp control-flag)
+           (progn (if (= control-flag 0)
+                      (setq quail-current-str quail-current-key)
+                    (cond (input-method-exit-on-first-char)
+                          ((and (memq (aref quail-current-key
+                                            (1- control-flag))
+                                      vowels)
+                                (setq chillu (cl-position
+                                              (aref quail-current-key
+                                                    control-flag)
+                                              '(?m ?N ?n ?r ?l ?L))))
+                           ;; conditions for putting chillu
+                           (and (or (and (= control-flag (1- len))
+                                         (not (setq control-flag nil)))
+                                    (and (= control-flag (- len 2))
+                                         (let ((temp (aref quail-current-key
+                                                           (1- len))))
+                                           ;; is it last char of word?
+                                           (not
+                                            (or (and (>= temp ?a) (<= temp ?z))
+                                                (and (>= temp ?A) (<= temp ?Z))
+                                                (eq temp ?~))))
+                                         (setq control-flag (1+ control-flag))))
+                                (setq quail-current-str     ;; put chillu
+                                      (concat (if (not (stringp
+                                                        quail-current-str))
+                                                  (string quail-current-str)
+                                                quail-current-str)
+                                              (string
+                                               (nth chillu '(?ം ?ൺ ?ൻ ?ർ ?ൽ ?ൾ)))))))))
+                  (and (not input-method-exit-on-first-char) control-flag
+                       (while (> len control-flag)
+                         (setq len (1- len))
+                         (setq unread-command-events
+                               (cons (aref quail-current-key len)
+                                     unread-command-events))))
+                  ))
+          ((null control-flag)
+           (unless quail-current-str
+             (setq quail-current-str quail-current-key)
+             ))
+          ((equal control-flag t)
+           (if (memq (aref quail-current-key (1- len)) ;; If vowel ending,
+                     vowels)                           ;; may have to put
+               (setq control-flag nil)))))             ;; chillu. So don't
+  control-flag)                                        ;; end translation
+
+(quail-define-package "malayalam-mozhi" "Malayalam" "MlmMI" t
+                      "Malayalam transliteration by Mozhi method."
+                      nil nil t nil nil nil t nil
+                      'indian-mlm-mozhi-update-translation)
+
+(maphash
+ (lambda (key val)
+   (quail-defrule key (if (= (length val) 1)
+                          (string-to-char val)
+                        (vector val))))
+ (cdr indian-mlm-mozhi-hash))
+
+(defun indian-mlm-mozhi-underscore (key len) (throw 'quail-tag nil))
+
+(quail-defrule "_" 'indian-mlm-mozhi-underscore)
+(quail-defrule "|" ?\u200c)
+(quail-defrule "||" ?\u200b)
+
 ;;; indian.el ends here
-- 
2.20.1

--=-=-=--
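
For anyone who wants to try the new method, here is a minimal sketch of an init-file fragment. It assumes the patch above has been applied, so that the "malayalam-mozhi" package it defines is available. `default-input-method`, `set-input-method` and `switch-to-buffer` are standard Emacs; the command name `my-try-malayalam-mozhi` and the buffer name `*mozhi-test*` are hypothetical and not part of the patch.

```elisp
;; Sketch only: assumes the patch above is applied, so that the
;; "malayalam-mozhi" input method (lisp/leim/quail/indian.el) exists.

;; Make Mozhi the method that C-\ (toggle-input-method) selects.
(setq default-input-method "malayalam-mozhi")

;; Hypothetical helper, not part of the patch: open a scratch buffer
;; with the Mozhi method already switched on.
(defun my-try-malayalam-mozhi ()
  "Switch to a test buffer with the Malayalam Mozhi input method active."
  (interactive)
  (switch-to-buffer (get-buffer-create "*mozhi-test*"))
  (set-input-method "malayalam-mozhi"))
```

After `M-x my-try-malayalam-mozhi`, roman keys typed in that buffer should go through the rules generated from `indian-mlm-mozhi-hash`, and `C-\` turns the method off again. Since the patch does not make Mozhi the default for the Malayalam language environment, the `setq` above is one way to opt in per user.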