From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.io!.POSTED.ciao.gmane.io!not-for-mail
From: James Thomas
Newsgroups: gmane.emacs.devel
Subject: Re: [PATCH] Improve Malayalam language transliteration
Date: Mon, 27 Apr 2020 08:12:03 +0530
Message-ID: <87zhaxk5tw.fsf@gmx.net>
References: <87d07ul5m1.fsf@gmx.net> <83r1wa5k1w.fsf@gnu.org>
 <87tv161aml.fsf@gmx.net> <83eesa5fxi.fsf@gnu.org>
In-Reply-To: <83eesa5fxi.fsf@gnu.org>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
Cc: emacs-devel@gnu.org
To: Eli Zaretskii
List-Id: "Emacs development discussions."
Xref: news.gmane.io gmane.emacs.devel:247885

--=-=-=
Content-Type: text/plain

Eli Zaretskii writes:

>
> That's okay, but we don't like deleting existing capabilities without
> some period during which it's deprecated. So maybe say in NEWS that
> ITRANS is deprecated, and add a comment in quail/indian.el that it
> should be removed in some future release.
>
>> Sorry I missed the reference in lisp/language/indian.el. It seems to be
>> just defining the default input method for the language. Can be changed
>> easily if decided.
>
> If you think this new method is much better, it's okay to make it the
> default, I think. Just mention that in NEWS.
>
> Thanks.
>
> P.S. And please keep the list address on the CC, so that this
> discussion is recorded.

Sorry for the bother, but this newer patch implements the Inscript stuff
in an easier way.

--
Jim

--=-=-=
Content-Type: text/x-diff; charset=utf-8
Content-Disposition: attachment;
 filename=0003-Improve-Malayalam-language-transliteration.patch
Content-Transfer-Encoding: quoted-printable

=46rom 0b655da7d98d6ff5c6211d1a56e879ac291f9c34 Mon Sep 17 00:00:00 2001
From: James Thomas
Date: Mon, 27 Apr 2020 08:06:48 +0530
Subject: [PATCH] Improve Malayalam language transliteration

The existing ITRANS scheme does not support some characters and
language quirks like 'chillu's. The Inscript method has errors.

* lisp/language/ind-util.el (indian-mlm-base-table): + archaic chars,
Mozhi combos; cleanup.
(indian-mlm-mozhi-table): For new scheme Mozhi.
* lisp/leim/quail/indian.el (inscript-mlm-keytable): Correct errors.
Add Inscript chillus & zero-width chars, Mozhi scheme.
* etc/NEWS: Mention change

Add a sufficient implementation of the Mozhi scheme. Complete Inscript
implementation.
=2D--
 etc/NEWS                  |  7 +++
 lisp/language/ind-util.el | 40 +++++++++++++++---
 lisp/leim/quail/indian.el | 89 +++++++++++++++++++++++++++++++++++----
 3 files changed, 122 insertions(+), 14 deletions(-)

diff --git a/etc/NEWS b/etc/NEWS
index 025d5c14a7..aa551177d1 100644
=2D-- a/etc/NEWS
+++ b/etc/NEWS
@@ -288,6 +288,13 @@ prefix on the Subject line in various languages.
 These new navigation commands are bound to 'n' and 'p' in
 'apropos-mode'.
 
+** Quail
+
+---
+*** Improved Malayalam language transliteration
+Added new Mozhi scheme. The inapplicable ITRANS scheme is now
+deprecated. Errors in Inscript method corrected.
+
 =0C
 * New Modes and Packages in Emacs 28.1
 
diff --git a/lisp/language/ind-util.el b/lisp/language/ind-util.el
index 4319e5537e..62885227f1 100644
=2D-- a/lisp/language/ind-util.el
+++ b/lisp/language/ind-util.el
@@ -232,8 +232,8 @@ indian-mlm-base-table
   '(
     (;; VOWELS
      (?=E0=B4=85 nil) (?=E0=B4=86 ?=E0=B4=BE) (?=E0=B4=87 ?=E0=B4=BF) (?=
=E0=B4=88 ?=E0=B5=80) (?=E0=B4=89 ?=E0=B5=81) (?=E0=B4=8A ?=E0=B5=82)
-     (?=E0=B4=8B ?=E0=B5=83) (?=E0=B4=8C nil) nil (?=E0=B4=8F ?=E0=B5=87)=
 (?=E0=B4=8E ?=E0=B5=86) (?=E0=B4=90 ?=E0=B5=88)
-     nil (?=E0=B4=93 ?=E0=B5=8B) (?=E0=B4=92 ?=E0=B5=8A) (?=E0=B4=94 ?=E0=
=B5=8C) nil nil)
+     (?=E0=B4=8B ?=E0=B5=83) (?=E0=B4=8C ?=E0=B5=A2) (?=E0=B5=A1 ?=E0=B5=
=A3) (?=E0=B4=8F ?=E0=B5=87) (?=E0=B4=8E ?=E0=B5=86) (?=E0=B4=90 ?=E0=B5=
=88)
+     nil (?=E0=B4=92 ?=E0=B5=8A) (?=E0=B4=93 ?=E0=B5=8B) (?=E0=B4=94 ?=E0=
=B5=97) (?=E0=B5=8D ?=E0=B5=8D) (?=E0=B5=A0 ?=E0=B5=84))
     (;; CONSONANTS
      ?=E0=B4=95 ?=E0=B4=96 ?=E0=B4=97 ?=E0=B4=98 ?=E0=B4=99 =
 ;; GUTTRULS
      ?=E0=B4=9A ?=E0=B4=9B ?=E0=B4=9C ?=E0=B4=9D ?=E0=B4=9E =
 ;; PALATALS
@@ -243,13 +243,16 @@ indian-mlm-base-table
      ?=E0=B4=AF ?=E0=B4=B0 ?=E0=B4=B1 ?=E0=B4=B2 ?=E0=B4=B3 ?=E0=B4=B4 ?=
=E0=B4=B5 ;; SEMIVOWELS
      ?=E0=B4=B6 ?=E0=B4=B7 ?=E0=B4=B8 ?=E0=B4=B9 ;; SI=
BILANTS
      nil nil nil nil nil nil nil nil ;; NUKTAS
-     "=E0=B4=9C=E0=B5=8D=E0=B4=9E" "=E0=B4=95=E0=B5=8D=E0=B4=B7")
+     "=E0=B4=9C=E0=B5=8D=E0=B4=9E" "=E0=B4=95=E0=B5=8D=E0=B4=B7"
+     "=E0=B4=B1=E0=B5=8D=E0=B4=B1" "=E0=B4=A8=E0=B5=8D=E0=B4=B1" "=E0=B4=
=A4=E0=B5=8D=E0=B4=A4" "=E0=B4=A4=E0=B5=8D=E0=B4=A5" "=E0=B4=9E=E0=B5=8D=
=E0=B4=9E" "=E0=B4=99=E0=B5=8D=E0=B4=99" "=E0=B4=A8=E0=B5=8D=E0=B4=A8"
+     "=E0=B4=9E=E0=B5=8D=E0=B4=9A" "=E0=B4=A8=E0=B5=8D=E0=B4=95" "=E0=B4=
=99=E0=B5=8D=E0=B4=95" "=E0=B4=9A=E0=B5=8D=E0=B4=9A" "=E0=B4=9A=E0=B5=8D=
=E0=B4=9B" "=E0=B4=95=E0=B5=8D=E0=B4=95"
+     "=E0=B4=AC=E0=B5=8D=E0=B4=AC" "=E0=B4=95=E0=B5=8D=E0=B4=95" "=E0=B4=
=97=E0=B5=8D=E0=B4=97" "=E0=B4=9C=E0=B5=8D=E0=B4=9C" "=E0=B4=AE=E0=B5=8D=
=E0=B4=AE" "=E0=B4=AA=E0=B5=8D=E0=B4=AA" "=E0=B4=B5=E0=B5=8D=E0=B4=B5" "=
=E0=B4=95=E0=B5=8D=E0=B4=B8" "=E0=B4=B6=E0=B5=8D=E0=B4=B6")
     (;; Misc Symbols
      nil ?=E0=B4=82 ?=E0=B4=83 nil ?=E0=B5=8D nil nil)
     (;; Digits
      ?=E0=B5=A6 ?=E0=B5=A7 ?=E0=B5=A8 ?=E0=B5=A9 ?=E0=B5=AA ?=E0=B5=AB ?=
=E0=B5=AC ?=E0=B5=AD ?=E0=B5=AE ?=E0=B5=AF)
-    (;; Inscript-extra (4) (#, $, ^, *, ])
-     "=E0=B5=8D=E0=B4=B0" "=E0=B4=B0=E0=B5=8D" "=E0=B4=A4=E0=B5=8D=E0=B4=
=B0" "=E0=B4=B6=E0=B5=8D=E0=B4=B0" nil)))
+    (;; Chillus
+     "=E0=B4=A3=E0=B5=8D" ?=E0=B5=BA "=E0=B4=A8=E0=B5=8D" ?=E0=B5=BB "=E0=
=B4=B0=E0=B5=8D" ?=E0=B5=BC "=E0=B4=B2=E0=B5=8D" ?=E0=B5=BD "=E0=B4=B3=E0=
=B5=8D" ?=E0=B5=BE)))
 
 (defvar indian-tml-base-table
   '(
@@ -323,6 +326,29 @@ indian-itrans-v5-table-for-tamil
     (;; misc -- 7
      ".N" (".n" "M") "H" ".a" ".h" ("AUM" "OM") "..")))
 
+(defvar indian-mlm-mozhi-table
+  '(;; for encode/decode
+    (;; vowels -- 18
+     "a" ("aa" "A") "i" ("ii" "I") "u" ("uu" "U")
+     "R" "Ll" "Lll" ("E" "ae") "e" "ai"
+     nil "o" "O" "au" "~" "RR")
+    (;; consonants -- 40
+     ("k" "c") "kh" "g" "gh" "ng"
+     "ch" ("Ch" "chh") "j" "jh" "nj"
+     "T" "Th" "D" "Dh" "N"
+     "th" "thh" "d" "dh" "n" nil
+     "p" ("ph" "f") "b" "bh" "m"
+     "y" "r" "rr" "l" "L" "zh" ("v" "w")
+     ("S" "z") "sh" "s" "h"
+     nil nil nil nil nil nil nil nil
+     nil "X"
+     ;; some of these are extra to Mozhi
+     ("t" "tt") "nt" "tth" "tthh" "nnj" "nng" "nn"
+     "nch" "nc" "nk" "cch" "cchh" "cc"
+     "B" ("C" "K" "q") "G" "J" "M" "P" "V" "x" "Z")
+    (;; misc -- 7
+     nil nil "H")))
+
 (defvar indian-kyoto-harvard-table
   '(;; for encode/decode
     (;; vowel
@@ -524,6 +550,10 @@ indian-mlm-itrans-v5-hash
   (indian-make-hash indian-mlm-base-table
                     indian-itrans-v5-table))
 
+(defvar indian-mlm-mozhi-hash
+  (indian-make-hash indian-mlm-base-table
+                    indian-mlm-mozhi-table))
+
 (defvar indian-tml-itrans-v5-hash
   (indian-make-hash indian-tml-base-table
                     indian-itrans-v5-table-for-tamil))
diff --git a/lisp/leim/quail/indian.el b/lisp/leim/quail/indian.el
index 2681eab0e5..100ae63f6a 100644
=2D-- a/lisp/leim/quail/indian.el
+++ b/lisp/leim/quail/indian.el
@@ -117,6 +117,7 @@ "\\''"
  indian-knd-itrans-v5-hash "kannada-itrans" "Kannada" "KndIT"
  "Kannada transliteration by ITRANS method.")
 
+;; ITRANS not applicable to Malayalam & could be removed eventually
 (if nil
     (quail-define-package "malayalam-itrans" "Malayalam" "MlmIT" t "Malay=
alam ITRANS"))
 (quail-define-indian-trans-package
@@ -358,24 +359,23 @@ inscript-mlm-keytable
   '(
     (;; VOWELS (18)
      (?D nil) (?E ?e) (?F ?f) (?R ?r) (?G ?g) (?T ?t)
-     (?+ ?=3D) ("F]" "f]") (?! ?@) (?S ?s) (?Z ?z) (?W ?w)
-     (?| ?\\) (?~ ?`) (?A ?a) (?Q ?q) ("+]" "=3D]") ("R]" "r]"))
+     (?=3D ?+) nil nil (?S ?s) (?Z ?z) (?W ?w)
+     nil (?~ ?`) (?A ?a) (?Q ?q))
     (;; CONSONANTS (42)
      ?k ?K ?i ?I ?U ;; GRUTTALS
      ?\; ?: ?p ?P ?} ;; PALATALS
      ?' ?\" ?\[ ?{ ?C ;; CEREBRALS
-     ?l ?L ?o ?O ?v ?V ;; DENTALS
+     ?l ?L ?o ?O ?v nil ;; DENTALS
      ?h ?H ?y ?Y ?c ;; LABIALS
-     ?/ ?j ?J ?n ?N "N]" ?b ;; SEMIVOWELS
+     ?/ ?j ?J ?n ?N ?B ?b ;; SEMIVOWELS
      ?M ?< ?m ?u ;; SIBILANTS
-     "k]" "K]" "i]" "p]" "[]" "{]" "H]" "/]" ;; NUKTAS
-     ?% ?&)
+     nil nil nil nil nil nil nil nil nil) ;; NUKTAS
     (;; Misc Symbols (7)
-     ?X ?x ?_ ">]" ?d "X]" ?>)
+     nil ?x ?_ nil ?d)
     (;; Digits
      ?0 ?1 ?2 ?3 ?4 ?5 ?6 ?7 ?8 ?9)
-    (;; Inscripts
-     ?# ?$ ?^ ?* ?\])))
+    (;; Chillus
+     "Cd" "Cd]" "vd" "vd]" "jd" "jd]" "nd" "nd]" "Nd" "Nd]")))
 
 (defvar inscript-tml-keytable
   '(
@@ -463,6 +463,9 @@ inscript-tml-keytable
  "malayalam-inscript" "Malayalam" "MlmIS"
  "Malayalam keyboard Inscript.")
 
+(quail-defrule "\\" ?=E2=80=8C)
+(quail-defrule "X" ?=E2=80=8B)
+
 (if nil
     (quail-define-package "tamil-inscript" "Tamil" "TmlIS" t "Tamil keybo=
ard Inscript"))
 (quail-define-inscript-package
@@ -571,4 +574,72 @@ inscript-tml-keytable
  ("?" ?\?)
  ("/" ?=E0=A7=8D))
 
+(defun indian-mlm-mozhi-update-translation (control-flag)
+  (let ((len (length quail-current-key)) chillu
+        (vowels '(?a ?e ?i ?o ?u ?A ?E ?I ?O ?U ?R)))
+    (cond ((numberp control-flag)
+           (progn (if (=3D control-flag 0)
+                      (setq quail-current-str quail-current-key)
+                    (cond (input-method-exit-on-first-char)
+                          ((and (memq (aref quail-current-key
+                                            (1- control-flag))
+                                      vowels)
+                                (setq chillu (cl-position
+                                              (aref quail-current-key
+                                                    control-flag)
+                                              '(?m ?N ?n ?r ?l ?L))))
+                           ;; conditions for putting chillu
+                           (and (or (and (=3D control-flag (1- len))
+                                         (not (setq control-flag nil)))
+                                    (and (=3D control-flag (- len 2))
+                                         (let ((temp (aref quail-current-key
+                                                           (1- len))))
+                                           ;; is it last char of word?
+                                           (not
+                                            (or (and (>=3D temp ?a) (<=3D temp ?z))
+                                                (and (>=3D temp ?A) (<=3D temp ?Z))
+                                                (eq temp ?~))))
+                                         (setq control-flag (1+ control-flag))))
+                                (setq quail-current-str ;; put chillu
+                                      (concat (if (not (stringp
+                                                        quail-current-str))
+                                                  (string quail-current-str)
+                                                quail-current-str)
+                                              (string
+                                               (nth chillu '(?=E0=B4=82 ?=E0=B5=BA ?=E0=B5=BB ?=E0=B5=BC ?=
=E0=B5=BD ?=E0=B5=BE)))))))))
+                  (and (not input-method-exit-on-first-char) control-flag
+                       (while (> len control-flag)
+                         (setq len (1- len))
+                         (setq unread-command-events
+                               (cons (aref quail-current-key len)
+                                     unread-command-events))))
+                  ))
+          ((null control-flag)
+           (unless quail-current-str
+             (setq quail-current-str quail-current-key)
+             ))
+          ((equal control-flag t)
+           (if (memq (aref quail-current-key (1- len)) ;; If vowel ending,
+                     vowels) ;; may have to put
+               (setq control-flag nil))))) ;; chillu. So don't
+  control-flag) ;; end translatio=
n
+
+(quail-define-package "malayalam-mozhi" "Malayalam" "MlmMI" t
+                      "Malayalam transliteration by Mozhi method."
+                      nil nil t nil nil nil t nil
+                      'indian-mlm-mozhi-update-translation)
+
+(maphash
+ (lambda (key val)
+   (quail-defrule key (if (=3D (length val) 1)
+                          (string-to-char val)
+                        (vector val))))
+ (cdr indian-mlm-mozhi-hash))
+
+(defun indian-mlm-mozhi-underscore (key len) (throw 'quail-tag nil))
+
+(quail-defrule "_" 'indian-mlm-mozhi-underscore)
+(quail-defrule "|" ?=E2=80=8C)
+(quail-defrule "||" ?=E2=80=8B)
+
 ;;; indian.el ends here
=2D-
2.20.1

--=-=-=--