* bug#55694: [PATCH] Add support for the Batak scripts
@ 2022-05-29 0:51 समीर सिंह Sameer Singh
  2022-05-29 7:13 ` Eli Zaretskii
  0 siblings, 1 reply; 10+ messages in thread
From: समीर सिंह Sameer Singh @ 2022-05-29 0:51 UTC (permalink / raw)
  To: 55694

[-- Attachment #1.1: Type: text/plain, Size: 278 bytes --]

This time the Batak scripts are added to Emacs.
Since the Batak scripts are actually a collection of five scripts (Toba, Karo, Pakpak, Mandailing, and Simalungun), I have provided 5 different language environments and input methods for them.

Please review the patch.

Thank you.

[-- Attachment #1.2: Type: text/html, Size: 379 bytes --]
[-- Attachment #2: 0001-Add-support-for-the-Batak-scripts.patch --]
[-- Type: text/x-patch, Size: 9966 bytes --]

From 6ed29ccbfd4e66c5126b8bfe1f12666e030d1629 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E0=A4=B8=E0=A4=AE=E0=A5=80=E0=A4=B0=20=E0=A4=B8=E0=A4=BF?= =?UTF-8?q?=E0=A4=82=E0=A4=B9=20Sameer=20Singh?= <lumarzeli30@gmail.com>
Date: Sat, 28 May 2022 06:46:55 +0530
Subject: [PATCH] Add support for the Batak scripts

* lisp/language/indonesian.el ("Batak Karo") ("Batak Toba") ("Batak Pakpak") ("Batak Mandailing") ("Batak Simalungun"): New language environments. Add composition rules for Batak. Add sample texts and input methods.
* lisp/international/fontset.el (script-representative-chars) (setup-default-fontset): Support Batak.
* lisp/leim/quail/indonesian.el ("batak-karo") ("batak-toba") ("batak-pakpak") ("batak-mandailing") ("batak-simalungun"): New input methods.
* etc/HELLO: Add a Batak greeting.
* etc/NEWS: Announce the new language environments and their input methods.
--- etc/HELLO | 1 + etc/NEWS | 5 + lisp/international/fontset.el | 2 + lisp/language/indonesian.el | 59 ++++++++++ lisp/leim/quail/indonesian.el | 199 ++++++++++++++++++++++++++++++++++ 5 files changed, 266 insertions(+) diff --git a/etc/HELLO b/etc/HELLO index 4ff066847d..c89f8943c7 100644 --- a/etc/HELLO +++ b/etc/HELLO @@ -28,6 +28,7 @@ Amharic (አማርኛ) ሠላም Arabic (العربيّة) السّلام عليكم Armenian (հայերեն) Բարև ձեզ Balinese (ᬅᬓ᭄ᬱᬭᬩᬮᬶ) ᬒᬁᬲ᭄ᬯᬲ᭄ᬢ᭄ᬬᬲ᭄ᬢᬸ +Batak (ᯘᯮᯒᯗ᯲ᯅᯗᯂ᯲) ᯂᯬᯒᯘ᯲ Belarusian (беларуская) Прывітанне Bengali (বাংলা) নমস্কার Brahmi (𑀩𑁆𑀭𑀸𑀳𑁆𑀫𑀻) 𑀦𑀫𑀲𑁆𑀢𑁂 diff --git a/etc/NEWS b/etc/NEWS index 97a04405f5..601f103b40 100644 --- a/etc/NEWS +++ b/etc/NEWS @@ -826,6 +826,11 @@ corresponding language environments are: **** Balinese script and language environment **** Javanese script and language environment **** Sundanese script and language environment +**** Batak Karo script and language environment +**** Batak Toba script and language environment +**** Batak Pakpak script and language environment +**** Batak Mandailing script and language environment +**** Batak Simalungun script and language environment --- *** The "Oriya" language environment was renamed to "Odia". diff --git a/lisp/international/fontset.el b/lisp/international/fontset.el index 00ee0cf475..bf4b9b578e 100644 --- a/lisp/international/fontset.el +++ b/lisp/international/fontset.el @@ -191,6 +191,7 @@ font-encoding-charset-alist (limbu #x1901 #x1920 #x1936) (balinese #x1B13 #x1B35 #x1B5E) (sundanese #x1B8A #x1BAB #x1CC4) + (batak #x1BC2 #x1BE7 #x1BFF) (tai-le #x1950) (tai-lue #x1980) (tai-tham #x1A20 #x1A55 #x1A61 #x1A80) @@ -762,6 +763,7 @@ setup-default-fontset limbu balinese sundanese + batak symbol braille yi diff --git a/lisp/language/indonesian.el b/lisp/language/indonesian.el index 4bdcd0a49c..aad8c534cb 100644 --- a/lisp/language/indonesian.el +++ b/lisp/language/indonesian.el @@ -54,6 +54,51 @@ (documentation . 
"\ Sundanese language and its script are supported in this language environment."))) +(set-language-info-alist + "Batak Karo" '((charset unicode) + (coding-system utf-8) + (coding-priority utf-8) + (input-method . "batak-karo") + (sample-text . "Batak Karo (ᯘᯬᯒᯗ᯳ᯆᯗᯂ᯳) ᯔᯧᯐᯬᯀᯱᯐᯬᯀᯱ") + (documentation . "\ +Batak Karo language and its script are supported in this language environment."))) + +(set-language-info-alist + "Batak Toba" '((charset unicode) + (coding-system utf-8) + (coding-priority utf-8) + (input-method . "batak-toba") + (sample-text . "Batak Toba (ᯘᯮᯮᯒᯖ᯲ᯅᯖᯂ᯲) ᯂᯬᯒᯘ᯲") + (documentation . "\ +Batak Toba language and its script are supported in this language environment."))) + +(set-language-info-alist + "Batak Pakpak" '((charset unicode) + (coding-system utf-8) + (coding-priority utf-8) + (input-method . "batak-pakpak") + (sample-text . "Batak Pakpak (ᯘᯮᯒᯗ᯲ᯅᯗᯂ᯲) ᯂᯬᯒᯘ᯲") + (documentation . "\ +Batak Pakpak language and its script are supported in this language environment."))) + +(set-language-info-alist + "Batak Mandailing" '((charset unicode) + (coding-system utf-8) + (coding-priority utf-8) + (input-method . "batak-mandailing") + (sample-text . "Batak Mandailing (ᯚᯮᯒᯖ᯲ᯅᯖᯄᯱ᯲) ᯄᯬᯒᯚ᯲") + (documentation . "\ +Batak Mandailing language and its script are supported in this language environment."))) + +(set-language-info-alist + "Batak Simalungun" '((charset unicode) + (coding-system utf-8) + (coding-priority utf-8) + (input-method . "batak-simalungun") + (sample-text . "Batak Simalungun (ᯙᯮᯮᯓᯖ᯳ᯅᯖᯃ᯳) ᯃᯬᯓᯙ᯲") + (documentation . "\ +Batak Simalungun language and its script are supported in this language environment."))) + ;; Balinese composition rules (let ((consonant "[\x1B13-\x1B33\x1B45-\x1B4B]") (independent-vowel "[\x1B05-\x1B12]") @@ -119,5 +164,19 @@ vowel "?" modifier-above "?" 
dependant-consonant "?") 1 'font-shape-gstring)))) +;; Batak composition rules +(let ((akshara "[\x1BC0-\x1BE5]") + (vowel "[\x1BE7-\x1BEF]") + (dependant-consonant "[\x1BF0\x1BF1]") + (modifier-above "\x1BE6") + (virama "[\x1BF2\x1BF3]")) + (set-char-table-range composition-function-table + '(#x1BE6 . #x1BF3) + (list (vector + ;; Akshara based syllables + (concat akshara virama "?" vowel "*" modifier-above + "?" dependant-consonant "?") + 1 'font-shape-gstring)))) + (provide 'indonesian) ;;; indonesian.el ends here diff --git a/lisp/leim/quail/indonesian.el b/lisp/leim/quail/indonesian.el index dd931e9879..3033a48e4f 100644 --- a/lisp/leim/quail/indonesian.el +++ b/lisp/leim/quail/indonesian.el @@ -377,5 +377,204 @@ ("`m" ?ᮿ) ("`M" ?ᮬ)) +(quail-define-package + "batak-karo" "Batak Karo" "ᯂᯒᯭ" nil "Batak Karo phonetic input method." + nil t t t t nil nil nil nil nil t) + +(quail-define-rules + ("q" ?᯼) + ("Q" ?᯽) + ("w" ?ᯋ) + ("e" ?ᯧ) + ("E" ?ᯩ) + ("r" ?ᯒ) + ("t" ?ᯗ) + ("y" ?ᯛ) + ("u" ?ᯮ) + ("U" ?ᯥ) + ("i" ?ᯫ) + ("I" ?ᯤ) + ("o" ?ᯭ) + ("p" ?ᯇ) + ("a" ?ᯀ) + ("s" ?ᯘ) + ("d" ?ᯑ) + ("f" ?᯲) + ("F" ?᯳) + ("g" ?ᯎ) + ("h" ?ᯱ) + ("j" ?ᯐ) + ("k" ?ᯂ) + ("l" ?ᯞ) + ("z" ?ᯝ) + ("Z" ?ᯰ) + ("c" ?ᯠ) + ("C" ?ᯡ) + ("v" ?᯾) + ("V" ?᯿) + ("b" ?ᯆ) + ("n" ?ᯉ) + ("N" ?ᯢ) + ("m" ?ᯔ) + ("M" ?ᯣ)) + +(quail-define-package + "batak-mandailing" "Batak Mandailing" "ᯔᯊᯑᯤᯞᯪᯰ" nil "Batak Mandailing phonetic input method." + nil t t t t nil nil nil nil nil t) + +(quail-define-rules + ("q" ?᯼) + ("Q" ?᯽) + ("w" ?ᯋ) + ("e" ?ᯧ) + ("E" ?ᯩ) + ("r" ?ᯒ) + ("t" ?ᯖ) + ("y" ?ᯛ) + ("u" ?ᯮ) + ("U" ?ᯥ) + ("i" ?ᯪ) + ("I" ?ᯤ) + ("o" ?ᯬ) + ("p" ?ᯇ) + ("a" ?ᯀ) + ("s" ?ᯚ) + ("d" ?ᯑ) + ("f" ?᯲) + ("F" ?᯳) + ("g" ?ᯎ) + ("h" ?ᯄ) + ("H" ?ᯱ) + ("j" ?ᯐ) + ("l" ?ᯞ) + ("z" ?ᯝ) + ("Z" ?ᯰ) + ("x" ?᯦) + ("v" ?᯾) + ("V" ?᯿) + ("b" ?ᯅ) + ("n" ?ᯊ) + ("N" ?ᯠ) + ("m" ?ᯔ)) + +(quail-define-package + "batak-pakpak" "Batak Pakpak" "ᯇᯂ᯲ᯇᯂ᯲" nil "Batak Pakpak phonetic input method." 
+ nil t t t t nil nil nil nil nil t) + +(quail-define-rules + ("q" ?᯼) + ("Q" ?᯽) + ("w" ?ᯍ) + ("e" ?ᯨ) + ("E" ?ᯩ) + ("r" ?ᯒ) + ("t" ?ᯗ) + ("y" ?ᯛ) + ("u" ?ᯮ) + ("U" ?ᯥ) + ("i" ?ᯪ) + ("I" ?ᯤ) + ("o" ?ᯬ) + ("p" ?ᯇ) + ("a" ?ᯀ) + ("s" ?ᯘ) + ("d" ?ᯑ) + ("f" ?᯲) + ("F" ?᯳) + ("g" ?ᯎ) + ("h" ?ᯱ) + ("j" ?ᯐ) + ("k" ?ᯂ) + ("l" ?ᯞ) + ("z" ?ᯝ) + ("Z" ?ᯰ) + ("c" ?ᯡ) + ("b" ?ᯅ) + ("v" ?᯾) + ("V" ?᯿) + ("n" ?ᯉ) + ("N" ?ᯠ) + ("m" ?ᯔ)) + +(quail-define-package + "batak-toba" "Batak Toba" "ᯖᯬᯅ" nil "Batak Toba phonetic input method." + nil t t t t nil nil nil nil nil t) + +(quail-define-rules + ("q" ?᯼) + ("Q" ?᯽) + ("w" ?ᯋ) + ("W" ?ᯍ) + ("e" ?ᯧ) + ("E" ?ᯩ) + ("r" ?ᯒ) + ("t" ?ᯖ) + ("T" ?ᯗ) + ("y" ?ᯛ) + ("u" ?ᯮ) + ("U" ?ᯥ) + ("i" ?ᯪ) + ("I" ?ᯤ) + ("o" ?ᯬ) + ("p" ?ᯇ) + ("a" ?ᯀ) + ("s" ?ᯘ) + ("d" ?ᯑ) + ("f" ?᯲) + ("F" ?᯳) + ("g" ?ᯎ) + ("h" ?ᯂ) + ("j" ?ᯐ) + ("l" ?ᯞ) + ("z" ?ᯝ) + ("Z" ?ᯰ) + ("c" ?ᯡ) + ("v" ?᯾) + ("V" ?᯿) + ("b" ?ᯅ) + ("n" ?ᯉ) + ("N" ?ᯠ) + ("m" ?ᯔ)) + +(quail-define-package + "batak-simalungun" "Batak Simalungun" "ᯙᯫᯕᯟᯮᯝᯮᯉ᯳" nil "Batak Simalungun phonetic input method." + nil t t t t nil nil nil nil nil t) + +(quail-define-rules + ("q" ?᯼) + ("Q" ?᯽) + ("w" ?ᯌ) + ("e" ?ᯧ) + ("E" ?ᯩ) + ("r" ?ᯓ) + ("t" ?ᯖ) + ("y" ?ᯜ) + ("u" ?ᯮ) + ("U" ?ᯥ) + ("i" ?ᯪ) + ("I" ?ᯤ) + ("o" ?ᯬ) + ("p" ?ᯈ) + ("a" ?ᯁ) + ("s" ?ᯙ) + ("S" ?ᯯ) + ("d" ?ᯑ) + ("f" ?᯲) + ("F" ?᯳) + ("g" ?ᯏ) + ("h" ?ᯃ) + ("H" ?ᯱ) + ("j" ?ᯐ) + ("l" ?ᯟ) + ("z" ?ᯝ) + ("Z" ?ᯰ) + ("c" ?ᯡ) + ("v" ?᯾) + ("V" ?᯿) + ("b" ?ᯅ) + ("n" ?ᯉ) + ("N" ?ᯠ) + ("m" ?ᯕ)) + (provide 'indonesian) ;;; indonesian.el ends here -- 2.36.1 ^ permalink raw reply related [flat|nested] 10+ messages in thread
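The five rule tables in the patch above share most of their keys, so the real differences can be listed mechanically instead of by eye. The following is a minimal sketch of such a comparison, not part of the patch: `my-quail-rule-diff`, `my-karo-excerpt` and `my-toba-excerpt` are hypothetical names, and the excerpts contain only a handful of rules, written as the code points that the characters in the "batak-karo" and "batak-toba" tables appear to correspond to.

```elisp
;; Sketch only: every `my-' name here is hypothetical, not part of the patch.
(defun my-quail-rule-diff (rules-a rules-b)
  "Return the keys that RULES-A and RULES-B translate differently.
Each argument is an alist of (KEY . CHARACTER).  A key that is
present in only one alist also counts as a difference."
  (let ((keys (delete-dups (append (mapcar #'car rules-a)
                                   (mapcar #'car rules-b))))
        (diff nil))
    (dolist (key keys (nreverse diff))
      ;; `assoc' compares the string keys with `equal'.
      (unless (eql (cdr (assoc key rules-a))
                   (cdr (assoc key rules-b)))
        (push key diff)))))

;; A few rules taken from the "batak-karo" and "batak-toba" tables,
;; written as code points (assumed from the Batak block layout).
(defconst my-karo-excerpt
  '(("r" . #x1BD2) ("t" . #x1BD7) ("i" . #x1BEB) ("b" . #x1BC6)))
(defconst my-toba-excerpt
  '(("r" . #x1BD2) ("t" . #x1BD6) ("T" . #x1BD7) ("i" . #x1BEA) ("b" . #x1BC5)))

;; Lists the keys on which the two excerpts disagree.
(my-quail-rule-diff my-karo-excerpt my-toba-excerpt)
;; => ("t" "i" "b" "T")
```

Keys that come back from such a comparison are the ones a merged input method has to distinguish, for example with a shifted key or a prefix key.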
* bug#55694: [PATCH] Add support for the Batak scripts
  2022-05-29 0:51 bug#55694: [PATCH] Add support for the Batak scripts समीर सिंह Sameer Singh
@ 2022-05-29 7:13 ` Eli Zaretskii
  2022-05-29 11:44 ` समीर सिंह Sameer Singh
  0 siblings, 1 reply; 10+ messages in thread
From: Eli Zaretskii @ 2022-05-29 7:13 UTC (permalink / raw)
  To: समीर सिंह Sameer Singh
  Cc: 55694

> From: समीर सिंह Sameer Singh
> <lumarzeli30@gmail.com>
> Date: Sun, 29 May 2022 06:21:33 +0530
>
> This time the Batak scripts are added to Emacs.
> Since the Batak scripts are actually a collection of five scripts:
> Toba, Karo, Pakpak, Mandailing, and Simalungun.

I think the above are _languages_, not scripts. They all used the Batak script for writing in the past, but they aren't scripts.

> I have provided 5 different language environments and input-methods for them.
>
> Please review the patch.

> --- a/etc/HELLO
> +++ b/etc/HELLO
> @@ -28,6 +28,7 @@ Amharic (አማርኛ) ሠላም
> Arabic (العربيّة) السّلام عليكم
> Armenian (հայերեն) Բարև ձեզ
> Balinese (ᬅᬓ᭄ᬱᬭᬩᬮᬶ) ᬒᬁᬲ᭄ᬯᬲ᭄ᬢ᭄ᬬᬲ᭄ᬢᬸ
> +Batak (ᯘᯮᯒᯗ᯲ᯅᯗᯂ᯲) ᯂᯬᯒᯘ᯲

Is this greeting common to all the languages using the Batak script? I don't think so. So perhaps we should have several greetings, one for each language?

> --- a/etc/NEWS
> +++ b/etc/NEWS
> @@ -826,6 +826,11 @@ corresponding language environments are:
> **** Balinese script and language environment
> **** Javanese script and language environment
> **** Sundanese script and language environment
> +**** Batak Karo script and language environment
> +**** Batak Toba script and language environment
> +**** Batak Pakpak script and language environment
> +**** Batak Mandailing script and language environment
> +**** Batak Simalungun script and language environment

See above: this should distinguish between the script name and the language names.
Something like **** Karo language using the Batak script and its language environment > +(set-language-info-alist > + "Batak Karo" '((charset unicode) > + (coding-system utf-8) > + (coding-priority utf-8) > + (input-method . "batak-karo") > + (sample-text . "Batak Karo (ᯘᯬᯒᯗ᯳ᯆᯗᯂ᯳) ᯔᯧᯐᯬᯀᯱᯐᯬᯀᯱ") > + (documentation . "\ > +Batak Karo language and its script are supported in this language environment."))) Likewise here. The doc string should say something like Karo language using the Batak script is supported in this language environment. > + > +(set-language-info-alist > + "Batak Toba" '((charset unicode) > + (coding-system utf-8) > + (coding-priority utf-8) > + (input-method . "batak-toba") > + (sample-text . "Batak Toba (ᯘᯮᯮᯒᯖ᯲ᯅᯖᯂ᯲) ᯂᯬᯒᯘ᯲") > + (documentation . "\ > +Batak Toba language and its script are supported in this language environment."))) > + > +(set-language-info-alist > + "Batak Pakpak" '((charset unicode) > + (coding-system utf-8) > + (coding-priority utf-8) > + (input-method . "batak-pakpak") > + (sample-text . "Batak Pakpak (ᯘᯮᯒᯗ᯲ᯅᯗᯂ᯲) ᯂᯬᯒᯘ᯲") > + (documentation . "\ > +Batak Pakpak language and its script are supported in this language environment."))) > + > +(set-language-info-alist > + "Batak Mandailing" '((charset unicode) > + (coding-system utf-8) > + (coding-priority utf-8) > + (input-method . "batak-mandailing") > + (sample-text . "Batak Mandailing (ᯚᯮᯒᯖ᯲ᯅᯖᯄᯱ᯲) ᯄᯬᯒᯚ᯲") > + (documentation . "\ > +Batak Mandailing language and its script are supported in this language environment."))) > + > +(set-language-info-alist > + "Batak Simalungun" '((charset unicode) > + (coding-system utf-8) > + (coding-priority utf-8) > + (input-method . "batak-simalungun") > + (sample-text . "Batak Simalungun (ᯙᯮᯮᯓᯖ᯳ᯅᯖᯃ᯳) ᯃᯬᯓᯙ᯲") > + (documentation . 
"\ > +Batak Simalungun language and its script are supported in this language environment."))) > + Btw, according to this article: https://en.wikipedia.org/wiki/Batak_languages there are 2 more languages that used Batak; why aren't they included? > +(quail-define-package > + "batak-karo" "Batak Karo" "ᯂᯒᯭ" nil "Batak Karo phonetic input method." > + nil t t t t nil nil nil nil nil t) > + > +(quail-define-rules The input methods look almost identical, with a few minor deviations. Are the differences real or are they mistakes? If they are mistakes, we can have just one input method for all the languages using Batak. And if the differences are real, can we still have only one input method, where the different variants of the same ASCII letter are selected by the user at typing time? It seems un-economical to have so many input methods that are almost identical. Thanks. ^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#55694: [PATCH] Add support for the Batak scripts 2022-05-29 7:13 ` Eli Zaretskii @ 2022-05-29 11:44 ` समीर सिंह Sameer Singh 2022-05-29 12:07 ` Eli Zaretskii 0 siblings, 1 reply; 10+ messages in thread From: समीर सिंह Sameer Singh @ 2022-05-29 11:44 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 55694 [-- Attachment #1: Type: text/plain, Size: 8372 bytes --] Thank you for the feedback! > From: समीर सिंह Sameer Singh > > <lumarzeli30@gmail.com> > > Date: Sun, 29 May 2022 06:21:33 +0530 > > > > This time the Batak scripts are added to Emacs. > > Since the Batak scripts are actually a collection of five scripts: > > Toba, Karo, Pakpak, Mandailing, and Simalungun. > > I think the above are _languages_, not scripts. They all used in the > past to use the Batak script for writing, but they aren't scripts. > The term "Batak" is not just the name for its script, it is a collective term used for the tribes in Sumatra. So adding "Batak" to the language name like the Batak Karo language or the Batak Pakpak language is also correct. Though not adding it also seems fine. For e.g. check the Indonesian Wikipedia page for these languages. https://id.wikipedia.org/wiki/Bahasa_Karo Bahasa Batak Karo atau bahasa Karo adalah sebuah bahasa Austronesia dalam rumpun bahasa Batak. (Tl. Karo Batak language _or_ Karo language is an Austronesian language in the Batak language family) https://id.wikipedia.org/wiki/Bahasa_Mandailing Here the infobox uses Bahasa Batak Mandailing instead of just Bahasa Mandailing (here Bahasa means language which comes from the Sanskrit word भाषा bhāṣā) Is this greeting common to all the languages using the Batak script? > I don't think so. So perhaps we should have several greetings, one > for each language? > This greeting (Horas) is common in all but one language (Batak Karo). Even though it is the same in Batak Toba, Pakpak, Mandailing and Simalungun, there are slight variations in the way it is written. 
But still they represent just one Unicode block, so writing one greeting may be sufficient to show that this script is supported.
Though we can also have multiple greetings, it is up to you.

> See above: this should distinguish between the script name and the
> language names. Something like
>
> **** Karo language using the Batak script and its language environment

See my first point.

> Btw, according to this article:
>
> https://en.wikipedia.org/wiki/Batak_languages
>
> there are 2 more languages that used Batak; why aren't they included?

Sadly I could not find any information about them; the Unicode proposals only talk about the five languages.
Check points 7.1 to 7.5 of this document: https://www.unicode.org/wg2/docs/n3320.pdf
There is no mention of Alas-Kluet or Angkola.

> The input methods look almost identical, with a few minor deviations.
> Are the differences real or are they mistakes? If they are mistakes,
> we can have just one input method for all the languages using Batak.
> And if the differences are real, can we still have only one input
> method, where the different variants of the same ASCII letter are
> selected by the user at typing time? It seems un-economical to have
> so many input methods that are almost identical.

Ok, I will merge them into one input method.

[-- Attachment #2: Type: text/html, Size: 11790 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#55694: [PATCH] Add support for the Batak scripts
  2022-05-29 11:44 ` समीर सिंह Sameer Singh
@ 2022-05-29 12:07 ` Eli Zaretskii
  2022-05-29 12:33 ` समीर सिंह Sameer Singh
  0 siblings, 1 reply; 10+ messages in thread
From: Eli Zaretskii @ 2022-05-29 12:07 UTC (permalink / raw)
  To: समीर सिंह Sameer Singh
  Cc: 55694

> From: समीर सिंह Sameer Singh <lumarzeli30@gmail.com>
> Date: Sun, 29 May 2022 17:14:57 +0530
> Cc: 55694@debbugs.gnu.org
>
> > This time the Batak scripts are added to Emacs.
> > Since the Batak scripts are actually a collection of five scripts:
> > Toba, Karo, Pakpak, Mandailing, and Simalungun.
>
> I think the above are _languages_, not scripts. They all used in the
> past to use the Batak script for writing, but they aren't scripts.
>
> The term "Batak" is not just the name for its script, it is a collective term used for the tribes in Sumatra.
> So adding "Batak" to the language name like the Batak Karo language or the Batak Pakpak language is also
> correct.
> Though not adding it also seems fine.

I'd prefer to go with the shorter names, as they definitely are being used, see the Wikipedia article I mentioned.

> Is this greeting common to all the languages using the Batak script?
> I don't think so. So perhaps we should have several greetings, one
> for each language?
>
> This greeting (Horas) is common in all but one language (Batak Karo).
> Even though it is the same in Batak Toba, Pakpak, Mandailing and Simalungun, there are slight variations in
> the way it is written.
> But still they represent just one Unicode block, writing one greeting may be sufficient to show that this script
> is supported.
> Though we can also have multiple greetings, it is up to you.

Let's have two greetings, by adding the one for Karo.

> The input methods look almost identical, with a few minor deviations.
> Are the differences real or are they mistakes? If they are mistakes,
> we can have just one input method for all the languages using Batak.
> And if the differences are real, can we still have only one input
> method, where the different variants of the same ASCII letter are
> selected by the user at typing time? It seems un-economical to have
> so many input methods that are almost identical.
>
> Ok, I will merge them into one input method.

Thanks!

^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#55694: [PATCH] Add support for the Batak scripts
  2022-05-29 12:07 ` Eli Zaretskii
@ 2022-05-29 12:33 ` समीर सिंह Sameer Singh
  2022-05-29 12:58 ` Eli Zaretskii
  0 siblings, 1 reply; 10+ messages in thread
From: समीर सिंह Sameer Singh @ 2022-05-29 12:33 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 55694

[-- Attachment #1: Type: text/plain, Size: 2434 bytes --]

Thank you, I will implement all these changes. Also, is there still any need for five different language environments and NEWS entries?
The only reason I separated them was the different input methods.

[-- Attachment #2: Type: text/html, Size: 3093 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#55694: [PATCH] Add support for the Batak scripts 2022-05-29 12:33 ` समीर सिंह Sameer Singh @ 2022-05-29 12:58 ` Eli Zaretskii 2022-05-29 13:34 ` समीर सिंह Sameer Singh 0 siblings, 1 reply; 10+ messages in thread From: Eli Zaretskii @ 2022-05-29 12:58 UTC (permalink / raw) To: समीर सिंह Sameer Singh Cc: 55694 > From: समीर सिंह Sameer Singh <lumarzeli30@gmail.com> > Date: Sun, 29 May 2022 18:03:19 +0530 > Cc: 55694@debbugs.gnu.org > > Thank you, I will implement all these changes, also now is there any need for five different language > environments and news entries? > The only reason I separated them because of the different input methods. I guess a single language environment is enough, but please mention the languages we support both in the doc string of the language environment and in the doc string of the input method. ^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#55694: [PATCH] Add support for the Batak scripts
  2022-05-29 12:58 ` Eli Zaretskii
@ 2022-05-29 13:34 ` समीर सिंह Sameer Singh
  2022-05-29 16:37 ` Eli Zaretskii
  0 siblings, 1 reply; 10+ messages in thread
From: समीर सिंह Sameer Singh @ 2022-05-29 13:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 55694

[-- Attachment #1.1: Type: text/plain, Size: 954 bytes --]

I have rewritten the patch, please review it.

I have also decided not to add the Karo greeting to etc/HELLO since it is no longer a language environment, but if you wish to do that, I have written it below for copy pasting.
Batak Karo (ᯘᯬᯒᯗ᯳ᯆᯗᯂ᯳) ᯔᯧᯐᯬᯀᯱᯐᯬᯀᯱ

[-- Attachment #1.2: Type: text/html, Size: 1418 bytes --]
[-- Attachment #2: 0001-Add-support-for-the-Batak-script-bug-55694.patch --]
[-- Type: text/x-patch, Size: 6735 bytes --]

From 1820e80c48004e27bbaa1bcd219965bedb2bc997 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E0=A4=B8=E0=A4=AE=E0=A5=80=E0=A4=B0=20=E0=A4=B8=E0=A4=BF?= =?UTF-8?q?=E0=A4=82=E0=A4=B9=20Sameer=20Singh?= <lumarzeli30@gmail.com>
Date: Sun, 29 May 2022 18:55:58 +0530
Subject: [PATCH] Add support for the Batak script (bug #55694)

* lisp/language/indonesian.el ("Batak"): New language environment. Add composition rules for Batak. Add sample text and input method.
* lisp/international/fontset.el (script-representative-chars) (setup-default-fontset): Support Batak.
* lisp/leim/quail/indonesian.el ("batak"): New input method. Rename TITLE of balinese, javanese and sundanese in ("quail-define-package"). * etc/HELLO: Add a Batak greeting. * etc/NEWS: Announce the new language environment and its input method. --- etc/HELLO | 1 + etc/NEWS | 1 + lisp/international/fontset.el | 2 + lisp/language/indonesian.el | 24 ++++++++++++ lisp/leim/quail/indonesian.el | 72 +++++++++++++++++++++++++++++++++-- 5 files changed, 97 insertions(+), 3 deletions(-) diff --git a/etc/HELLO b/etc/HELLO index 4ff066847d..c89f8943c7 100644 --- a/etc/HELLO +++ b/etc/HELLO @@ -28,6 +28,7 @@ Amharic (አማርኛ) ሠላም Arabic (العربيّة) السّلام عليكم Armenian (հայերեն) Բարև ձեզ Balinese (ᬅᬓ᭄ᬱᬭᬩᬮᬶ) ᬒᬁᬲ᭄ᬯᬲ᭄ᬢ᭄ᬬᬲ᭄ᬢᬸ +Batak (ᯘᯮᯒᯗ᯲ᯅᯗᯂ᯲) ᯂᯬᯒᯘ᯲ Belarusian (беларуская) Прывітанне Bengali (বাংলা) নমস্কার Brahmi (𑀩𑁆𑀭𑀸𑀳𑁆𑀫𑀻) 𑀦𑀫𑀲𑁆𑀢𑁂 diff --git a/etc/NEWS b/etc/NEWS index d8d22449f7..5987acdac9 100644 --- a/etc/NEWS +++ b/etc/NEWS @@ -836,6 +836,7 @@ corresponding language environments are: **** Balinese script and language environment **** Javanese script and language environment **** Sundanese script and language environment +**** Batak script and language environment --- *** The "Oriya" language environment was renamed to "Odia". diff --git a/lisp/international/fontset.el b/lisp/international/fontset.el index 00ee0cf475..bf4b9b578e 100644 --- a/lisp/international/fontset.el +++ b/lisp/international/fontset.el @@ -191,6 +191,7 @@ font-encoding-charset-alist (limbu #x1901 #x1920 #x1936) (balinese #x1B13 #x1B35 #x1B5E) (sundanese #x1B8A #x1BAB #x1CC4) + (batak #x1BC2 #x1BE7 #x1BFF) (tai-le #x1950) (tai-lue #x1980) (tai-tham #x1A20 #x1A55 #x1A61 #x1A80) @@ -762,6 +763,7 @@ setup-default-fontset limbu balinese sundanese + batak symbol braille yi diff --git a/lisp/language/indonesian.el b/lisp/language/indonesian.el index 4bdcd0a49c..319ec48158 100644 --- a/lisp/language/indonesian.el +++ b/lisp/language/indonesian.el @@ -54,6 +54,16 @@ (documentation . 
"\ Sundanese language and its script are supported in this language environment."))) +(set-language-info-alist + "Batak" '((charset unicode) + (coding-system utf-8) + (coding-priority utf-8) + (input-method . "batak") + (sample-text . "Batak (ᯘᯮᯒᯗ᯲ᯅᯗᯂ᯲) ᯂᯬᯒᯘ᯲") + (documentation . "\ +Such languages using the Batak script such as Karo, Toba, Pakpak, Mandailing +and Simalungun are supported in this language environment."))) + ;; Balinese composition rules (let ((consonant "[\x1B13-\x1B33\x1B45-\x1B4B]") (independent-vowel "[\x1B05-\x1B12]") @@ -119,5 +129,19 @@ vowel "?" modifier-above "?" dependant-consonant "?") 1 'font-shape-gstring)))) +;; Batak composition rules +(let ((akshara "[\x1BC0-\x1BE5]") + (vowel "[\x1BE7-\x1BEF]") + (dependant-consonant "[\x1BF0\x1BF1]") + (modifier-above "\x1BE6") + (virama "[\x1BF2\x1BF3]")) + (set-char-table-range composition-function-table + '(#x1BE6 . #x1BF3) + (list (vector + ;; Akshara based syllables + (concat akshara virama "?" vowel "*" modifier-above + "?" dependant-consonant "?") + 1 'font-shape-gstring)))) + (provide 'indonesian) ;;; indonesian.el ends here diff --git a/lisp/leim/quail/indonesian.el b/lisp/leim/quail/indonesian.el index 3a0654db90..fd232c4f71 100644 --- a/lisp/leim/quail/indonesian.el +++ b/lisp/leim/quail/indonesian.el @@ -32,7 +32,7 @@ ;; Javanese. (quail-define-package - "balinese" "Balinese" "ᬅ" t "Balinese phonetic input method. + "balinese" "Balinese" "ᬩ" t "Balinese phonetic input method. `\\=`' is used to switch levels instead of Alt-Gr. " nil t t t t nil nil nil nil nil t) @@ -174,7 +174,7 @@ ("`M" ?ᬀ)) (quail-define-package - "javanese" "Javanese" "ꦄ" t "Javanese phonetic input method. + "javanese" "Javanese" "ꦗ" t "Javanese phonetic input method. `\\=`' is used to switch levels instead of Alt-Gr. " nil t t t t nil nil nil nil nil t) @@ -287,7 +287,7 @@ ("`m" ?ꦀ)) (quail-define-package - "sundanese" "Sundanese" "ᮃ" t "Sundanese phonetic input method. 
+ "sundanese" "Sundanese" "ᮞᮥ" t "Sundanese phonetic input method. `\\=`' is used to switch levels instead of Alt-Gr. " nil t t t t nil nil nil nil nil t) @@ -377,5 +377,71 @@ ("`m" ?ᮿ) ("`M" ?ᮬ)) +(quail-define-package + "batak" "Batak" "ᯅ" t "Batak phonetic input method, + used by languages such as Karo, Toba, Pakpak, Mandailing + and Simalungun. + + `\\=`' is used to switch levels instead of Alt-Gr. +" nil t t t t nil nil nil nil nil t) + +(quail-define-rules + ("q" ?᯼) + ("Q" ?᯽) + ("w" ?ᯋ) + ("W" ?ᯌ) + ("`w" ?ᯍ) + ("e" ?ᯧ) + ("E" ?ᯨ) + ("`e" ?ᯩ) + ("r" ?ᯒ) + ("R" ?ᯓ) + ("t" ?ᯖ) + ("T" ?ᯗ) + ("y" ?ᯛ) + ("Y" ?ᯜ) + ("u" ?ᯮ) + ("U" ?ᯥ) + ("`u" ?ᯯ) + ("i" ?ᯪ) + ("I" ?ᯫ) + ("`i" ?ᯤ) + ("o" ?ᯬ) + ("O" ?ᯭ) + ("p" ?ᯇ) + ("P" ?ᯈ) + ("a" ?ᯀ) + ("A" ?ᯁ) + ("s" ?ᯘ) + ("S" ?ᯙ) + ("`s" ?ᯚ) + ("d" ?ᯑ) + ("f" ?᯲) + ("F" ?᯳) + ("g" ?ᯎ) + ("G" ?ᯏ) + ("h" ?ᯂ) + ("H" ?ᯃ) + ("`h" ?ᯄ) + ("`H" ?ᯱ) + ("j" ?ᯐ) + ("k" ?᯦) + ("l" ?ᯞ) + ("L" ?ᯟ) + ("z" ?ᯝ) + ("Z" ?ᯰ) + ("x" ?ᯠ) + ("c" ?ᯡ) + ("v" ?᯾) + ("V" ?᯿) + ("b" ?ᯅ) + ("B" ?ᯆ) + ("n" ?ᯉ) + ("N" ?ᯊ) + ("`n" ?ᯢ) + ("m" ?ᯔ) + ("M" ?ᯕ) + ("`m" ?ᯣ)) + (provide 'indonesian) ;;; indonesian.el ends here -- 2.36.1 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* bug#55694: [PATCH] Add support for the Batak scripts
From: Eli Zaretskii @ 2022-05-29 16:37 UTC
To: समीर सिंह Sameer Singh
Cc: 55694-done

> From: समीर सिंह Sameer Singh <lumarzeli30@gmail.com>
> Date: Sun, 29 May 2022 19:04:38 +0530
> Cc: 55694-done@debbugs.gnu.org
>
> I have rewritten the patch, please review it.

Thanks, installed.

> I have also decided not to add the Karo greeting to etc/HELLO, since it is no longer a language environment,
> but if you wish to do that, I have written it below for copy-pasting.
> Batak Karo (ᯘᯬᯒᯗ᯳ᯆᯗᯂ᯳) ᯔᯧᯐᯬᯀᯱᯐᯬᯀᯱ

Thanks. I didn't mean a separate entry, I meant to add one more
greeting. I've now done that, please see my followup commit. If
something is wrong with how I added it, please tell.
* bug#55694: [PATCH] Add support for the Batak scripts
From: समीर सिंह Sameer Singh @ 2022-05-29 16:43 UTC
To: Eli Zaretskii
Cc: 55694-done

> Thanks. I didn't mean a separate entry, I meant to add one
> more greeting.

Sorry, that was my mistake; I should have realised it sooner.

> I've now done that, please see my followup commit. If
> something is wrong with how I added it, please tell.

Maybe you could use a forward slash instead of a comma to separate the
greetings, because we already use it in other places where multiple
greetings are written. Other than that, it's all good.
Thank you for adding the patch.

On Sun, 29 May 2022, 10:07 pm, Eli Zaretskii <eliz@gnu.org> wrote:

> > From: समीर सिंह Sameer Singh <lumarzeli30@gmail.com>
> > Date: Sun, 29 May 2022 19:04:38 +0530
> > Cc: 55694-done@debbugs.gnu.org
> >
> > I have rewritten the patch, please review it.
>
> Thanks, installed.
>
> > I have also decided not to add the Karo greeting to etc/HELLO, since it
> > is no longer a language environment,
> > but if you wish to do that, I have written it below for copy-pasting.
> > Batak Karo (ᯘᯬᯒᯗ᯳ᯆᯗᯂ᯳) ᯔᯧᯐᯬᯀᯱᯐᯬᯀᯱ
>
> Thanks. I didn't mean a separate entry, I meant to add one more
> greeting. I've now done that, please see my followup commit. If
> something is wrong with how I added it, please tell.
* bug#55694: [PATCH] Add support for the Batak scripts
From: Eli Zaretskii @ 2022-05-29 17:13 UTC
To: समीर सिंह Sameer Singh
Cc: 55694-done

> From: समीर सिंह Sameer Singh <lumarzeli30@gmail.com>
> Date: Sun, 29 May 2022 22:13:22 +0530
> Cc: 55694-done@debbugs.gnu.org
>
> >I've now done that, please see my followup commit. If
> >something is wrong with how I added it, please tell.
>
> Maybe you could use a forward slash instead of a comma to separate the greetings, because we already
> use it in other places where multiple greetings are written. Other than that, it's all good.

Done, thanks.