unofficial mirror of bug-gnu-emacs@gnu.org 
From: "समीर सिंह Sameer Singh" <lumarzeli30@gmail.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: 55694@debbugs.gnu.org
Subject: bug#55694: [PATCH] Add support for the Batak scripts
Date: Sun, 29 May 2022 17:14:57 +0530
Message-ID: <CAOR1sLwiGrC1Dm5y52ZhKuNCNXiH6fYc1W_rkJq37dYuitzetw@mail.gmail.com>
In-Reply-To: <835ylolsk0.fsf@gnu.org>

[-- Attachment #1: Type: text/plain, Size: 8372 bytes --]

Thank you for the feedback!

> > From: समीर सिंह Sameer Singh
> >  <lumarzeli30@gmail.com>
> > Date: Sun, 29 May 2022 06:21:33 +0530
> >
> > This time the Batak scripts are added to Emacs.
> > Since the Batak scripts are actually a collection of five scripts:
> > Toba, Karo, Pakpak, Mandailing, and Simalungun.
>
> I think the above are _languages_, not scripts.  They all used in the
> past to use the Batak script for writing, but they aren't scripts.
>

The term "Batak" is not just the name of the script; it is also a collective
term for the tribes in Sumatra.
So adding "Batak" to the language name, as in the Batak Karo language or the
Batak Pakpak language, is also correct.
Omitting it also seems fine, though.

For example, check the Indonesian Wikipedia pages for these languages.

https://id.wikipedia.org/wiki/Bahasa_Karo
Bahasa Batak Karo atau bahasa Karo adalah sebuah bahasa Austronesia dalam
rumpun bahasa Batak.
(Translation: The Karo Batak language _or_ Karo language is an Austronesian
language in the Batak language family.)

https://id.wikipedia.org/wiki/Bahasa_Mandailing
Here the infobox uses "Bahasa Batak Mandailing" instead of just "Bahasa
Mandailing" ("bahasa" means "language" and comes from the Sanskrit word
भाषा bhāṣā).

> Is this greeting common to all the languages using the Batak script?
> I don't think so.  So perhaps we should have several greetings, one
> for each language?
>

This greeting (Horas) is common to all but one language (Batak Karo).
Even though it is the same in Batak Toba, Pakpak, Mandailing and
Simalungun, there are slight variations in the way it is written.
But since they still represent just one Unicode block, writing one greeting
may be sufficient to show that this script is supported.
We can also have multiple greetings, though; it is up to you.

> See above: this should distinguish between the script name and the
> language names.  Something like
>
>   **** Karo language using the Batak script and its language environment
>

See my first point above.

> Btw, according to this article:
>
>   https://en.wikipedia.org/wiki/Batak_languages
>
> there are 2 more languages that used Batak; why aren't they included?
>

Sadly, I could not find any information about them; the Unicode proposals
only talk about the five languages.
See points 7.1 to 7.5 of this document:
https://www.unicode.org/wg2/docs/n3320.pdf
There is no mention of Alas-Kluet or Angkola.

> The input methods look almost identical, with a few minor deviations.
> Are the differences real or are they mistakes?  If they are mistakes,
> we can have just one input method for all the languages using Batak.
> And if the differences are real, can we still have only one input
> method, where the different variants of the same ASCII letter are
> selected by the user at typing time?  It seems un-economical to have
> so many input methods that are almost identical.
>

OK, I will merge them into one input method.


On Sun, May 29, 2022 at 12:43 PM Eli Zaretskii <eliz@gnu.org> wrote:

> > From: समीर सिंह Sameer Singh
> >  <lumarzeli30@gmail.com>
> > Date: Sun, 29 May 2022 06:21:33 +0530
> >
> > This time the Batak scripts are added to Emacs.
> > Since the Batak scripts are actually a collection of five scripts:
> > Toba, Karo, Pakpak, Mandailing, and Simalungun.
>
> I think the above are _languages_, not scripts.  They all used in the
> past to use the Batak script for writing, but they aren't scripts.
>
> > I have provided 5 different language environments and input-methods for them.
> >
> > Please review the patch.
> > --- a/etc/HELLO
> > +++ b/etc/HELLO
> > @@ -28,6 +28,7 @@ Amharic (አማርኛ)      ሠላም
> >  Arabic (العربيّة)    السّلام عليكم
> >  Armenian (հայերեն)   Բարև ձեզ
> >  Balinese (ᬅᬓ᭄ᬱᬭᬩᬮᬶ)  ᬒᬁᬲ᭄ᬯᬲ᭄ᬢ᭄ᬬᬲ᭄ᬢᬸ
> > +Batak (ᯘᯮᯒᯗ᯲ᯅᯗᯂ᯲)    ᯂᯬᯒᯘ᯲
>
> Is this greeting common to all the languages using the Batak script?
> I don't think so.  So perhaps we should have several greetings, one
> for each language?
>
> > --- a/etc/NEWS
> > +++ b/etc/NEWS
> > @@ -826,6 +826,11 @@ corresponding language environments are:
> >  **** Balinese script and language environment
> >  **** Javanese script and language environment
> >  **** Sundanese script and language environment
> > +**** Batak Karo script and language environment
> > +**** Batak Toba script and language environment
> > +**** Batak Pakpak script and language environment
> > +**** Batak Mandailing script and language environment
> > +**** Batak Simalungun script and language environment
>
> See above: this should distinguish between the script name and the
> language names.  Something like
>
>   **** Karo language using the Batak script and its language environment
>
> > +(set-language-info-alist
> > + "Batak Karo" '((charset unicode)
> > +                (coding-system utf-8)
> > +                (coding-priority utf-8)
> > +                (input-method . "batak-karo")
> > +                (sample-text . "Batak Karo (ᯘᯬᯒᯗ᯳ᯆᯗᯂ᯳)    ᯔᯧᯐᯬᯀᯱᯐᯬᯀᯱ")
> > +                (documentation . "\
> > +Batak Karo language and its script are supported in this language environment.")))
>
> Likewise here.  The doc string should say something like
>
>   Karo language using the Batak script is supported in this language
>   environment.
>
> > +
> > +(set-language-info-alist
> > + "Batak Toba" '((charset unicode)
> > +                (coding-system utf-8)
> > +                (coding-priority utf-8)
> > +                (input-method . "batak-toba")
> > +                (sample-text . "Batak Toba (ᯘᯮᯮᯒᯖ᯲ᯅᯖᯂ᯲)    ᯂᯬᯒᯘ᯲")
> > +                (documentation . "\
> > +Batak Toba language and its script are supported in this language environment.")))
> > +
> > +(set-language-info-alist
> > + "Batak Pakpak" '((charset unicode)
> > +                  (coding-system utf-8)
> > +                  (coding-priority utf-8)
> > +                  (input-method . "batak-pakpak")
> > +                  (sample-text . "Batak Pakpak (ᯘᯮᯒᯗ᯲ᯅᯗᯂ᯲)    ᯂᯬᯒᯘ᯲")
> > +                  (documentation . "\
> > +Batak Pakpak language and its script are supported in this language environment.")))
> > +
> > +(set-language-info-alist
> > + "Batak Mandailing" '((charset unicode)
> > +                      (coding-system utf-8)
> > +                      (coding-priority utf-8)
> > +                      (input-method . "batak-mandailing")
> > +                      (sample-text . "Batak Mandailing (ᯚᯮᯒᯖ᯲ᯅᯖᯄᯱ᯲)    ᯄᯬᯒᯚ᯲")
> > +                      (documentation . "\
> > +Batak Mandailing language and its script are supported in this language environment.")))
> > +
> > +(set-language-info-alist
> > + "Batak Simalungun" '((charset unicode)
> > +                      (coding-system utf-8)
> > +                      (coding-priority utf-8)
> > +                      (input-method . "batak-simalungun")
> > +                      (sample-text . "Batak Simalungun (ᯙᯮᯮᯓᯖ᯳ᯅᯖᯃ᯳)    ᯃᯬᯓᯙ᯲")
> > +                      (documentation . "\
> > +Batak Simalungun language and its script are supported in this language environment.")))
> > +
>
> Btw, according to this article:
>
>   https://en.wikipedia.org/wiki/Batak_languages
>
> there are 2 more languages that used Batak; why aren't they included?
>
> > +(quail-define-package
> > + "batak-karo" "Batak Karo" "ᯂᯒᯭ" nil "Batak Karo phonetic input method."
> > + nil t t t t nil nil nil nil nil t)
> > +
> > +(quail-define-rules
>
> The input methods look almost identical, with a few minor deviations.
> Are the differences real or are they mistakes?  If they are mistakes,
> we can have just one input method for all the languages using Batak.
> And if the differences are real, can we still have only one input
> method, where the different variants of the same ASCII letter are
> selected by the user at typing time?  It seems un-economical to have
> so many input methods that are almost identical.
>
> Thanks.
>
