bug#58159: [PATCH] Add support for the Wancho script

unofficial mirror of bug-gnu-emacs@gnu.org 

* bug#58159: [PATCH] Add support for the Wancho script
@ 2022-09-29 11:07 समीर सिंह Sameer Singh
  2022-09-29 11:09 ` समीर सिंह Sameer Singh
  0 siblings, 1 reply; 17+ messages in thread
From: समीर सिंह Sameer Singh @ 2022-09-29 11:07 UTC (permalink / raw)
  To: 58159

[-- Attachment #1: Type: text/plain, Size: 618 bytes --]

This patch adds support for the Wancho script to Emacs.

Also, can we add something like this to etc/HELLO:
"Some of these greetings or the script name may be wrong or misspelled, so
if you know the script, please help by correcting them."?

For many of these languages and scripts it is difficult to find greetings.
When greetings are available online, they are usually written in the Roman
script, so I often have to convert them into the native script myself.
These greetings may therefore be misspelled or wrong altogether, and adding
a line like the one above may help.

Thanks


* bug#58159: [PATCH] Add support for the Wancho script
  2022-09-29 11:07 bug#58159: [PATCH] Add support for the Wancho script समीर सिंह Sameer Singh
@ 2022-09-29 11:09 ` समीर सिंह Sameer Singh
  2022-09-29 13:15   ` Eli Zaretskii
                     ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: समीर सिंह Sameer Singh @ 2022-09-29 11:09 UTC (permalink / raw)
  To: 58159


[-- Attachment #1.1: Type: text/plain, Size: 767 bytes --]


[-- Attachment #2: 0001-Add-support-for-the-Wancho-script-bug-58159.patch --]
[-- Type: text/x-patch, Size: 4733 bytes --]

From aff36cedc83384647be1bbea4f076c8738c0326c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E0=A4=B8=E0=A4=AE=E0=A5=80=E0=A4=B0=20=E0=A4=B8=E0=A4=BF?=
 =?UTF-8?q?=E0=A4=82=E0=A4=B9=20Sameer=20Singh?= <lumarzeli30@gmail.com>
Date: Thu, 29 Sep 2022 16:33:10 +0530
Subject: [PATCH] Add support for the Wancho script (bug#58159)

* lisp/language/indian.el ("Wancho"): New language environment.
Add sample text and input method.
* lisp/international/fontset.el (script-representative-chars)
(setup-default-fontset): Support Wancho.
* lisp/leim/quail/indian.el ("wancho"): New input method.

* etc/HELLO: Add a Wancho greeting.
* etc/NEWS: Announce the new language environment.
---
 etc/HELLO                     |  1 +
 etc/NEWS                      |  1 +
 lisp/international/fontset.el |  3 +-
 lisp/language/indian.el       | 11 +++++
 lisp/leim/quail/indian.el     | 77 +++++++++++++++++++++++++++++++++++
 5 files changed, 92 insertions(+), 1 deletion(-)

diff --git a/etc/HELLO b/etc/HELLO
index 7e0e847521..1899b087e0 100644
--- a/etc/HELLO
+++ b/etc/HELLO
@@ -116,6 +116,7 @@ Turkish (Türkçe)	Merhaba
 Ukrainian (українська)	Вітаю
 Vietnamese (tiếng Việt)	Chào bạn
 
+Wancho (𞋒𞋀𞋉𞋃𞋕)    	𞋂𞋈𞋛
 
 
 <x-charset><param>japanese-jisx0208</param>Japanese (日本語)	こんにちは</x-charset> <x-charset><param>katakana-jisx0201</param>/ ｺﾝﾆﾁﾊ
diff --git a/etc/NEWS b/etc/NEWS
index 97eac058f1..4bab95da51 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -1261,6 +1261,7 @@ Lepcha script and language environment
 Meetei Mayek script and language environment
 Adlam script and language environment
 Mende Kikakui script and language environment
+Wancho script and language environment
 
 ---
 *** The "Oriya" language environment was renamed to "Odia".
diff --git a/lisp/international/fontset.el b/lisp/international/fontset.el
index 0028c454af..ec505a0903 100644
--- a/lisp/international/fontset.el
+++ b/lisp/international/fontset.el
@@ -292,7 +292,7 @@ font-encoding-charset-alist
 	(counting-rod-numeral #x1D360)
 	(nyiakeng-puachue-hmong #x1e100)
 	(toto #x1E290)
-	(wancho #x1e2c0)
+	(wancho #x1E2C0 #x1E2E8 #x1E2EF)
         (nag-mundari #x1E4D0 #x1E4EB #x1E4F0)
 	(mende-kikakui #x1E810 #x1E8A6)
 	(adlam #x1E900 #x1E943)
@@ -832,6 +832,7 @@ setup-default-fontset
 		    tai-xuan-jing-symbol
 		    counting-rod-numeral
                     toto
+                    wancho
                     nag-mundari
                     mende-kikakui
 		    adlam
diff --git a/lisp/language/indian.el b/lisp/language/indian.el
index 81b7cbd99b..bc8f532857 100644
--- a/lisp/language/indian.el
+++ b/lisp/language/indian.el
@@ -266,6 +266,17 @@ 'devanagari
 language environment."))
  '("Indian"))
 
+(set-language-info-alist
+ "Wancho" '((charset unicode)
+            (coding-system utf-8)
+            (coding-priority utf-8)
+            (input-method . "wancho")
+            (sample-text . "Wancho (𞋒𞋀𞋉𞋃𞋕)	𞋂𞋈𞋛")
+            (documentation . "\
+Wancho language and its script are supported in this language
+environment."))
+ '("Indian"))
+
 ;; Replace mnemonic characters in REGEXP according to TABLE.  TABLE is
 ;; an alist of (MNEMONIC-STRING . REPLACEMENT-STRING).
 
diff --git a/lisp/leim/quail/indian.el b/lisp/leim/quail/indian.el
index 431d8369c1..443488c18d 100644
--- a/lisp/leim/quail/indian.el
+++ b/lisp/leim/quail/indian.el
@@ -2134,5 +2134,82 @@ "||"
  ("`m" ?ꫲ)
  ("`?" ?꫱))
 
+(quail-define-package
+ "wancho" "Wancho" "𞋒" t "Wancho phonetic input method.
+
+ `\\=`' is used to switch levels instead of Alt-Gr."
+ nil t t t t nil nil nil nil nil t)
+
+(quail-define-rules
+ ("``" ?𞋿)
+ ("1"  ?𞋱)
+ ("`1" ?1)
+ ("2"  ?𞋲)
+ ("`2" ?2)
+ ("3"  ?𞋳)
+ ("`3" ?3)
+ ("4"  ?𞋴)
+ ("`4" ?4)
+ ("5"  ?𞋵)
+ ("`5" ?5)
+ ("6"  ?𞋶)
+ ("`6" ?6)
+ ("7"  ?𞋷)
+ ("`7" ?7)
+ ("8"  ?𞋸)
+ ("`8" ?8)
+ ("9"  ?𞋹)
+ ("`9" ?9)
+ ("0"  ?𞋰)
+ ("`0" ?0)
+ ("q"  ?𞋠)
+ ("Q"  ?𞋡)
+ ("w"  ?𞋒)
+ ("e"  ?𞋛)
+ ("E"  ?𞋧)
+ ("r"  ?𞋗)
+ ("t"  ?𞋋)
+ ("T"  ?𞋌)
+ ("y"  ?𞋆)
+ ("Y"  ?𞋫)
+ ("u"  ?𞋞)
+ ("U"  ?𞋪)
+ ("i"  ?𞋜)
+ ("I"  ?𞋥)
+ ("o"  ?𞋕)
+ ("O"  ?𞋖)
+ ("`o" ?𞋢)
+ ("`O" ?𞋦)
+ ("p"  ?𞋊)
+ ("P"  ?𞋇)
+ ("a"  ?𞋁)
+ ("A"  ?𞋀)
+ ("`a" ?𞋤)
+ ("`A" ?𞋣)
+ ("s"  ?𞋎)
+ ("S"  ?𞋏)
+ ("d"  ?𞋄)
+ ("f"  ?𞋍)
+ ("g"  ?𞋅)
+ ("h"  ?𞋚)
+ ("j"  ?𞋐)
+ ("k"  ?𞋔)
+ ("K"  ?𞋙)
+ ("l"  ?𞋈)
+ ("L"  ?𞋟)
+ ("z"  ?𞋑)
+ ("x"  ?𞋩)
+ ("X"  ?𞋝)
+ ("c"  ?𞋃)
+ ("C"  ?𞋬)
+ ("v"  ?𞋓)
+ ("V"  ?𞋭)
+ ("b"  ?𞋂)
+ ("B"  ?𞋮)
+ ("n"  ?𞋉)
+ ("N"  ?𞋯)
+ ("m"  ?𞋘)
+ ("M"  ?𞋨))
+
 (provide 'indian)
 ;;; indian.el ends here
-- 
2.37.3



* bug#58159: [PATCH] Add support for the Wancho script
  2022-09-29 11:09 ` समीर सिंह Sameer Singh
@ 2022-09-29 13:15   ` Eli Zaretskii
  2022-09-29 13:21     ` समीर सिंह Sameer Singh
  2022-09-29 14:27   ` Robert Pluim
  2022-10-01  1:58   ` Richard Stallman
  2 siblings, 1 reply; 17+ messages in thread
From: Eli Zaretskii @ 2022-09-29 13:15 UTC (permalink / raw)
  To: समीर सिंह Sameer Singh
  Cc: 58159-done

> From: समीर सिंह Sameer Singh
>  <lumarzeli30@gmail.com>
> Date: Thu, 29 Sep 2022 16:39:27 +0530
> 
>  Also can we add something like this to etc/HELLO:
>  "Some of these greetings or the script name may be wrong or misspelled so if you know the script,
>  please help by correcting them."?

We don't need to have a greeting for every language environment we
support.  So if we aren't sure how a greeting is written, we could
just omit it.

>  For many of these languages and scripts it is difficult to find greetings.  When greetings are
>  available online, they are usually written in the Roman script, so I often have to convert them
>  into the native script myself.  These greetings may therefore be misspelled or wrong altogether,
>  and adding a line like the one above may help.

I installed the patch, thanks.






* bug#58159: [PATCH] Add support for the Wancho script
  2022-09-29 13:15   ` Eli Zaretskii
@ 2022-09-29 13:21     ` समीर सिंह Sameer Singh
  0 siblings, 0 replies; 17+ messages in thread
From: समीर सिंह Sameer Singh @ 2022-09-29 13:21 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 58159-done

[-- Attachment #1: Type: text/plain, Size: 1238 bytes --]

> We don't need to have a greeting for every language environment we
> support.  So if we aren't sure how a greeting is written, we could
> just omit it.

Oh, OK.

> I installed the patch, thanks.

Great!



* bug#58159: [PATCH] Add support for the Wancho script
  2022-09-29 11:09 ` समीर सिंह Sameer Singh
  2022-09-29 13:15   ` Eli Zaretskii
@ 2022-09-29 14:27   ` Robert Pluim
  2022-09-29 15:19     ` समीर सिंह Sameer Singh
  2022-10-01  1:58   ` Richard Stallman
  2 siblings, 1 reply; 17+ messages in thread
From: Robert Pluim @ 2022-09-29 14:27 UTC (permalink / raw)
  To: समीर सिंह Sameer Singh
  Cc: 58159

>>>>> On Thu, 29 Sep 2022 16:39:27 +0530, समीर सिंह Sameer Singh <lumarzeli30@gmail.com> said:
    समीर> @@ -116,6 +116,7 @@ Turkish (Türkçe)	Merhaba
    समीर>  Ukrainian (українська)	Вітаю
    समीर>  Vietnamese (tiếng Việt)	Chào bạn
 
    समीर> +Wancho (𞋒𞋀𞋉𞋃𞋕)    	𞋂𞋈𞋛

Any reason for the newline between Vietnamese and Wancho?

    समीर>  	(toto #x1E290)

TIL thereʼs a script called 'toto', which is the French equivalent of
'foo' :-)

Thanks

Robert
-- 


* bug#58159: [PATCH] Add support for the Wancho script
  2022-09-29 14:27   ` Robert Pluim
@ 2022-09-29 15:19     ` समीर सिंह Sameer Singh
  2022-09-29 15:41       ` Robert Pluim
  0 siblings, 1 reply; 17+ messages in thread
From: समीर सिंह Sameer Singh @ 2022-09-29 15:19 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 58159

[-- Attachment #1: Type: text/plain, Size: 1004 bytes --]

> Any reason for the newline between Vietnamese and Wancho?

This was not intentional.  enriched-mode automatically adds a newline; most
of the time I remove it, but this time it slipped past me.
See: https://mail.gnu.org/archive/html/bug-gnu-emacs/2022-05/msg00581.html


* bug#58159: [PATCH] Add support for the Wancho script
  2022-09-29 15:19     ` समीर सिंह Sameer Singh
@ 2022-09-29 15:41       ` Robert Pluim
  0 siblings, 0 replies; 17+ messages in thread
From: Robert Pluim @ 2022-09-29 15:41 UTC (permalink / raw)
  To: समीर सिंह Sameer Singh
  Cc: 58159

>>>>> On Thu, 29 Sep 2022 20:49:10 +0530, समीर सिंह Sameer Singh <lumarzeli30@gmail.com> said:

    >> 
    >> Any reason for the newline between Vietnamese and Wancho?


    समीर> This was not intentional.  enriched-mode automatically adds a newline; most
    समीर> of the time I remove it, but this time it slipped past me.
    समीर> See: https://mail.gnu.org/archive/html/bug-gnu-emacs/2022-05/msg00581.html

OK. I avoid enriched-mode for HELLO :-)

Robert
-- 






* bug#58159: [PATCH] Add support for the Wancho script
  2022-09-29 11:09 ` समीर सिंह Sameer Singh
  2022-09-29 13:15   ` Eli Zaretskii
  2022-09-29 14:27   ` Robert Pluim
@ 2022-10-01  1:58   ` Richard Stallman
  2022-10-01  4:53     ` समीर सिंह Sameer Singh
  2022-10-01  6:03     ` Eli Zaretskii
  2 siblings, 2 replies; 17+ messages in thread
From: Richard Stallman @ 2022-10-01  1:58 UTC (permalink / raw)
  To: समीर सिंह Sameer Singh
  Cc: 58159

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

Do we really want to complicate Emacs to support the Wancho script?
According to Wikipedia, the Wancho script was invented 10 years ago;
Wancho is normally written using the Latin alphabet or Devanagari.
Some schools are starting to teach writing Wancho using that alphabet
instead of the well-known alphabet.  I suppose there is a campaign
for Wancho speakers to switch to it.

Is that really a good idea?  I suspect it comes from a sort of
boosterism/ethnic nationalism, as if having your own script were a
mark of importance.  But I think it is counterproductive to introduce
more incompatibility of scripts.

Do we really want to spend time on Emacs supporting scripts
which were created recently and have little user base?

English does not have an alphabet of its own; it uses an alphabet
borrowed from Latin.  Maybe English needs more prestige to compete
with Chinese and Hindi.  Should we invent a new English alphabet?

-- 
Dr Richard Stallman (https://stallman.org)
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)


* bug#58159: [PATCH] Add support for the Wancho script
  2022-10-01  1:58   ` Richard Stallman
@ 2022-10-01  4:53     ` समीर सिंह Sameer Singh
  2022-10-03  1:06       ` Richard Stallman
  2022-10-01  6:03     ` Eli Zaretskii
  1 sibling, 1 reply; 17+ messages in thread
From: समीर सिंह Sameer Singh @ 2022-10-01  4:53 UTC (permalink / raw)
  To: rms; +Cc: 58159

[-- Attachment #1: Type: text/plain, Size: 4079 bytes --]

> Do we really want to complicate Emacs to support the Wancho script?

I don't get how adding support for the Wancho script is complicating Emacs.
This was a relatively straightforward patch; even composition rules were
not needed here.  Wancho is included in Unicode, therefore Emacs support
is added.

> Wancho is normally written using the Latin alphabet or Devanagari.
> Some schools are starting to teach writing Wancho using that alphabet
> instead of the well-known alphabet.  I suppose there is a campaign
> for Wancho speakers to switch to it.

Have you considered that, Wancho being a Sino-Tibetan language, the
Devanagari and Latin scripts may be inadequate to serve it?

> Is that really a good idea?  I suspect it comes from a sort of
> boosterism/ethnic nationalism, as if having your own script were a
> mark of importance.

It is, though: having a separate script also provides a unique identity to
the language.  Take the Bhojpuri language, for example: it used to have its
own script, Kaithi, but later switched to Devanagari.  This, I feel, is one
of the major reasons it is still not recognised as a language by the
government but is instead treated as a dialect of Hindi.  Many people
regard it as a "less polished" version of Hindi.  Urdu, despite being
virtually the same as Hindi, enjoys the status of a separate language.
(Of course this has many different reasons, but having a different
script is one of them.)

Having a different script has aesthetic reasons as well; for example, how
could Latin replicate the beauty of Devanagari conjuncts!
Also look at the abomination that is the Vietnamese script.

> But I think it is counterproductive to introduce
> more incompatibility of scripts.

Emacs should at least support all of the Unicode scripts.  I don't see how
moving towards that goal is "increasing incompatibility of scripts".

> Do we really want to spend time on Emacs supporting scripts
> which were created recently and have little user base?

I do not ask anyone else to spend their time adding scripts to Emacs.
Since this is my wish, I do it myself, and the Emacs maintainers graciously
accept it and include it in Emacs, providing corrections and guidance
along the way.

> English does not have an alphabet of its own; it uses an alphabet
> borrowed from Latin.  Maybe English needs more prestige to compete
> with Chinese and Hindi.  Should we invent a new English alphabet?

I propose an abugida 😉
Maybe this time they could work on the orthography 🤞


* bug#58159: [PATCH] Add support for the Wancho script
  2022-10-01  4:53     ` समीर सिंह Sameer Singh
@ 2022-10-03  1:06       ` Richard Stallman
  2022-10-03  2:38         ` Eli Zaretskii
  0 siblings, 1 reply; 17+ messages in thread
From: Richard Stallman @ 2022-10-03  1:06 UTC (permalink / raw)
  To: समीर सिंह Sameer Singh
  Cc: 58159

  > I don't get how adding support for the Wancho script is complicating Emacs,
  > this

Normally a feature like this requires documentation in a manual as
well as code to implement it.

  > Have you considered that Wancho being a Sino-Tibetan language, Devanagari
  > and Latin script
  > may be inadequate to serve it?

It could be so, but there's no point in our speculating about it.  The
Wancho speakers can judge this.  If some decades from now they mostly
use the new alphabet, that will give it a real case for support.

  > It is though, having a separate script also provides a unique identity to
  > the language.

This tends to support my speculation, that the development of this
alphabet was part of a political influence campaign.

  > Urdu despite being virtually same with Hindi enjoys the status of a
  > separate language.

I don't speak either Urdu or Hindi, but I've read that Urdu has a lot
of vocabulary derived from Persian or Arabic.  With such a difference,
they are not "virtually the same."

But that is a tangent.  Each of those scripts is used by millions and
has been used for centuries.  It is clear that Emacs should support
them both.

  > Having a different script has aesthetic reasons as well for example how
  > could latin replicate the beauty
  > of devanagari conjuncts!

I found that a difficult complexity, for this human, and for software
too I expect.  But that too is a tangent.

My point is that when Unicode incorporates scripts that aren't and
never were used very much, and were developed for PR motives,
incorporation into Unicode is not by itself a reason to add support
into Emacs.

You're right that supporting _one_ barely-used script is not a
significant complexity.  If this is the only barely-used script that
Unicode incorporates, I won't keep arguing against it.

But if Unicode is inclined to do things like this, how many more
barely-used scripts will it adopt?  How many more has it already
adopted?

-- 
Dr Richard Stallman (https://stallman.org)
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)


* bug#58159: [PATCH] Add support for the Wancho script
  2022-10-03  1:06       ` Richard Stallman
@ 2022-10-03  2:38         ` Eli Zaretskii
  2022-10-08 22:35           ` Richard Stallman
  0 siblings, 1 reply; 17+ messages in thread
From: Eli Zaretskii @ 2022-10-03  2:38 UTC (permalink / raw)
  To: rms; +Cc: lumarzeli30, 58159

> Cc: 58159@debbugs.gnu.org
> From: Richard Stallman <rms@gnu.org>
> Date: Sun, 02 Oct 2022 21:06:10 -0400
> 
> But if Unicode is inclined to do things like this, how many more
> barely-used scripts will it adopt?  How many more has it already
> adopted?

That is not our question to answer.  The Unicode Consortium makes
these decisions based on their criteria.  We just support the
characters they add.

These additions are usually so minor that I believe they don't warrant
any discussion.  E.g., part of the support for new characters is the
ability to up-case and down-case them; we lift the data from the
Unicode Character Database, which we import.  It would be unthinkable
for Emacs not to be able to do these simple text operations on any
character.
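
As a rough illustration of the point above, the case data that Emacs
imports from the Unicode Character Database can be queried from Lisp.
The sketch below uses U+1E900 (ADLAM CAPITAL LETTER ALIF) purely as a
sample cased character from a recently added script; that choice is an
assumption made for the example, and Wancho itself has no case
distinction.

```elisp
;; Sketch: query case data that Emacs derives from the Unicode
;; Character Database.  U+1E900 is only an illustrative choice.
(let ((ch #x1E900))
  (list
   ;; Character name as recorded in the UCD.
   (get-char-code-property ch 'name)
   ;; Simple lowercase mapping from the UCD, or nil if there is none.
   (get-char-code-property ch 'lowercase)
   ;; Result of the ordinary case-conversion function on the same character.
   (downcase ch)))
```

Evaluating the form returns a three-element list, so the imported
property and the behaviour of `downcase' can be compared side by side.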


* bug#58159: [PATCH] Add support for the Wancho script
  2022-10-03  2:38         ` Eli Zaretskii
@ 2022-10-08 22:35           ` Richard Stallman
  2022-10-09  1:08             ` समीर सिंह Sameer Singh
  2022-10-09  4:22             ` Eli Zaretskii
  0 siblings, 2 replies; 17+ messages in thread
From: Richard Stallman @ 2022-10-08 22:35 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: lumarzeli30, 58159

  > > But if Unicode is inclined to do things like this, how many more
  > > barely-used scripts will it adopt?  How many more has it already
  > > adopted?

  > That is not our question to answer.

They are questions about the future, so we cannot look for answers
today.  But they do affect what our attitude towards Unicode should
be.


-- 
Dr Richard Stallman (https://stallman.org)
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)








* bug#58159: [PATCH] Add support for the Wancho script
  2022-10-08 22:35           ` Richard Stallman
@ 2022-10-09  1:08             ` समीर सिंह Sameer Singh
  2022-10-14 21:24               ` Richard Stallman
  2022-10-09  4:22             ` Eli Zaretskii
  1 sibling, 1 reply; 17+ messages in thread
From: समीर सिंह Sameer Singh @ 2022-10-09  1:08 UTC (permalink / raw)
  To: rms; +Cc: Eli Zaretskii, 58159

[-- Attachment #1: Type: text/plain, Size: 4500 bytes --]

> Normally a feature like this requires documentation in a manual as
> well as code to implement it.

Can you elaborate on what changes are needed in which manual?

The code is already implemented, i.e. the foundations to support these
scripts are already there.  Someone just needs to take the time to extend
this support to a specific script, and I am doing exactly that.  This is
nothing more than some grunt work.

This is what a typical patch for adding a script in Emacs looks like:
1. A one-line entry in etc/NEWS announcing the support of the script and
   its language environment.
2. A one-line greeting in the language/script, added to etc/HELLO
   (optional).
3. A one-line entry in script-representative-chars in
   lisp/international/fontset.el so that Emacs can select an appropriate
   font for it.
4. Adding the script name to setup-default-fontset in
   lisp/international/fontset.el.
5. Defining a language environment for the script in the
   lisp/language/*.el files, which includes the following entries:
   - its charset (usually unicode)
   - its coding-system (usually utf-8)
   - its coding-priority (usually utf-8)
   - its input-method
   - its sample text (the same text that is added to etc/HELLO)
   - a one-line documentation string, usually following the template
     "foo language and its script bar are supported in this language
     environment."
6. Adding composition rules for the script (optional, only needed for
   complex scripts).
7. Adding an input method for the script in the lisp/leim/quail/*.el
   files.

Adding one of these patches does not mean introducing any significant or
breaking changes.
All the heavy lifting functions or programs were implemented earlier.
We already parse all of the information from unicode so Emacs knows about
these characters,
composite.el and harfbuzz take care of composition and quail takes care of
input-methods.
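
The language-environment and input-method steps can be sketched roughly
as follows.  Every name here ("Demo", "demo-translit", and the two Greek
letters used as stand-in characters) is hypothetical and chosen only for
illustration; this is not the code of any real patch.

```elisp
;; Sketch only: "Demo" and "demo-translit" are hypothetical names that
;; do not exist in Emacs.
(require 'quail)

;; Input-method step: register a tiny Quail package with two rules.
(quail-define-package
 "demo-translit" "Demo" "D" t
 "Hypothetical two-key input method, for illustration only.")

(quail-define-rules
 ("a" ?α)
 ("b" ?β))

;; Language-environment step: point the environment at the input
;; method and give it a sample text and a documentation string.
(set-language-info-alist
 "Demo" '((charset unicode)
          (coding-system utf-8)
          (coding-priority utf-8)
          (input-method . "demo-translit")
          (sample-text . "Demo (αβ) αβ")
          (documentation . "\
Hypothetical language environment, used only as an example.")))
```

After evaluating this, (get-language-info "Demo" 'input-method) looks up
the input method that was just registered for the environment.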

My patches average around 126 lines with an input method and 36 lines
without one.  The difference is expected, since an input method has to
define a rule for nearly every key on the keyboard.
I have added around 27 scripts since May of this year.

> My point is that when Unicode incorporates scripts that aren't and
> never were used very much, and were developed for PR motives,
> incorporation into Unicode is not by itself a reason to add support
> into Emacs

These scripts were not developed for "PR motives", they were developed to
serve the needs of the community.
For example, this is what the inventor of the Wancho script said [1]:

> "I found out that it was not possible to translate the language as it did
> not capture all of its sounds. So I started researching on phonetics of the
> language," Losu said.

It is necessary for Unicode to support them because this is not the age of
pen and paper where the only thing limiting you to write any script for
communication is... you.
For computers this is not possible, therefore efforts should be made to
rectify this at both the Unicode level and the application level.

> I don't speak either Urdu or Hindi, but I've read that Urdu has a lot
> of vocabulary derived from Persian or Arabic.  With such a difference,
> they are not "virtually the same."

Urdu and Hindi have virtually the same grammar; having some different
vocabulary does not make them different languages.  Hindi and Urdu are
regarded as two different registers of the same language.
See: https://en.wikipedia.org/wiki/Hindustani_language

[1]
https://www.indiatoday.in/education-today/news/story/this-arunachal-student-worked-for-over-12-years-to-create-a-new-alphabet-for-a-dying-ancient-tribal-language-1597122-2019-09-09


* bug#58159: [PATCH] Add support for the Wancho script
  2022-10-09  1:08             ` समीर सिंह Sameer Singh
@ 2022-10-14 21:24               ` Richard Stallman
  2022-10-15  6:35                 ` Eli Zaretskii
  0 siblings, 1 reply; 17+ messages in thread
From: Richard Stallman @ 2022-10-14 21:24 UTC (permalink / raw)
  To: समीर सिंह Sameer Singh
  Cc: eliz, 58159

  > Can you elaborate on what changes are needed in which manual?

I don't know, but normally every new addition calls for documentation
somewhere.

  > This is what a typical patch for adding a script in Emacs looks like:
  > 1. A one line entry in etc/NEWS announcing the support of the script and
  > its language environment.
  > 2. A one line greeting in the language/script which is added in etc/HELLO
  > (optional)
  > 3. A one line entry in script-representative-chars in
  > lisp/international/fontset.el so that Emacs can select an appropriate font
  > for it.
  > 4. Adding the script name in setup-default-fontset in
  > lisp/international/fontset.el
  > 5. Defining a language environment for the script in the lisp/language/*.el
  > files which includes the following entries:
  > its charset (usually unicode), its coding-system (usually utf-8), its
  > coding-priority (usually utf-8), its input-method, its sample text (the
  > same text which is added in etc/HELLO),
  > a one line documentation usually in the following template: "foo language
  > and its script bar are supported in this language environment."
  > 6. Adding composition rules for the script (optional, only needed for
  > complex scripts)
  > 7. Adding an input-method for the script in lisp/leim/quail/*.el files

That looks like nontrivial work to add each script.
Not a big job, but not minimal either.

For a script that users actually want, it is work worth doing.
For a script that we support only because some bureaucrats
decided to include it in Unicode, is it worth that much?

  > These scripts were not developed for "PR motives", they were developed to
  > serve the needs of the community.

What I've read suggests the opposite.  I am not convinced that the
community experienced or experiences such linguistic "needs".  It
looks like some activists in that community decided that using their
own script would help them get political benefits, so they push for
its adoption.

What we know about this is sketchy.  (I could see only fragments of
the article you pointed at -- I suspect nonfree JS blocks the rest.)

If the speakers of a language are really using a script, I am in favor
of supporting it.

  > It is necessary for Unicode to support them because this is not the age of
  > pen and paper where the only thing limiting you to write any script for
  > communication is... you.

I don't subscribe to the idea that we Emacs developers _must_ support
every script that a minority of some speech community campaigns to
switch to.  That is dogmatic, and it could impose an unlimited burden
on us.  If every endangered language gets its own script, that could
be almost 200 more scripts coming from India alone.

I am in favor of preserving endangered languages, but that doesn't
usually require inventing a new script for each one.  For instance,
speakers of 22 Maya languages got together and established a rather
natural convention for writing them in the Latin alphabet.  The
convention states how to express each sound used in any of those
languages.  You can find it in Maya Languages in Wikipedia.

-- 
Dr Richard Stallman (https://stallman.org)
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#58159: [PATCH] Add support for the Wancho script
  2022-10-14 21:24               ` Richard Stallman
@ 2022-10-15  6:35                 ` Eli Zaretskii
  0 siblings, 0 replies; 17+ messages in thread
From: Eli Zaretskii @ 2022-10-15  6:35 UTC (permalink / raw)
  To: rms; +Cc: lumarzeli30, 58159

> From: Richard Stallman <rms@gnu.org>
> Cc: eliz@gnu.org, 58159@debbugs.gnu.org
> Date: Fri, 14 Oct 2022 17:24:48 -0400
> 
>   > This is what a typical patch for adding a script in Emacs looks like:
>   > 1. A one line entry in etc/NEWS announcing the support of the script and
>   > its language environment.
>   > 2. A one line greeting in the language/script which is added in etc/HELLO
>   > (optional)
>   > 3. A one line entry in script-representative-chars in
>   > lisp/international/fontset.el so that Emacs can select an appropriate font
>   > for it.
>   > 4. Adding the script name in setup-default-fontset in
>   > lisp/international/fontset.el
>   > 5. Defining a language environment for the script in the lisp/language/*.el
>   > files which includes the following entries:
>   > its charset (usually unicode), its coding-system (usually utf-8), its
>   > coding-priority (usually utf-8), its input-method, its sample text (the
>   > same text which is added in etc/HELLO),
>   > a one line documentation usually in the following template: "foo language
>   > and its script bar are supported in this language environment."
>   > 6. Adding composition rules for the script (optional, only needed for
>   > complex scripts)
>   > 7. Adding an input-method for the script in lisp/leim/quail/*.el files
> 
> That looks like nontrivial work to add each script.
> Not a big job, but not minimal either.

Only the two last items are nontrivial.  And item 6 is only necessary
for some scripts.  All the rest is basically trivial boilerplate.

> For a script that users actually want, it is work worth doing.
> For a script that we support only because some bureaucrats
> decided to include it in Unicode, is it worth that much?

We cannot control which itches our contributors want to scratch.
Letting them scratch their itches is an important aspect of being able
to keep them contributing to Emacs in all other areas.  This
particular itch is useful to Emacs, so I see no reason to object to
their scratching it.





^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#58159: [PATCH] Add support for the Wancho script
  2022-10-08 22:35           ` Richard Stallman
  2022-10-09  1:08             ` समीर सिंह Sameer Singh
@ 2022-10-09  4:22             ` Eli Zaretskii
  1 sibling, 0 replies; 17+ messages in thread
From: Eli Zaretskii @ 2022-10-09  4:22 UTC (permalink / raw)
  To: rms; +Cc: lumarzeli30, 58159

> From: Richard Stallman <rms@gnu.org>
> Cc: lumarzeli30@gmail.com, 58159@debbugs.gnu.org
> Date: Sat, 08 Oct 2022 18:35:36 -0400
> 
>   > > But if Unicode is inclined to do things like this, how many more
>   > > barely-used scripts will it adopt?  How many more has it already
>   > > adopted?
> 
>   > That is not our question to answer.
> 
> They are questions about the future, so we cannot look for answers
> today.  But they do affect what our attitude towards Unicode should
> be.

Emacs supports all the characters defined by Unicode.  This design is
from Emacs 23 onwards.  So any characters Unicode adds will be
supported by Emacs as soon as we import the latest character database.





^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#58159: [PATCH] Add support for the Wancho script
  2022-10-01  1:58   ` Richard Stallman
  2022-10-01  4:53     ` समीर सिंह Sameer Singh
@ 2022-10-01  6:03     ` Eli Zaretskii
  1 sibling, 0 replies; 17+ messages in thread
From: Eli Zaretskii @ 2022-10-01  6:03 UTC (permalink / raw)
  To: rms; +Cc: lumarzeli30, 58159

> Cc: 58159@debbugs.gnu.org
> From: Richard Stallman <rms@gnu.org>
> Date: Fri, 30 Sep 2022 21:58:01 -0400
> 
> Do we really want to complicate Emacs to support the Wancho script?

The script was already supported: we automatically add support for all
the scripts defined by the Unicode Standard when we import each new
version of Unicode, simply by virtue of supporting the character
codepoints of that script.

The change in question just defined a new language-environment (a
small addition to an existing data structure) and a new (and very
simple) input method.  So my conclusion (and I asked myself the same
questions when reviewing the patch) was that adding this doesn't
complicate Emacs in any significant way.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2022-10-15  6:35 UTC | newest]

Thread overview: 17+ messages
2022-09-29 11:07 bug#58159: [PATCH] Add support for the Wancho script समीर सिंह Sameer Singh
2022-09-29 11:09 ` समीर सिंह Sameer Singh
2022-09-29 13:15   ` Eli Zaretskii
2022-09-29 13:21     ` समीर सिंह Sameer Singh
2022-09-29 14:27   ` Robert Pluim
2022-09-29 15:19     ` समीर सिंह Sameer Singh
2022-09-29 15:41       ` Robert Pluim
2022-10-01  1:58   ` Richard Stallman
2022-10-01  4:53     ` समीर सिंह Sameer Singh
2022-10-03  1:06       ` Richard Stallman
2022-10-03  2:38         ` Eli Zaretskii
2022-10-08 22:35           ` Richard Stallman
2022-10-09  1:08             ` समीर सिंह Sameer Singh
2022-10-14 21:24               ` Richard Stallman
2022-10-15  6:35                 ` Eli Zaretskii
2022-10-09  4:22             ` Eli Zaretskii
2022-10-01  6:03     ` Eli Zaretskii

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git
