From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: =?UTF-8?Q?=E0=A4=B8=E0=A4=AE=E0=A5=80=E0=A4=B0_?= =?UTF-8?Q?=E0=A4=B8=E0=A4=BF=E0=A4=82=E0=A4=B9?= Sameer Singh Newsgroups: gmane.emacs.bugs Subject: bug#58159: [PATCH] Add support for the Wancho script Date: Sun, 9 Oct 2022 06:38:53 +0530 Message-ID: References: <83bkqtzl6d.fsf@gnu.org> Mime-Version: 1.0 Content-Type: multipart/alternative; boundary="00000000000076174905ea8fb054" Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="26349"; mail-complaints-to="usenet@ciao.gmane.io" Cc: Eli Zaretskii , 58159@debbugs.gnu.org To: rms@gnu.org Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Sun Oct 09 03:10:13 2022 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1ohKpZ-0006eg-A3 for geb-bug-gnu-emacs@m.gmane-mx.org; Sun, 09 Oct 2022 03:10:13 +0200 Original-Received: from localhost ([::1]:35810 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ohKpY-0008Py-5u for geb-bug-gnu-emacs@m.gmane-mx.org; Sat, 08 Oct 2022 21:10:12 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:52650) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ohKpO-0008PX-H5 for bug-gnu-emacs@gnu.org; Sat, 08 Oct 2022 21:10:02 -0400 Original-Received: from debbugs.gnu.org ([209.51.188.43]:42584) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1ohKpO-00034s-8d for bug-gnu-emacs@gnu.org; Sat, 08 Oct 2022 21:10:02 -0400 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1ohKpN-00007V-WC for bug-gnu-emacs@gnu.org; Sat, 08 Oct 2022 21:10:02 -0400 X-Loop: 
help-debbugs@gnu.org Resent-From: =?UTF-8?Q?=E0=A4=B8=E0=A4=AE=E0=A5=80=E0=A4=B0_?= =?UTF-8?Q?=E0=A4=B8=E0=A4=BF=E0=A4=82=E0=A4=B9?= Sameer Singh Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 09 Oct 2022 01:10:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 58159 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch Original-Received: via spool by 58159-submit@debbugs.gnu.org id=B58159.1665277756392 (code B ref 58159); Sun, 09 Oct 2022 01:10:01 +0000 Original-Received: (at 58159) by debbugs.gnu.org; 9 Oct 2022 01:09:16 +0000 Original-Received: from localhost ([127.0.0.1]:41662 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ohKod-00006G-4E for submit@debbugs.gnu.org; Sat, 08 Oct 2022 21:09:15 -0400 Original-Received: from mail-yw1-f180.google.com ([209.85.128.180]:37758) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ohKoZ-000060-2V for 58159@debbugs.gnu.org; Sat, 08 Oct 2022 21:09:13 -0400 Original-Received: by mail-yw1-f180.google.com with SMTP id 00721157ae682-35ceeae764dso74217937b3.4 for <58159@debbugs.gnu.org>; Sat, 08 Oct 2022 18:09:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=d2gIseczoPkMVo2k6lYH9Q1kBhki+Fhtb4zPSKV6xGA=; b=WnK1Df0yeXufgSURO31KI2a4Rta+LRB9ojQJZVe77mgiIwYhDtAFCU1Yv/m2xUx/wY ncs+S0qaVdtSTzq8oHZo4xb9z/QVT8B5STKmLL7YzXAzEm9vMxw7Eq6WO/jxjc6mWD9W O39QM5cgX99UjDb/IHtCdkScRkTLBKH0mUDZgWK0Oc/+zqcRbJ3edi77IyHI3fQDubgH Iz+9YxRJK/nXkhJsws6NilL8PU2iH08mKYvgFqNU+ofuLzfpiu2dCdFBaIR99XwxDY1K szz+Qy/Om2xa9fJGwakd4y7gFPtt6XbnffX/QSUWGO9yd42L3oM4JRgezV6wzCxz2grW 9k6g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references 
:mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=d2gIseczoPkMVo2k6lYH9Q1kBhki+Fhtb4zPSKV6xGA=; b=DKhXnR9u2Wm2SljXBykyLmMj6m5s4D1VdfJ6J3zlwnZnsdhLQgLGq9y7k5ev29r6Go DT1PMh0S+eEzcZDkmON9T2oGgqTOlfqGnzbL21xhd8n2s5mM/2vIHLQKakSZYsyMVCBL 4YwSgFeR+pjCDu82mKDX+54pW2vFLgT+zi5nnmydNzNMeN68a+WTJU+Nogb99lNAhIGL TRBP2xkXD0ORdPCf2YJPWktxf88jUDCgFlSjYXihOBo0xBtJdlYohPXoKXSwXHFfFHNo 7G6HH3DdMdaf5gSHKN/VJGQlXblxwPGKA9emNiZt12i2yr3tR3cFPnd5/OVixf0oaQ7m 43Zw== X-Gm-Message-State: ACrzQf1V/DtdqGwDeiwFKhpiATmOPjt6Lm9nVPXRla+W//dxg/4+Ieae jrdI+NgXF/3s7NTlk6i896U+wvAkl8kp00wEg9Ge0F2FDjUB8A== X-Google-Smtp-Source: AMsMyM6tCf5CGO0d5ajrlEfUGgyORwKB9bJ4vMCH6cyIB0DNOu1W42ikymB/DCkSec+bIN28UUPv4KzVBtn1U/Qz5nk= X-Received: by 2002:a81:d34c:0:b0:349:1e37:ce4e with SMTP id d12-20020a81d34c000000b003491e37ce4emr10799805ywl.112.1665277745304; Sat, 08 Oct 2022 18:09:05 -0700 (PDT) In-Reply-To: X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Original-Sender: "bug-gnu-emacs" Xref: news.gmane.io gmane.emacs.bugs:244943 Archived-At:

--00000000000076174905ea8fb054
Content-Type: text/plain; charset="UTF-8"

> Normally a feature like this requires documentation in a manual as
> well as code to implement it.

Can you elaborate on what changes are needed in which manual?

The code is already implemented, i.e. the foundations to support these
scripts are already there; someone just needs to take the time to
extend this support to a specific script, and I am doing exactly that.
This is nothing more than grunt work.

This is what a typical patch for adding a script to Emacs looks like:

1. A one-line entry in etc/NEWS announcing support for the script and
   its language environment.
2. A one-line greeting in the language/script, added to etc/HELLO
   (optional).
3. A one-line entry in script-representative-chars in
   lisp/international/fontset.el so that Emacs can select an
   appropriate font for it.
4. Adding the script name to setup-default-fontset in
   lisp/international/fontset.el.
5. Defining a language environment for the script in the
   lisp/language/*.el files, which includes the following entries:
   its charset (usually unicode), its coding-system (usually utf-8),
   its coding-priority (usually utf-8), its input-method, its sample
   text (the same text that is added to etc/HELLO), and one line of
   documentation, usually following this template: "foo language and
   its script bar are supported in this language environment."
6. Adding composition rules for the script (optional, only needed for
   complex scripts).
7. Adding an input-method for the script in the lisp/leim/quail/*.el
   files.

Adding one of these patches does not introduce any significant or
breaking changes. All the heavy-lifting functions and programs were
implemented earlier. We already parse all of the information from
Unicode, so Emacs knows about these characters; composite.el and
HarfBuzz take care of composition, and quail takes care of input
methods.

The average size of my patches appears to be around 126 lines with the
input method and 36 lines without it, which is to be expected, since an
input method has to define a mapping for nearly every key on the
keyboard. I have added around 27 scripts since May of this year.

> My point is that when Unicode incorporates scripts that aren't and
> never were used very much, and were developed for PR motives,
> incorporation into Unicode is not by itself a reason to add support
> into Emacs

These scripts were not developed for "PR motives"; they were developed
to serve the needs of the community.
For example, this is what the inventor of the Wancho script said [1]:

> "I found out that it was not possible to translate the language as it
> did not capture all of its sounds. So I started researching on
> phonetics of the language," Losu said.

It is necessary for Unicode to support them because this is not the age
of pen and paper, where the only thing limiting you from writing any
script for communication is... you. On computers that is not possible,
so efforts should be made to rectify this at both the Unicode level and
the application level.

> I don't speak either Urdu or Hindi, but I've read that Urdu has a lot
> of vocabulary derived from Persian or Arabic.  With such a difference,
> they are not "virtually the same."

Urdu and Hindi have virtually the same grammar; having some different
vocabulary does not make them different languages. Hindi and Urdu are
regarded as two different registers of the same language.
See: https://en.wikipedia.org/wiki/Hindustani_language

[1] https://www.indiatoday.in/education-today/news/story/this-arunachal-student-worked-for-over-12-years-to-create-a-new-alphabet-for-a-dying-ancient-tribal-language-1597122-2019-09-09

On Sun, Oct 9, 2022 at 4:05 AM Richard Stallman wrote:

> [[[ To any NSA and FBI agents reading my email: please consider    ]]]
> [[[ whether defending the US Constitution against all enemies,     ]]]
> [[[ foreign or domestic, requires you to follow Snowden's example. ]]]
>
>   > > But if Unicode is inclined to do things like this, how many more
>   > > barely-used scripts will it adopt?  How many more has it already
>   > > adopted?
>
>   > That is not our question to answer.
>
> They are questions about the future, so we cannot look for answers
> today.  But they do affect what our attitude towards Unicode should
> be.
>
> --
> Dr Richard Stallman (https://stallman.org)
> Chief GNUisance of the GNU Project (https://gnu.org)
> Founder, Free Software Foundation (https://fsf.org)
> Internet Hall-of-Famer (https://internethalloffame.org)
>

--00000000000076174905ea8fb054
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--00000000000076174905ea8fb054--