From: समीर सिंह Sameer Singh
Newsgroups: gmane.emacs.bugs
Subject: bug#55370: [PATCH] Add support for the Syloti Nagri script
Date: Thu, 12 May 2022 19:12:09 +0530
References: <83wnerp6p0.fsf@gnu.org>
In-Reply-To: <83wnerp6p0.fsf@gnu.org>
To: Eli Zaretskii
Cc: 55370@debbugs.gnu.org

Thank you for reviewing the patch.

I have noticed that when the nasal sign is not in the range passed to
set-char-table-range, it still renders correctly when it is combined
with just a consonant or an independent vowel.  But once it is added to
the range, it is not displayed correctly unless a composition rule is
added for it.

In scripts like Syloti Nagri, Sharada and Kaithi, these signs are
sometimes not contiguous with the virama and vowel signs (they are far
apart), so when I extend the range to include them, Emacs starts to
hang (maybe because the range is too big, or because it then covers
unnecessary characters such as consonants).
This is why I decided not to include them: they were still rendering
fine.

So should I leave them as they are, or add another set-char-table-range
call that covers only them?
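
A minimal, untested sketch of the second option: a separate entry keyed
on the nasal sign U+A80B alone, holding the "Nasal vowels" rule so that
its second character is the one that triggers it.  The character
classes are copied from the patch; treating U+A80B as the trigger and
using a lookback of 1 are assumptions here, not something taken from
the patch or confirmed in review.

```elisp
;; Sketch only, not part of the submitted patch.
;; Assumption: the nasal sign U+A80B gets its own small entry, so the
;; virama and vowel-sign ranges do not have to grow to reach it.
(let ((independent-vowel    "[\xA800\xA801\xA803-\xA805]")
      (nasal                "\xA80B"))
  (set-char-table-range
   composition-function-table
   #xA80B
   (list (vector
          ;; Nasal vowels: the match starts 1 character before the
          ;; nasal sign, so the nasal sign is the second character.
          (concat independent-vowel nasal)
          1 'font-shape-gstring))))
```

A consonant followed by the nasal sign could be covered the same way,
if it turns out to need a rule at all.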
> Similarly here: this rule will never match if 'vowel' isn't present,
> because the second character of the matching sequence _must_ be a
> vowel, since that is what triggers the composition rule in the first
> place.  Am I missing something?

Here too, I added the rule because consonant + vowel + nasal was not
rendering; maybe I should remove the "?" after vowel.
(consonant + nasal was rendering fine.)
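
The reasoning in the quoted review can be checked with a small model: a
rule can fire only if its pattern, started PREV-CHARS characters before
the triggering character, matches a run that extends over that
character.  The helper name `sketch-rule-can-fire-p` is hypothetical,
and this only models the description given in this thread; it is not
how Emacs itself evaluates composition rules.

```elisp
;; Hypothetical helper, for illustration only.
(defun sketch-rule-can-fire-p (pattern prev-chars text trigger-index)
  "Return non-nil if PATTERN can fire at TRIGGER-INDEX in TEXT.
The match is attempted PREV-CHARS characters before TRIGGER-INDEX
and must extend over the character at TRIGGER-INDEX."
  (let ((start (- trigger-index prev-chars)))
    (and (>= start 0)
         ;; Anchor the pattern at the start of the remaining text.
         (string-match (concat "\\`\\(?:" pattern "\\)")
                       (substring text start))
         ;; The match has to reach past the lookback characters.
         (> (match-end 0) prev-chars))))

(let ((consonant         "[\xA807-\xA80A\xA80C-\xA822]")
      (independent-vowel "[\xA800\xA801\xA803-\xA805]")
      (vowel             "[\xA802\xA823-\xA827]")
      (nasal             "[\xA80B]"))
  (list
   ;; "Nasal vowels" rule, triggered by the virama U+A806 at index 1:
   ;; the pattern cannot extend over the virama.
   (sketch-rule-can-fire-p (concat independent-vowel nasal "?")
                           1 "\xA800\xA806" 1)
   ;; Vowel-triggered rule as submitted, on consonant + vowel sign.
   (sketch-rule-can-fire-p (concat consonant vowel "?" nasal "?")
                           1 "\xA807\xA823" 1)
   ;; The same rule with the "?" after vowel removed.
   (sketch-rule-can-fire-p (concat consonant vowel nasal "?")
                           1 "\xA807\xA823" 1)))
```

Under this model the list evaluates to (nil t t): the "Nasal vowels"
rule cannot fire from a virama trigger, while the vowel-triggered rule
fires on consonant + vowel with or without the "?", so dropping the "?"
only makes the pattern state what can actually match.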

On Thu, May 12, 2022 at 12:40 PM Eli Zaretskii <eliz@gnu.org> wrote:

> > From: समीर सिंह Sameer Singh <lumarzeli30@gmail.com>
> > Date: Wed, 11 May 2022 20:31:28 +0530
> >
> > This time I have added support for the Syloti Nagri script.
> > I also had to separate the consonant conjunct syllables and the non
> > consonant conjunct syllables composition rules this time around,
> > because if they were together, Emacs would hang whenever I put a
> > cursor on a Syloti Nagri word or tried to edit it.
>
> Thanks.
>
> There's something strange in the composition rules:
>
> > +;; Syloti Nagri composition rules
> > +(let ((consonant            "[\xA807-\xA80A\xA80C-\xA822]")
> > +      (independent-vowel    "[\xA800\xA801\xA803-\xA805]")
> > +      (vowel                "[\xA802\xA823-\xA827]")
> > +      (nasal                "[\xA80B]")
> > +      (virama               "[\xA806\xA82C]"))
> > +  (set-char-table-range composition-function-table
> > +                        '(#xA806 . #xA806)
> > +                        (list (vector
> > +                               ;; Consonant conjunct based syllables
> > +                               (concat consonant "\\(?:" virama consonant "\\)+"
> > +                                       vowel "?" nasal "?")
> > +                               1 'font-shape-gstring)
> > +                              (vector
> > +                               ;; Nasal vowels
> > +                               (concat independent-vowel nasal "?")
> > +                               1 'font-shape-gstring)))
>
> This set of rules is triggered by U+A806, and should match a regexp
> starting from one character before U+A806.  However, the second rule,
> i.e.
>
> > +                               ;; Nasal vowels
> > +                               (concat independent-vowel nasal "?")
> > +                               1 'font-shape-gstring)))
>
> has 'nasal' ("[\xA80B]") as its second character, and 'nasal' will
> never match U+A806.  So this rule will never match, right?
>
> > +  (set-char-table-range composition-function-table
> > +                        '(#xA823 . #xA827)
> > +                        (list (vector
> > +                               ;; Non Consonant conjunct based syllables
> > +                               (concat consonant vowel "?" nasal "?")
> > +                               1 'font-shape-gstring))))
>
> Similarly here: this rule will never match if 'vowel' isn't present,
> because the second character of the matching sequence _must_ be a
> vowel, since that is what triggers the composition rule in the first
> place.  Am I missing something?
>
> I see similar issues with the composition rules we installed for other
> old Indian scripts; could you please review them with the above
> comments in mind and see which ones need to be amended?