From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#55370: [PATCH] Add support for the Syloti Nagri script
Date: Thu, 12 May 2022 19:29:23 +0300
Message-ID: <837d6qpvdo.fsf@gnu.org>
References: <83wnerp6p0.fsf@gnu.org> <83bkw2q28v.fsf@gnu.org>
Cc: 55370@debbugs.gnu.org
To: समीर सिंह Sameer Singh

> From: समीर सिंह Sameer Singh
> Date: Thu, 12 May 2022 20:36:49 +0530
> Cc: 55370@debbugs.gnu.org
>
> For example in tirhuta, when I do this:
>
> ;; Tirhuta composition rules
> (let ((consonant "[\x1148F-\x114AF]")
>       (nukta "\x114C3")
>       (independent-vowel "[\x11481-\x1148E]")
>       (vowel "[\x114B0-\x114BE]")
>       (nasal "[\x114BF\x114C0]")
>       (virama "\x114C2"))
>   (set-char-table-range composition-function-table
>                         '(#x114B0 . #x114BE)
>                         (list (vector
>                                ;; Consonant based syllables
>                                (concat consonant nukta "?\\(?:" virama
>                                        consonant nukta "?\\)*\\(?:"
>                                        virama "\\|" vowel "*" nukta "?"
>                                        nasal "?\\)")
>                                1 'font-shape-gstring))))
>
> Notice here, the nasal sign is not included in the range.
> And then I type: 𑒅𑓀 𑒆𑒿
> It is rendered correctly

It is rendered correctly because your rule isn't used.  The rule

                        '(#x114B0 . #x114BE)
                        (list (vector
                               ;; Consonant based syllables
                               (concat consonant nukta "?\\(?:" virama
                                       consonant nukta "?\\)*\\(?:"
                                       virama "\\|" vowel "*" nukta "?"
                                       nasal "?\\)")
                               1 'font-shape-gstring))))

says this:

  . find a character C between #x114B0 and #x114BE
  . see if the characters starting one character before C match the
    above regexp
  . if they match, compose them

But your text doesn't include any characters in the range
[\x114B0-\x114BE], so the above rule will never match anything, and
will not cause any composition.

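The first step of that procedure can be checked by hand.  Below is a
minimal sketch that could be evaluated in *scratch*; the helper name
my-trigger-chars is made up for this illustration and is not part of
Emacs.  It only collects the characters of a string whose code points
fall in a given range, which is the precondition for the rule to be
examined at all.  It does not run the composition machinery itself.

```elisp
;; Hypothetical helper, not part of Emacs.
(defun my-trigger-chars (text from to)
  "Return the characters of TEXT whose code points lie in FROM..TO."
  (let (found)
    (dolist (ch (string-to-list text) (nreverse found))
      (when (<= from ch to)
        (push ch found)))))

;; The text from the report, checked against the range of the first rule.
(my-trigger-chars "𑒅𑓀 𑒆𑒿" #x114B0 #x114BE)
```

For the text above this should evaluate to nil, i.e. no character in
the text can trigger the rule.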
You see the characters composed because the second character in each
pair, #x114C0 and #x114BF, is a combining accent, and for those we
have a catch-all rule in composite.el:

  (when unicode-category-table
    (let ((elt `([,(purecopy "\\c.\\c^+") 1 compose-gstring-for-graphic]
                 [nil 0 compose-gstring-for-graphic])))
      (map-char-table
       #'(lambda (key val)
           (if (memq val '(Mn Mc Me))
               (set-char-table-range composition-function-table key elt)))
       unicode-category-table))

> But when I do:
>
> ;; Tirhuta composition rules
> (let ((consonant "[\x1148F-\x114AF]")
>       (nukta "\x114C3")
>       (independent-vowel "[\x11481-\x1148E]")
>       (vowel "[\x114B0-\x114BE]")
>       (nasal "[\x114BF\x114C0]")
>       (virama "\x114C2"))
>   (set-char-table-range composition-function-table
>                         '(#x114B0 . #x114C0)
>                         (list (vector
>                                ;; Consonant based syllables
>                                (concat consonant nukta "?\\(?:" virama
>                                        consonant nukta "?\\)*\\(?:"
>                                        virama "\\|" vowel "*" nukta "?"
>                                        nasal "?\\)")
>                                1 'font-shape-gstring))))
> The range now has the nasal signs.
> And then type the above characters: 𑒅𑓀 𑒆𑒿
> They are not rendered correctly

In this case, the characters that trigger examination of the
composition rules, #x114C0 and #x114BF, _are_ in the range
'(#x114B0 . #x114C0).  However, the preceding characters, #x11484 and
#x11486, are independent vowels, and independent-vowel does not appear
anywhere in the regexp.  So again, the rule will never match.  Except
that now you have also replaced the default rule we have for the
combining accents, so what worked before no longer does.

> But when I include their composition rules:
>
> ;; Tirhuta composition rules
> (let ((consonant "[\x1148F-\x114AF]")
>       (nukta "\x114C3")
>       (independent-vowel "[\x11481-\x1148E]")
>       (vowel "[\x114B0-\x114BE]")
>       (nasal "[\x114BF\x114C0]")
>       (virama "\x114C2"))
>   (set-char-table-range composition-function-table
>                         '(#x114B0 . #x114C0)
>                         (list (vector
>                                ;; Consonant based syllables
>                                (concat consonant nukta "?\\(?:" virama
>                                        consonant nukta "?\\)*\\(?:"
>                                        virama "\\|" vowel "*" nukta "?"
>                                        nasal "?\\)")
>                                1 'font-shape-gstring)
>                               (vector
>                                ;; Nasal vowels
>                                (concat independent-vowel nasal "?")
>                                1 'font-shape-gstring))))
>
> They are now once more rendered correctly.

As expected, see above: now you do have a regexp that can match.  It's
this one:

  (concat independent-vowel nasal "?")

I hope you now understand how to fix the rules.  If not, please ask
more questions and show more examples.
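
The difference between the two regexps can also be seen without the
display engine.  The sketch below assumes that matching a rule's regexp
at the very start of a two-character cluster is a fair stand-in for
what the composition code does when it looks one character before the
trigger (the 1 in each rule).  The names my-tirhuta-consonant-rule,
my-tirhuta-nasal-vowel-rule and my-rule-matches-p are hypothetical and
exist only for this illustration; the regexps themselves are copied
from the rules quoted above.

```elisp
;; Regexp of the consonant-based rule, built as in the report.
(defconst my-tirhuta-consonant-rule
  (let ((consonant "[\x1148F-\x114AF]")
        (nukta "\x114C3")
        (vowel "[\x114B0-\x114BE]")
        (nasal "[\x114BF\x114C0]")
        (virama "\x114C2"))
    (concat consonant nukta "?\\(?:" virama
            consonant nukta "?\\)*\\(?:"
            virama "\\|" vowel "*" nukta "?"
            nasal "?\\)"))
  "Regexp of the consonant-based rule (illustration only).")

;; Regexp of the nasal-vowel rule, built as in the report.
(defconst my-tirhuta-nasal-vowel-rule
  (let ((independent-vowel "[\x11481-\x1148E]")
        (nasal "[\x114BF\x114C0]"))
    (concat independent-vowel nasal "?"))
  "Regexp of the nasal-vowel rule (illustration only).")

;; Hypothetical helper, not part of Emacs.
(defun my-rule-matches-p (regexp cluster)
  "Return non-nil if REGEXP matches at the very start of CLUSTER."
  (string-match-p (concat "\\`\\(?:" regexp "\\)") cluster))

;; An independent vowel followed by a nasal sign, as in the report.
(my-rule-matches-p my-tirhuta-consonant-rule "𑒆𑒿")
(my-rule-matches-p my-tirhuta-nasal-vowel-rule "𑒆𑒿")
```

The first form should return nil, because the cluster does not start
with a consonant, and the second should return non-nil.  That mirrors
the three cases above: a rule list whose regexps cannot match the text
composes nothing, and it also hides the default rule for combining
accents once the nasal signs are inside its range.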