unofficial mirror of bug-gnu-emacs@gnu.org 
From: Eli Zaretskii <eliz@gnu.org>
To: "समीर सिंह Sameer Singh" <lumarzeli30@gmail.com>
Cc: 55370@debbugs.gnu.org
Subject: bug#55370: [PATCH] Add support for the Syloti Nagri script
Date: Thu, 12 May 2022 19:29:23 +0300
Message-ID: <837d6qpvdo.fsf@gnu.org>
In-Reply-To: <CAOR1sLx6snSs6a4D0KNpR7ov-Q4rR8HXSS-8MVLFvxsZFSCN+A@mail.gmail.com> (message from समीर सिंह Sameer Singh on Thu, 12 May 2022 20:36:49 +0530)

> From: समीर सिंह Sameer Singh <lumarzeli30@gmail.com>
> Date: Thu, 12 May 2022 20:36:49 +0530
> Cc: 55370@debbugs.gnu.org
> 
> For example in tirhuta, when I do this:
> 
> ;; Tirhuta composition rules
> (let ((consonant            "[\x1148F-\x114AF]")
>       (nukta                "\x114C3")
>       (independent-vowel    "[\x11481-\x1148E]")
>       (vowel                "[\x114B0-\x114BE]")
>       (nasal                "[\x114BF\x114C0]")
>       (virama               "\x114C2"))
>   (set-char-table-range composition-function-table
>                         '(#x114B0 . #x114BE)
>                         (list (vector
>                                ;; Consonant based syllables
>                                (concat consonant nukta "?\\(?:"
>                                        virama consonant nukta "?\\)*\\(?:"
>                                        virama "\\|" vowel "*" nukta "?"
>                                        nasal "?\\)")
>                                1 'font-shape-gstring))))
> 
> Notice here, the nasal sign is not included in the range.
> And then I type: 𑒅𑓀 𑒆𑒿
> It is rendered correctly

It is rendered correctly because your rule isn't used.

The rule

                        '(#x114B0 . #x114BE)
                        (list (vector
                               ;; Consonant based syllables
                               (concat consonant nukta "?\\(?:"
                                       virama consonant nukta "?\\)*\\(?:"
                                       virama "\\|" vowel "*" nukta "?"
                                       nasal "?\\)")
                               1 'font-shape-gstring))))

says this:

  . find a character C between #x114B0 and #x114BE
  . see if the characters starting one character before C match the
    above regexp
  . if they match, compose them

But your text doesn't include any characters in the range
[\x114B0-\x114BE], so the above rule will never match anything, and
will not cause any composition.

You see the characters composed because the second character in each
pair, #x114C0 and #x114BF, is a combining accent, and for those we have
a catch-all rule in composite.el:

  (when unicode-category-table
    (let ((elt `([,(purecopy "\\c.\\c^+") 1 compose-gstring-for-graphic]
		 [nil 0 compose-gstring-for-graphic])))
      (map-char-table
       #'(lambda (key val)
	   (if (memq val '(Mn Mc Me))
	       (set-char-table-range composition-function-table key elt)))
       unicode-category-table))
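
To see whether a given character is covered by this catch-all rule,
you can look up its general category.  A small sketch, assuming the
Unicode property tables of your Emacs are available (the function
name my-mark-category-p is made up for the illustration):

```elisp
;; Hypothetical helper, not part of Emacs: non-nil if CH has one of
;; the mark categories that the catch-all rule in composite.el tests.
(defun my-mark-category-p (ch)
  (and (memq (get-char-code-property ch 'general-category)
             '(Mn Mc Me))
       t))

;; Report the category of the two Tirhuta nasal signs.
(dolist (ch '(#x114BF #x114C0))
  (message "U+%X: category %S, covered by catch-all: %S"
           ch
           (get-char-code-property ch 'general-category)
           (my-mark-category-p ch)))
```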


> But when I do:
> 
> ;; Tirhuta composition rules
> (let ((consonant            "[\x1148F-\x114AF]")
>       (nukta                "\x114C3")
>       (independent-vowel    "[\x11481-\x1148E]")
>       (vowel                "[\x114B0-\x114BE]")
>       (nasal                "[\x114BF\x114C0]")
>       (virama               "\x114C2"))
>   (set-char-table-range composition-function-table
>                         '(#x114B0 . #x114C0)
>                         (list (vector
>                                ;; Consonant based syllables
>                                (concat consonant nukta "?\\(?:"
>                                        virama consonant nukta "?\\)*\\(?:"
>                                        virama "\\|" vowel "*" nukta "?"
>                                        nasal "?\\)")
>                                1 'font-shape-gstring))))
> The range now has the nasal signs.
> And then type the above characters: 𑒅𑓀 𑒆𑒿
> They are not rendered correctly

In this case, the characters that trigger examination of the
composition rules, #x114C0 and #x114BF, _are_ in the range
'(#x114B0 . #x114C0).  However, the preceding characters, #x11484 and
#x11486, are independent vowels, and nothing in the regexp can match
an independent-vowel.  So again, the rule will never match.  Except
that now you have also replaced the default rule we have for the
combining accents, so what worked before no longer does.

> But when I include their composition rules:
> 
> ;; Tirhuta composition rules
> (let ((consonant            "[\x1148F-\x114AF]")
>       (nukta                "\x114C3")
>       (independent-vowel    "[\x11481-\x1148E]")
>       (vowel                "[\x114B0-\x114BE]")
>       (nasal                "[\x114BF\x114C0]")
>       (virama               "\x114C2"))
>   (set-char-table-range composition-function-table
>                         '(#x114B0 . #x114C0)
>                         (list (vector
>                                ;; Consonant based syllables
>                                (concat consonant nukta "?\\(?:"
>                                        virama consonant nukta "?\\)*\\(?:"
>                                        virama "\\|" vowel "*" nukta "?"
>                                        nasal "?\\)")
>                                1 'font-shape-gstring)
>                               (vector
>                                ;; Nasal vowels
>                                (concat independent-vowel nasal "?")
>                                1 'font-shape-gstring))))
> 
> They are now once more rendered correctly.

As expected, see above: now you do have a regexp that can match,
namely this one:

    (concat independent-vowel nasal "?")
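
You can try both regexps on the text outside of the display engine.
The following is only an approximation of what happens during
redisplay: it anchors each regexp at the start of a two-character
string (the character before the trigger, then the trigger itself),
and the function name my-tirhuta-match-results is made up for the
illustration.

```elisp
;; Hypothetical helper, not part of Emacs: return a list of two
;; string-match-p results for U+11486 followed by U+114BF, first for
;; the consonant-based regexp, then for the nasal-vowel regexp.
(defun my-tirhuta-match-results ()
  (let* ((consonant         "[\x1148F-\x114AF]")
         (nukta             "\x114C3")
         (independent-vowel "[\x11481-\x1148E]")
         (vowel             "[\x114B0-\x114BE]")
         (nasal             "[\x114BF\x114C0]")
         (virama            "\x114C2")
         ;; Same regexps as in the rules, with "\\`" prepended so that
         ;; they can only match at the start of the string.
         (consonant-syllable
          (concat "\\`" consonant nukta "?\\(?:"
                  virama consonant nukta "?\\)*\\(?:"
                  virama "\\|" vowel "*" nukta "?"
                  nasal "?\\)"))
         (nasal-vowel (concat "\\`" independent-vowel nasal "?"))
         ;; Independent vowel followed by a nasal sign.
         (text (string #x11486 #x114BF)))
    (list (string-match-p consonant-syllable text)
          (string-match-p nasal-vowel text))))

;; The first element is nil (the text has no leading consonant), the
;; second is 0 (the nasal-vowel regexp matches at the start).
(my-tirhuta-match-results)
```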

I hope you now understand how to fix the rules.  If not, please ask
more questions and show more examples.




