unofficial mirror of bug-gnu-emacs@gnu.org 
From: Eli Zaretskii <eliz@gnu.org>
To: Glenn Morris <rgm@gnu.org>
Cc: dr.khaled.hosny@gmail.com, behdad@behdad.org,
	33729@debbugs.gnu.org, far.nasiri.m@gmail.com,
	kaushal.modi@gmail.com
Subject: bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
Date: Mon, 17 Dec 2018 17:55:52 +0200
Message-ID: <83mup4du5z.fsf@gnu.org>
In-Reply-To: <xk4lbd9erb.fsf@fencepost.gnu.org> (message from Glenn Morris on Sun, 16 Dec 2018 19:30:00 -0500)

> From: Glenn Morris <rgm@gnu.org>
> Cc: far.nasiri.m@gmail.com,  dr.khaled.hosny@gmail.com,  behdad@behdad.org,  33729@debbugs.gnu.org,  kaushal.modi@gmail.com
> Date: Sun, 16 Dec 2018 19:30:00 -0500
> 
> > After some thinking, my conclusion is that we should import the
> > ISO 15924 database from https://unicode.org/iso15924/, use a script
> > similar to admin/unidata/blocks.awk to generate an alist from it that
> > maps Emacs script names to ISO 15924 tags, and then access that alist
> > from uni_script to get the correct script information to Harfbuzz.
> >
> > Patches implementing that are welcome.
> 
> I live to write awk scripts. I'm not 100% sure what you want, but as a
> first example, the following takes
> http://www.unicode.org/Public/UCD/latest/ucd/PropertyValueAliases.txt
> as input and outputs lines of the form "(gujr . gujarati)".
> 
> The aliases are so that the RHS matches charscript.el.
> 
> If this is not right, please clarify exactly what the inputs and output
> should be.

Thanks.

It turns out I didn't have this figured out completely, and your
proposal forced me to dig some more into the relevant parts of Unicode
and Emacs.  I found a few additional issues and considerations; for at
least some of them I'd like to hear the opinions of the Harfbuzz
developers.

Here are the issues:

 . Contrary to my original thoughts, I now tend to think that a
   separate char-table, say char-iso15924-tag-table, that maps
   character codepoints directly to the script tags, is a better
   solution:
    - it will allow a faster lookup, obviously
    - the subdivision of characters into scripts, as shown in
      Unicode's Scripts.txt, is slightly different from what
      char-script-table does, so a simple mapping from Emacs scripts
      to ISO 15924 script tags will not do.  For example, many
      characters Emacs puts into 'latin' or 'symbol' scripts are in
      the Common script according to Scripts.txt, and similarly for
      the Inherited script.  I imagine this is important for
      Harfbuzz.

 . Whether to produce the character-to-script-tag mapping using the
   UCD files, such as Scripts.txt and PropertyValueAliases.txt, or the
   canonical ISO 15924 tags from https://unicode.org/iso15924/,
   depends on whether the slight differences mentioned in
   https://www.unicode.org/reports/tr24/#Relation_To_ISO15924 matter
   for Harfbuzz.  For example, ISO 15924 has separate tags for the
   Fraktur and Gaelic varieties of the Latin script: does this
   distinction matter for Harfbuzz?

 . Does Harfbuzz handle the issues mentioned in
   https://www.unicode.org/reports/tr24/#Script_Anomalies, and in
   particular the use case of decomposed characters which yield a
   different script than their precomposed variants?  This use case is
   quite common in handling of character compositions, so it's
   important to understand its implications before we decide on the
   implementation.
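
A minimal sketch of the decomposed-character case from the last item,
in Python.  The three-entry script table is a hand-written illustration
of values found in Scripts.txt, and resolve_inherited is a hypothetical
policy (take the script of the preceding character); whether Harfbuzz
does anything like this is exactly the open question.

```python
import unicodedata

# Illustrative subset of the Unicode Script property; a real table
# would be generated from Scripts.txt, not written by hand.
SAMPLE_SCRIPTS = {
    0x0065: "Latin",      # LATIN SMALL LETTER E
    0x00E9: "Latin",      # LATIN SMALL LETTER E WITH ACUTE
    0x0301: "Inherited",  # COMBINING ACUTE ACCENT
}

def scripts_of(text):
    """Return the script of each character in TEXT, per the sample table."""
    return [SAMPLE_SCRIPTS[ord(ch)] for ch in text]

def resolve_inherited(scripts):
    """Hypothetical policy: an Inherited character takes the script
    of the character before it, if there is one."""
    resolved = []
    for script in scripts:
        if script == "Inherited" and resolved:
            script = resolved[-1]
        resolved.append(script)
    return resolved

precomposed = "\u00e9"
decomposed = unicodedata.normalize("NFD", precomposed)

print(scripts_of(precomposed))                     # ['Latin']
print(scripts_of(decomposed))                      # ['Latin', 'Inherited']
print(resolve_inherited(scripts_of(decomposed)))   # ['Latin', 'Latin']
```

The precomposed and decomposed forms are canonically equivalent, yet a
per-character lookup reports different scripts for them until the
Inherited value is resolved somewhere.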

To summarize, unless the Harfbuzz guys advise differently, I'd prefer
processing Scripts.txt and PropertyValueAliases.txt into a list
similar to the one we produce in charscript.el, then generate a
char-table from that list.
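
The processing described above could look roughly like the following
Python sketch.  It is not the awk or Lisp that would go into
admin/unidata; the embedded lines are an abbreviated sample in the
format of the two UCD files, and the function names and the sorted
range list standing in for a char-table are assumptions made for
illustration.

```python
import bisect

# Abbreviated sample in the format of Scripts.txt.
SCRIPTS_TXT = """\
0020          ; Common # Zs       SPACE
0041..005A    ; Latin # L&  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
0300..036F    ; Inherited # Mn [112] COMBINING GRAVE ACCENT..COMBINING LATIN SMALL LETTER X
0A85..0A8D    ; Gujarati # Lo   [9] GUJARATI LETTER A..GUJARATI VOWEL CANDRA E
"""

# Abbreviated sample in the format of PropertyValueAliases.txt.
ALIASES_TXT = """\
sc ; Gujr                             ; Gujarati
sc ; Latn                             ; Latin
sc ; Zinh                             ; Inherited                        ; Qaai
sc ; Zyyy                             ; Common
gc ; Lu                               ; Uppercase_Letter
"""

def parse_aliases(text):
    """Map long script names to four-letter tags, using only 'sc' rows."""
    tags = {}
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("#")[0].split(";")]
        if len(fields) >= 3 and fields[0] == "sc":
            tags[fields[2]] = fields[1]
    return tags

def parse_scripts(text, tags):
    """Return a sorted list of (first, last, tag) codepoint ranges."""
    ranges = []
    for line in text.splitlines():
        data = line.split("#")[0].strip()
        if not data:
            continue
        span, name = [f.strip() for f in data.split(";")]
        first, _, last = span.partition("..")
        ranges.append((int(first, 16), int(last or first, 16), tags[name]))
    return sorted(ranges)

def script_tag(ranges, codepoint, default="Zzzz"):
    """Binary-search RANGES for CODEPOINT; 'Zzzz' is the Unknown script."""
    starts = [first for first, _, _ in ranges]  # a real table would cache this
    i = bisect.bisect_right(starts, codepoint) - 1
    if i >= 0 and ranges[i][0] <= codepoint <= ranges[i][1]:
        return ranges[i][2]
    return default

ranges = parse_scripts(SCRIPTS_TXT, parse_aliases(ALIASES_TXT))
print(script_tag(ranges, 0x0A85))  # Gujr
print(script_tag(ranges, 0x0301))  # Zinh
```

Because the tags come straight from the Script property, Common and
Inherited characters get Zyyy and Zinh rather than whatever script
char-script-table assigns them.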

Thanks again for working on this.





