From: Eli Zaretskii <eliz@gnu.org>
To: Glenn Morris <rgm@gnu.org>
Cc: dr.khaled.hosny@gmail.com, behdad@behdad.org,
33729@debbugs.gnu.org, far.nasiri.m@gmail.com,
kaushal.modi@gmail.com
Subject: bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
Date: Mon, 17 Dec 2018 17:55:52 +0200
Message-ID: <83mup4du5z.fsf@gnu.org>
In-Reply-To: <xk4lbd9erb.fsf@fencepost.gnu.org> (message from Glenn Morris on Sun, 16 Dec 2018 19:30:00 -0500)
> From: Glenn Morris <rgm@gnu.org>
> Cc: far.nasiri.m@gmail.com, dr.khaled.hosny@gmail.com, behdad@behdad.org, 33729@debbugs.gnu.org, kaushal.modi@gmail.com
> Date: Sun, 16 Dec 2018 19:30:00 -0500
>
> > After some thinking, my conclusion is that we should import the
> > ISO 15924 database from https://unicode.org/iso15924/, use a script
> > similar to admin/unidata/blocks.awk to generate an alist from it that
> > maps Emacs script names to ISO 15924 tags, and then access that alist
> > from uni_script to get the correct script information to Harfbuzz.
> >
> > Patches implementing that are welcome.
>
> I live to write awk scripts. I'm not 100% sure what you want, but as a
> first example, the following takes
> http://www.unicode.org/Public/UCD/latest/ucd/PropertyValueAliases.txt
> as input and outputs lines of the form "(gujr . gujarati)".
>
> The aliases are so that the RHS matches charscript.el.
>
> If this is not right, please clarify exactly what the inputs and output
> should be.

Thanks.

It turns out I didn't have this figured out completely, and your
proposal forced me to dig some more into the relevant parts of Unicode
and Emacs. I found a few additional issues and considerations; for at
least some of them I'd like to hear the opinions of the Harfbuzz
developers.

Here are the issues:

  . Contrary to my original thoughts, I now tend to think that a
    separate char-table, say char-iso15924-tag-table, that maps
    character codepoints directly to the script tags, is a better
    solution:

     - it will allow a faster lookup, obviously
     - the subdivision of characters into scripts, as shown in
       Unicode's Scripts.txt, is slightly different from what
       char-script-table does, so a simple mapping from Emacs scripts
       to ISO 15924 script tags will not do. For example, many
       characters Emacs puts into the 'latin' or 'symbol' scripts are
       in the Common script according to Scripts.txt, and similarly
       for the Inherited script. I imagine this is important for
       Harfbuzz.

  . Whether to produce the character-to-script-tag mapping using the
    UCD files, such as Scripts.txt and PropertyValueAliases.txt, or the
    canonical ISO 15924 tags from https://unicode.org/iso15924/,
    depends on whether the slight differences mentioned in
    https://www.unicode.org/reports/tr24/#Relation_To_ISO15924 matter
    for Harfbuzz. For example, ISO 15924 has separate tags for the
    Fraktur and Gaelic varieties of the Latin script: does this
    distinction matter for Harfbuzz?

  . Does Harfbuzz handle the issues mentioned in
    https://www.unicode.org/reports/tr24/#Script_Anomalies, and in
    particular the use case of decomposed characters which yield a
    different script than their precomposed variants? This use case is
    quite common in the handling of character compositions, so it's
    important to understand its implications before we decide on the
    implementation.
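
The decomposed-character question in the last item can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: TOY_SCRIPTS and scripts_of are hypothetical names, and the table is hand-written to cover just the three characters involved (a real table would be generated from Scripts.txt). It shows one simple instance: U+00E9 maps to a single Latin character, while its canonical decomposition adds U+0301, whose script is Inherited.

```python
import unicodedata

# Hypothetical, hand-written lookup covering only the characters used
# below; a real table would be generated from Scripts.txt.
TOY_SCRIPTS = {
    0x0065: "Latn",  # LATIN SMALL LETTER E
    0x00E9: "Latn",  # LATIN SMALL LETTER E WITH ACUTE
    0x0301: "Zinh",  # COMBINING ACUTE ACCENT (script: Inherited)
}


def scripts_of(text):
    """Return the script tag of each character, "Zzzz" if not in the toy table."""
    return [TOY_SCRIPTS.get(ord(ch), "Zzzz") for ch in text]


precomposed = "\u00e9"
decomposed = unicodedata.normalize("NFD", precomposed)

print(scripts_of(precomposed))  # ['Latn']
print(scripts_of(decomposed))   # ['Latn', 'Zinh']
```

Whatever consumes the per-character tags has to decide what script to report for the run as a whole when an Inherited character follows a base character; that decision is what the question above is asking about.
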

To summarize, unless the Harfbuzz guys advise differently, I'd prefer
processing Scripts.txt and PropertyValueAliases.txt into a list
similar to the one we produce in charscript.el, then generate a
char-table from that list.

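
A minimal Python sketch of that processing, under stated assumptions, might look as follows. The two sample strings are hand-written lines in the format of Scripts.txt and PropertyValueAliases.txt, not the real files, and the function names (parse_aliases, parse_scripts, script_tag) are hypothetical. The actual implementation would presumably be an awk script under admin/unidata plus a char-table on the Emacs side; this only shows the data flow from the two UCD files to a codepoint-to-tag lookup.

```python
import bisect

# Hand-written sample lines in the format of Scripts.txt.
SCRIPTS_SAMPLE = """\
# codepoint or range ; long script name # comment
0020          ; Common # Zs
0041..005A    ; Latin # L&
0300..036F    ; Inherited # Mn
0A85..0A8D    ; Gujarati # Lo
"""

# Hand-written sample lines in the format of PropertyValueAliases.txt.
ALIASES_SAMPLE = """\
sc ; Gujr                             ; Gujarati
sc ; Latn                             ; Latin
sc ; Zinh                             ; Inherited                        ; Qaai
sc ; Zyyy                             ; Common
"""


def strip_comment(line):
    """Drop a trailing '#' comment and surrounding whitespace."""
    return line.split("#", 1)[0].strip()


def parse_aliases(text):
    """Map long script names to 4-letter tags, using only 'sc' lines."""
    long_to_tag = {}
    for line in text.splitlines():
        fields = [f.strip() for f in strip_comment(line).split(";")]
        if len(fields) >= 3 and fields[0] == "sc":
            long_to_tag[fields[2]] = fields[1]
    return long_to_tag


def parse_scripts(text, long_to_tag):
    """Return a sorted list of (first, last, tag) codepoint ranges."""
    ranges = []
    for line in text.splitlines():
        data = strip_comment(line)
        if not data:
            continue
        cps, name = [f.strip() for f in data.split(";")]
        first, _, last = cps.partition("..")
        lo = int(first, 16)
        hi = int(last, 16) if last else lo
        ranges.append((lo, hi, long_to_tag[name]))
    ranges.sort()
    return ranges


def script_tag(ranges, codepoint, default="Zzzz"):
    """Look up the tag for a codepoint; 'Zzzz' (Unknown) if not covered."""
    starts = [r[0] for r in ranges]
    i = bisect.bisect_right(starts, codepoint) - 1
    if i >= 0 and ranges[i][0] <= codepoint <= ranges[i][1]:
        return ranges[i][2]
    return default


ranges = parse_scripts(SCRIPTS_SAMPLE, parse_aliases(ALIASES_SAMPLE))
print(script_tag(ranges, 0x0A85))  # Gujr
print(script_tag(ranges, 0x0301))  # Zinh
```

The sorted range list plays the role that the list in charscript.el plays today; the per-codepoint lookup is what a dedicated char-table would provide directly.
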
Thanks again for working on this.