From: Khaled Hosny <dr.khaled.hosny@gmail.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: behdad@behdad.org, 33729@debbugs.gnu.org, far.nasiri.m@gmail.com,
kaushal.modi@gmail.com
Subject: bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
Date: Sat, 5 Jan 2019 23:15:14 +0200
Message-ID: <20190105211514.GB28761@macbook.localdomain>
In-Reply-To: <83imzi94tz.fsf@gnu.org>
On Mon, Dec 24, 2018 at 08:07:04PM +0200, Eli Zaretskii wrote:
> > Date: Mon, 24 Dec 2018 19:37:23 +0200
> > From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> > Cc: rgm@gnu.org, far.nasiri.m@gmail.com, behdad@behdad.org,
> > 33729@debbugs.gnu.org, kaushal.modi@gmail.com
> >
> > > Per previous discussions, we decided to use the Harfbuzz built-in
> > > methods for determining the script, since Emacs doesn't have this
> > > information, and adding it will just do the same as Harfbuzz does,
> > > i.e. find the first character whose script is not Common etc., using
> > > the UCD database. I think it was you who suggested to use the
> > > Harfbuzz built-ins in this case.
> >
> > The built-in HarfBuzz code is for getting the script for a given
> > character, but resolving characters with Common script is left to the
> > client. Suppose you have this string (upper case for RTL) ABC 123 DEF,
> > what HarfBuzz sees during shaping is three separate chunks of text ABC,
> > 123, DEF. The 123 part is all Common script characters and thus
> > hb_buffer_guess_segment_properties won’t be able to guess anything (and
> > based on the font and the script, this can cause rendering differences).
> > Emacs will have to resolve the script of Common characters before
> > applying bidi algorithm and pass that down to HarfBuzz.
>
> I'm not sure I understand: why does HarfBuzz care that 123 was in the
> middle of RTL text?
It doesn’t. What it cares about here is the correct script. Because 123
are in the middle of RTL text they will be shaped separately, and thus
hb_buffer_guess_segment_properties() will only see 123 and won’t be
able to guess the correct script for them (Arabic, Hebrew, etc.,
whatever the script of the surrounding RTL text is).
The point I’m trying to make is that script detection, even in its
simplest form, needs to be done on the text as a whole, not just on the
portion being shaped, which makes hb_buffer_guess_segment_properties()
ill-equipped for doing this as it only sees a small portion of the text
at a time.
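To make the difference concrete, here is a minimal sketch in Python. It
is not HarfBuzz or Emacs code. The script table (TOY_RANGES) is a toy
stand-in for the Unicode Script property, and script_of,
guess_from_segment and resolve_scripts are hypothetical names. The
"nearest preceding script, else the first following one" rule is only an
assumed, simplified heuristic for resolving Common characters.

```python
# Toy script table: a real implementation would use the Unicode
# Script property from the UCD, not these few hand-picked ranges.
TOY_RANGES = [
    (0x0041, 0x005A, "Latin"),
    (0x0061, 0x007A, "Latin"),
    (0x05D0, 0x05EA, "Hebrew"),
    (0x0620, 0x063F, "Arabic"),
    (0x0641, 0x064A, "Arabic"),
]


def script_of(ch):
    """Return the toy script of one character, or "Common"."""
    cp = ord(ch)
    for lo, hi, name in TOY_RANGES:
        if lo <= cp <= hi:
            return name
    return "Common"


def guess_from_segment(segment):
    """Guess from the segment alone: the first non-Common script wins."""
    for ch in segment:
        s = script_of(ch)
        if s != "Common":
            return s
    return "Common"  # nothing in the segment to go on


def resolve_scripts(text):
    """Resolve Common characters by looking at the whole text.

    Each Common character takes the script of the nearest preceding
    non-Common character, or of the first following one if none precedes.
    """
    scripts = [script_of(ch) for ch in text]
    # Forward pass: inherit from the preceding resolved script.
    last = None
    for i, s in enumerate(scripts):
        if s == "Common":
            if last is not None:
                scripts[i] = last
        else:
            last = s
    # Backward pass: leading Common characters take the following script.
    following = None
    for i in range(len(scripts) - 1, -1, -1):
        if scripts[i] == "Common":
            if following is not None:
                scripts[i] = following
        else:
            following = scripts[i]
    return scripts


# Hebrew letters, then digits, then Hebrew letters again.
text = "\u05d0\u05d1\u05d2 123 \u05d3\u05d4\u05d5"
digits = text[4:7]
```

With this input, guess_from_segment(digits) can only answer "Common",
while resolve_scripts(text)[4:7] assigns "Hebrew" to all three digits,
and that resolved value is what would be passed down for the run.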
> Does it need to shape 123 specially in this case?
Depending on the font, the digits might be shaped differently if the
script is, say, Arabic, e.g. by applying script-specific substitutions
that produce forms more suitable for that script.
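As a rough illustration of why the script matters even for digits, here
is a toy sketch in Python. It is not a real font or a real shaper: the
glyph names, TOY_CMAP, TOY_GSUB and shape() are all hypothetical, and
the only assumption carried over from OpenType is that substitutions are
organized per script, with a default used when the script is unknown.

```python
# Hypothetical glyph names and substitution tables, for illustration only.
TOY_CMAP = {"1": "one", "2": "two", "3": "three"}

# Substitutions keyed by script tag; "DFLT" is used when the script is
# unknown or the toy font has no table for it.
TOY_GSUB = {
    "arab": {"one": "one.arab", "two": "two.arab", "three": "three.arab"},
    "DFLT": {},
}

# Resolved script name -> script tag used as a key into TOY_GSUB.
SCRIPT_TAGS = {"Arabic": "arab"}


def shape(text, script):
    """Map characters to glyph names, then apply the script's substitutions."""
    tag = SCRIPT_TAGS.get(script, "DFLT")
    subst = TOY_GSUB.get(tag, TOY_GSUB["DFLT"])
    glyphs = [TOY_CMAP[ch] for ch in text]
    return [subst.get(g, g) for g in glyphs]
```

Here shape("123", "Arabic") selects the script-specific forms, while
shape("123", "Common") falls back to the default glyphs, which is the
kind of rendering difference an unresolved script can cause.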
> (In general, AFAIK simple characters like 123 will not even go through
> HarfBuzz, as Emacs doesn't call the shaper for characters whose entry
> in composition-function-table is nil. So I guess 123 here should
> stand for some other characters, not for literal digits? IOW, I don't
> think I understand the example very well.)
This is a bug then and needs to be fixed. All text should go through
HarfBuzz, since even so-called “simple” characters often require shaping
depending on the text and the font. If this is done as an optimization,
then it should be revisited to see whether shaping with HarfBuzz is
actually significantly slower and, if it is, to find a better way to
optimize it.
Regards,
Khaled