* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
@ 2018-12-13 20:20 Kaushal Modi
2018-12-13 20:25 ` Kaushal Modi
2018-12-14 6:45 ` Paul Eggert
0 siblings, 2 replies; 55+ messages in thread
From: Kaushal Modi @ 2018-12-13 20:20 UTC (permalink / raw)
To: 33729; +Cc: dr.khaled.hosny, behdad, far.nasiri.m
[-- Attachment #1: Type: text/plain, Size: 2136 bytes --]
Hello,
I built emacs from harfbuzz branch with harfbuzz 1.0.3 installed (RHEL 6.8).
I quickly compared Hindi and Gujarati rendering difference between emacs
built with m17n vs the new harfbuzz branch build.
With harfbuzz, it does not render the partial glyphs for Gujarati, but does
it fine for Hindi. But on the build with m17n, both Hindi and Gujarati show
that partial glyph rendered fine.
Screenshot to explain this issue: https://i.imgtc.com/md9Yz7X.png
In GNU Emacs 27.0.50 (build 2, x86_64-pc-linux-gnu, GTK+ Version 2.24.23)
of 2018-12-13
Repository revision: 981b3d292aff49452c2b5f0217b57ec1a2829a8b
Repository branch: harfbuzz
Windowing system distributor 'The X.Org Foundation', version 11.0.60900000
System Description: Red Hat Enterprise Linux Workstation release 6.8
(Santiago)
Recent messages:
Emacs version: GNU Emacs 27.0.50 (build 2, x86_64-pc-linux-gnu, GTK+
Version 2.24.23)
of 2018-12-13, built using commit 981b3d292aff49452c2b5f0217b57ec1a2829a8b.
./configure options:
--with-modules --prefix=/home/kmodi/usr_local/apps/6/emacs/harfbuzz
'--program-transform-name=s/^ctags$/ctags_emacs/' --with-harfbuzz
'CPPFLAGS=-I/home/kmodi/stowed/include -I/home/kmodi/usr_local/6/include
-I/usr/include/freetype2 -I/usr/include' 'CFLAGS=-O2 -march=native'
'LDFLAGS=-L/home/kmodi/stowed/lib -L/home/kmodi/stowed/lib64
-L/home/kmodi/usr_local/6/lib -L/home/kmodi/usr_local/6/lib64'
PKG_CONFIG_PATH=/home/kmodi/usr_local/6/lib/pkgconfig:/home/kmodi/usr_local/6/lib64/pkgconfig:/cad/adi/apps/gnu/linux/x86_64/6/lib/pkgconfig:/cad/adi/apps/gnu/linux/x86_64/6/lib64/pkgconfig:/home/kmodi/stowed/lib/pkgconfig:/usr/lib/pkgconfig:/usr/lib64/pkgconfig:/usr/share/pkgconfig:/lib/pkgconfig:/lib64/pkgconfig
Features:
XPM JPEG TIFF GIF PNG RSVG IMAGEMAGICK SOUND GPM DBUS GSETTINGS GLIB
NOTIFY INOTIFY ACL LIBSELINUX GNUTLS LIBXML2 FREETYPE HARFBUZZ M17N_FLT
LIBOTF XFT ZLIB TOOLKIT_SCROLL_BARS GTK2 X11 XDBE XIM MODULES THREADS GMP
Important settings:
value of $LANG: en_US.UTF-8
value of $XMODIFIERS: @im=none
locale-coding-system: utf-8-unix
--
Kaushal Modi
[-- Attachment #2: Type: text/html, Size: 2520 bytes --]
^ permalink raw reply [flat|nested] 55+ messages in thread
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-13 20:20 bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n) Kaushal Modi
@ 2018-12-13 20:25 ` Kaushal Modi
2018-12-13 20:31 ` Khaled Hosny
2018-12-14 6:45 ` Paul Eggert
1 sibling, 1 reply; 55+ messages in thread
From: Kaushal Modi @ 2018-12-13 20:25 UTC (permalink / raw)
To: 33729; +Cc: dr.khaled.hosny, behdad, far.nasiri.m
[-- Attachment #1: Type: text/plain, Size: 221 bytes --]
>
> Screenshot to explain this issue: https://i.imgtc.com/md9Yz7X.png
>
I don't know Arabic. But from that same screenshot, it's evident that the
rendering of that same text is quite different between m17n and harfbuzz.
[-- Attachment #2: Type: text/html, Size: 564 bytes --]
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-13 20:25 ` Kaushal Modi
@ 2018-12-13 20:31 ` Khaled Hosny
2018-12-13 20:43 ` Kaushal Modi
0 siblings, 1 reply; 55+ messages in thread
From: Khaled Hosny @ 2018-12-13 20:31 UTC (permalink / raw)
To: Kaushal Modi; +Cc: behdad, 33729, far.nasiri.m
On Thu, Dec 13, 2018 at 03:25:16PM -0500, Kaushal Modi wrote:
> >
> > Screenshot to explain this issue: https://i.imgtc.com/md9Yz7X.png
> >
>
> I don't know Arabic. But from that same screenshot, it's evident that the
> rendering of that same text is quite different between m17n and harfbuzz.
The HarfBuzz rendering of Arabic is the correct one in this screenshot.
For debugging such rendering differences, the actual font used by
Emacs for a given part of the text needs to be known, then the text and
the font can be checked against vanilla HarfBuzz (e.g. using the hb-view
command line tool); if it gives the same rendering then it is either a
HarfBuzz or font issue, if not then it is a bug in the HarfBuzz
integration code in Emacs.
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-13 20:31 ` Khaled Hosny
@ 2018-12-13 20:43 ` Kaushal Modi
2018-12-13 20:53 ` Khaled Hosny
2018-12-14 5:57 ` Eli Zaretskii
0 siblings, 2 replies; 55+ messages in thread
From: Kaushal Modi @ 2018-12-13 20:43 UTC (permalink / raw)
To: dr.khaled.hosny; +Cc: behdad, 33729, far.nasiri.m
[-- Attachment #1: Type: text/plain, Size: 6423 bytes --]
On Thu, Dec 13, 2018 at 3:31 PM Khaled Hosny <dr.khaled.hosny@gmail.com>
wrote:
>
> The HarfBuzz rendering of Arabic is the correct one in this screenshot.
>
Thanks. So here's the status so far:
Rendering of Namaste as seen in C-h h (M-x view-hello-file):
|          | harfbuzz | m17n    |
|----------+----------+---------|
| Hindi    | correct  | correct |
| Gujarati | wrong    | correct |
| Arabic   | correct  | wrong   |
> For debugging such rendering differences, the actual font used by
> Emacs for a given part of the text needs to be known,
I am using the Mukta Vaani font for Gujarati. It is a free font and can be
downloaded from https://ektype.in/mukta-vaani.html.
The string being rendered is "નમસ્તે".
By placing the cursor on each of those characters and doing C-u C-x = (on the
m17n build), I get:
(1) ન
position: 1610 of 3509 (46%), column: 32
character: ન (displayed as ન) (codepoint 2728, #o5250, #xaa8)
charset: mule-unicode-0100-24ff (Unicode characters of the
range U+0100..U+24FF.)
code point in charset: 0x3968
script: gujarati
syntax: w which means: word
category: .:Base, L:Left-to-right (strong)
to input: type "C-x 8 RET aa8" or "C-x 8 RET GUJARATI LETTER
NA"
buffer code: #xE0 #xAA #xA8
file code: #xE0 #xAA #xA8 (encoded by coding system utf-8-unix)
display: by this font (glyph code)
xft:-unknown-Mukta Vaani-normal-normal-normal-*-18-*-*-*-*-0-iso10646-1
(#x234)
Character code properties: customize what to show
name: GUJARATI LETTER NA
general-category: Lo (Letter, Other)
decomposition: (2728) ('ન')
There are text properties here:
charset mule-unicode-0100-24ff
(2) મ
position: 1611 of 3509 (46%), column: 33
character: મ (displayed as મ) (codepoint 2734, #o5256, #xaae)
charset: mule-unicode-0100-24ff (Unicode characters of the
range U+0100..U+24FF.)
code point in charset: 0x396E
script: gujarati
syntax: w which means: word
category: .:Base, L:Left-to-right (strong)
to input: type "C-x 8 RET aae" or "C-x 8 RET GUJARATI LETTER
MA"
buffer code: #xE0 #xAA #xAE
file code: #xE0 #xAA #xAE (encoded by coding system utf-8-unix)
display: by this font (glyph code)
xft:-unknown-Mukta Vaani-normal-normal-normal-*-18-*-*-*-*-0-iso10646-1
(#x239)
Character code properties: customize what to show
name: GUJARATI LETTER MA
general-category: Lo (Letter, Other)
decomposition: (2734) ('મ')
There are text properties here:
charset mule-unicode-0100-24ff
(3) સ્તે
position: 1612 of 3509 (46%), column: 34
character: સ (displayed as સ) (codepoint 2744, #o5270, #xab8)
charset: mule-unicode-0100-24ff (Unicode characters of the
range U+0100..U+24FF.)
code point in charset: 0x3978
script: gujarati
syntax: w which means: word
category: .:Base, L:Left-to-right (strong)
to input: type "C-x 8 RET ab8" or "C-x 8 RET GUJARATI LETTER
SA"
buffer code: #xE0 #xAA #xB8
file code: #xE0 #xAA #xB8 (encoded by coding system utf-8-unix)
display: composed to form "સ્તે" (see below)
Composed with the following character(s) "્તે" using this font:
xft:-unknown-Mukta Vaani-normal-normal-normal-*-18-*-*-*-*-0-iso10646-1
by these glyphs:
[0 3 0 645 8 0 11 11 0 [0 0 8]]
[0 3 2724 560 11 1 11 11 1 nil]
[0 3 2759 589 0 -9 -2 16 -11 [-1 0 0]]
Character code properties: customize what to show
name: GUJARATI LETTER SA
general-category: Lo (Letter, Other)
decomposition: (2744) ('સ')
There are text properties here:
charset mule-unicode-0100-24ff
=====
On the harfbuzz build, the "સ્તે" part is different. I can place the cursor
separately on સ્ and તે, do C-u C-x = and I get:
(3.1) સ્
position: 1612 of 3509 (46%), column: 34
character: સ (displayed as સ) (codepoint 2744, #o5270, #xab8)
charset: mule-unicode-0100-24ff (Unicode characters of the
range U+0100..U+24FF.)
code point in charset: 0x3978
script: gujarati
syntax: w which means: word
category: .:Base, L:Left-to-right (strong)
to input: type "C-x 8 RET ab8" or "C-x 8 RET GUJARATI LETTER
SA"
buffer code: #xE0 #xAA #xB8
file code: #xE0 #xAA #xB8 (encoded by coding system utf-8-unix)
display: by this font (glyph code)
xft:-unknown-Mukta Vaani-normal-normal-normal-*-18-*-*-*-*-0-iso10646-1
(#x241)
Character code properties: customize what to show
name: GUJARATI LETTER SA
general-category: Lo (Letter, Other)
decomposition: (2744) ('સ')
There are text properties here:
charset mule-unicode-0100-24ff
(3.2) તે
position: 1614 of 3509 (46%), column: 35
character: ત (displayed as ત) (codepoint 2724, #o5244, #xaa4)
charset: mule-unicode-0100-24ff (Unicode characters of the
range U+0100..U+24FF.)
code point in charset: 0x3964
script: gujarati
syntax: w which means: word
category: .:Base, L:Left-to-right (strong)
to input: type "C-x 8 RET aa4" or "C-x 8 RET GUJARATI LETTER
TA"
buffer code: #xE0 #xAA #xA4
file code: #xE0 #xAA #xA4 (encoded by coding system utf-8-unix)
display: by this font (glyph code)
xft:-unknown-Mukta Vaani-normal-normal-normal-*-18-*-*-*-*-0-iso10646-1
(#x230)
Character code properties: customize what to show
name: GUJARATI LETTER TA
general-category: Lo (Letter, Other)
decomposition: (2724) ('ત')
There are text properties here:
charset mule-unicode-0100-24ff
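For reference, the greeting is six code points, and the cluster that m17n
composes at position 1612 covers four of them (SA, VIRAMA, TA, VOWEL SIGN E),
which is consistent with the harfbuzz build reporting TA at position 1614.
The following is a small Python sketch, not part of the original report, that
lists the code points and their UTF-8 bytes so they can be compared with the
describe-char output above. The names NAMASTE and describe are made up for
this illustration.

```python
import unicodedata

# The Gujarati greeting from etc/HELLO, spelled with escapes so that the
# code points are unambiguous regardless of the terminal or editor.
NAMASTE = "\u0aa8\u0aae\u0ab8\u0acd\u0aa4\u0ac7"


def describe(text):
    """Return (code point, UTF-8 bytes, general category) per character."""
    return [
        (
            f"U+{ord(ch):04X}",
            ch.encode("utf-8").hex(" ").upper(),
            unicodedata.category(ch),
        )
        for ch in text
    ]


for row in describe(NAMASTE):
    print(*row)
```

The two entries with category Mn (the virama and the vowel sign) are the
combining marks that have to be shaped together with their base letters.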
> then the text and
> the font can be checked against vanilla HarfBuzz (e.g. using the hb-view
> command line tool); if it gives the same rendering then it is either a
> HarfBuzz or font issue, if not then it is a bug in the HarfBuzz
> integration code in Emacs.
>
[-- Attachment #2: Type: text/html, Size: 8788 bytes --]
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-13 20:43 ` Kaushal Modi
@ 2018-12-13 20:53 ` Khaled Hosny
2018-12-13 21:04 ` Kaushal Modi
2018-12-14 5:57 ` Eli Zaretskii
1 sibling, 1 reply; 55+ messages in thread
From: Khaled Hosny @ 2018-12-13 20:53 UTC (permalink / raw)
To: Kaushal Modi; +Cc: behdad, 33729, far.nasiri.m
On Thu, Dec 13, 2018 at 03:43:50PM -0500, Kaushal Modi wrote:
> On Thu, Dec 13, 2018 at 3:31 PM Khaled Hosny <dr.khaled.hosny@gmail.com>
> wrote:
>
> >
> > The HarfBuzz rendering of Arabic is the correct one in this screenshot.
> >
>
> Thanks. So here's the status so far:
>
> Rendering of Namaste as seen in C-h h (M-x view-hello-file):
>
> |          | harfbuzz | m17n    |
> |----------+----------+---------|
> | Hindi    | correct  | correct |
> | Gujarati | wrong    | correct |
> | Arabic   | correct  | wrong   |
>
>
>
> > For debugging such rendering differences, the actual font used by
> > Emacs for a given part of the text needs to be known,
>
>
> I am using the Mukta Vaani font for Gujarati. It is a free font and can be
> downloaded from https://ektype.in/mukta-vaani.html.
>
> The string being rendered is "નમસ્તે".
I tried that font and text with hb-view and the output I get is
identical to m17n. If I pass a wrong script to HarfBuzz (e.g.
--script=latn), I get the same broken output you see in Emacs. So I’m
guessing something is not correctly working in script itemization. Most
likely the FIXME in uni_script(), or the FIXME above the call to
hb_buffer_guess_segment_properties().
Regards,
Khaled
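The comparison described above can be scripted. The following Python sketch
only builds the hb-view command lines and does not run them. The font file
name MuktaVaani-Regular.ttf and the output file names are assumptions for
illustration, and hb_view_command is a hypothetical helper, not part of
HarfBuzz. Leaving out --script lets hb-view guess the script from the text,
while --script=latn reproduces the deliberately wrong case.

```python
# Gujarati "namaste", written with escapes to avoid terminal encoding issues.
TEXT = "\u0aa8\u0aae\u0ab8\u0acd\u0aa4\u0ac7"

# Assumed local file name for the downloaded font.
FONT = "MuktaVaani-Regular.ttf"


def hb_view_command(font_file, text, script=None, output_file=None):
    """Build an argv list for hb-view; nothing is executed here."""
    argv = ["hb-view"]
    if script is not None:
        argv.append(f"--script={script}")
    if output_file is not None:
        argv.append(f"--output-file={output_file}")
    argv += [font_file, text]
    return argv


# One run with the script guessed, one with a deliberately wrong script.
guessed = hb_view_command(FONT, TEXT, output_file="guessed.png")
forced_latin = hb_view_command(FONT, TEXT, script="latn", output_file="latn.png")
print(guessed)
print(forced_latin)
```

Each list can be passed to subprocess.run(argv, check=True) on a machine
where hb-view is installed, and the two images compared by eye.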
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-13 20:53 ` Khaled Hosny
@ 2018-12-13 21:04 ` Kaushal Modi
0 siblings, 0 replies; 55+ messages in thread
From: Kaushal Modi @ 2018-12-13 21:04 UTC (permalink / raw)
To: dr.khaled.hosny; +Cc: behdad, 33729, far.nasiri.m
[-- Attachment #1: Type: text/plain, Size: 1481 bytes --]
On Thu, Dec 13, 2018 at 3:53 PM Khaled Hosny <dr.khaled.hosny@gmail.com>
wrote:
>
> I tried that font and text with hb-view and the output I get is
> identical to m17n.
hb-view is nifty! I wasn't sure if it would work for me (because I haven't
set my terminal to show unicode, etc.). But even with the older Harfbuzz
1.0.3 that I have, hb-view gave this: https://i.imgtc.com/d1N177Z.png
I am impressed. That shows the correct rendering of નમસ્તે. (I just blindly
pasted નમસ્તે as the second argument and hit enter, my terminal doesn't
even show the pasted text. But the hb-view rendering is correct.)
> If I pass a wrong script to HarfBuzz (e.g.
> --script=latn), I get the same broken output you see in Emacs. So I’m
> guessing something is not correctly working in script itemization. Most
> likely the FIXME in uni_script(), or the FIXME above the call to
> hb_buffer_guess_segment_properties().
>
I am not a C developer. But hopefully this information would help you to
fix the Harfbuzz integration with Emacs.
I am surprised that the rendering of Hindi नमस्ते using Harfbuzz in Emacs
is correct, while the rendering of Gujarati નમસ્તે is not, when in fact
the two scripts are so similar to each other. [Fun fact: Most of Gujarati
script if superimposed with a line at the top will look like valid Hindi.
You can see that in the case of નમસ્તે vs नमस्ते :) ]
[-- Attachment #2: Type: text/html, Size: 2095 bytes --]
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-13 20:43 ` Kaushal Modi
2018-12-13 20:53 ` Khaled Hosny
@ 2018-12-14 5:57 ` Eli Zaretskii
2018-12-14 7:48 ` Eli Zaretskii
2018-12-14 7:50 ` Khaled Hosny
1 sibling, 2 replies; 55+ messages in thread
From: Eli Zaretskii @ 2018-12-14 5:57 UTC (permalink / raw)
To: Kaushal Modi; +Cc: dr.khaled.hosny, behdad, 33729, far.nasiri.m
> From: Kaushal Modi <kaushal.modi@gmail.com>
> Date: Thu, 13 Dec 2018 15:43:50 -0500
> Cc: behdad@behdad.org, 33729@debbugs.gnu.org, far.nasiri.m@gmail.com
>
> For debugging such rendering differences, the actual font used by
> Emacs for a given part of the text needs to be known,
>
> I am using the Mukta Vaani font for Gujarati. It is a free font and can be downloaded from
> https://ektype.in/mukta-vaani.html.
Your data indicates that the m17n build performs character composition
at buffer position 34, whereas the harfbuzz build does not. The
question is why.
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-13 20:20 bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n) Kaushal Modi
2018-12-13 20:25 ` Kaushal Modi
@ 2018-12-14 6:45 ` Paul Eggert
1 sibling, 0 replies; 55+ messages in thread
From: Paul Eggert @ 2018-12-14 6:45 UTC (permalink / raw)
To: 33729; +Cc: dr.khaled.hosny, behdad, Florian Beck, far.nasiri.m, Kaushal Modi
Florian Beck pointed out some examples of possible related problems when
rendering Emacs's etc/HELLO file; see:
https://lists.gnu.org/r/emacs-devel/2018-12/msg00271.html
For the names of the languages as written in those languages, Harfbuzz seems to be better for
Burmese (မြန်မာ) (where master is wrong); conversely Harfbuzz seems to be wrong
for Maldivian (ދިވެހި) (where master is better). Please see the following for
what these should look like:
https://en.wikipedia.org/wiki/File:Dhivehiscript.svg
https://en.wikipedia.org/wiki/File:Burmese_script_sample.svg
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-14 5:57 ` Eli Zaretskii
@ 2018-12-14 7:48 ` Eli Zaretskii
2018-12-14 7:50 ` Khaled Hosny
1 sibling, 0 replies; 55+ messages in thread
From: Eli Zaretskii @ 2018-12-14 7:48 UTC (permalink / raw)
To: kaushal.modi; +Cc: dr.khaled.hosny, behdad, 33729, far.nasiri.m
> Date: Fri, 14 Dec 2018 07:57:55 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: dr.khaled.hosny@gmail.com, behdad@behdad.org, 33729@debbugs.gnu.org,
> far.nasiri.m@gmail.com
>
> Your data indicates that the m17n build performs character composition
> at buffer position 34
Sorry, wrong number: I meant buffer position 1612.
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-14 5:57 ` Eli Zaretskii
2018-12-14 7:48 ` Eli Zaretskii
@ 2018-12-14 7:50 ` Khaled Hosny
2018-12-14 10:03 ` Eli Zaretskii
1 sibling, 1 reply; 55+ messages in thread
From: Khaled Hosny @ 2018-12-14 7:50 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: behdad, Kaushal Modi, 33729, far.nasiri.m
On Fri, Dec 14, 2018 at 07:57:55AM +0200, Eli Zaretskii wrote:
> > From: Kaushal Modi <kaushal.modi@gmail.com>
> > Date: Thu, 13 Dec 2018 15:43:50 -0500
> > Cc: behdad@behdad.org, 33729@debbugs.gnu.org, far.nasiri.m@gmail.com
> >
> > For debugging such rendering differences, the actual font used by
> > Emacs for a given part of the text needs to be known,
> >
> > I am using the Mukta Vaani font for Gujarati. It is a free font and can be downloaded from
> > https://ektype.in/mukta-vaani.html.
>
> Your data indicates that the m17n build performs character composition
> at buffer position 34, whereas the harfbuzz build does not. The
> question is why.
See my earlier email, most likely the culprit is the broken Emacs to
HarfBuzz script code mapping that we discussed earlier. HarfBuzz needs
to know the correct script of the text to perform shaping, and it looks
like we are passing nonsense values for certain scripts (or rather for
certain scripts we are lucky that the mapping is not broken).
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-14 7:50 ` Khaled Hosny
@ 2018-12-14 10:03 ` Eli Zaretskii
2018-12-14 11:03 ` Khaled Hosny
0 siblings, 1 reply; 55+ messages in thread
From: Eli Zaretskii @ 2018-12-14 10:03 UTC (permalink / raw)
To: Khaled Hosny; +Cc: behdad, kaushal.modi, 33729, far.nasiri.m
> Date: Fri, 14 Dec 2018 09:50:56 +0200
> From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> Cc: Kaushal Modi <kaushal.modi@gmail.com>, behdad@behdad.org,
> 33729@debbugs.gnu.org, far.nasiri.m@gmail.com
>
> > Your data indicates that the m17n build performs character composition
> > at buffer position 34, whereas the harfbuzz build does not. The
> > question is why.
>
> See my earlier email, most likely the culprit is the broken Emacs to
> HarfBuzz script code mapping that we discussed earlier. HarfBuzz needs
> to know the correct script of the text to perform shaping, and it looks
> like we are passing nonsense values for certain scripts (or rather for
> certain scripts we are lucky that the mapping is not broken).
Thanks.
I don't yet have access to a GNU/Linux system with HarfBuzz installed,
so I cannot myself debug it.
I hope Mohammad will be able to look into this and either fix it or
provide more focused and detailed analysis of what is wrong, so we
could fix it. Or maybe you could point to the problematic code and
tell more details.
FWIW, I looked at ftfont.c:uni_script, and I cannot find a problem
with it; in particular looking up in char-script-table each character
of the Gujarati welcome in HELLO yields 'gujarati', so I couldn't see
any evident Emacs issue. Or are you saying that hb_script_from_string
doesn't DTRT? Or maybe Kaushal should upgrade to a newer version of
HarfBuzz?
Thanks.
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-14 10:03 ` Eli Zaretskii
@ 2018-12-14 11:03 ` Khaled Hosny
2018-12-14 13:42 ` Eli Zaretskii
2018-12-16 14:47 ` Benjamin Riefenstahl
0 siblings, 2 replies; 55+ messages in thread
From: Khaled Hosny @ 2018-12-14 11:03 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: behdad, kaushal.modi, 33729, far.nasiri.m
On Fri, Dec 14, 2018 at 12:03:32PM +0200, Eli Zaretskii wrote:
> > Date: Fri, 14 Dec 2018 09:50:56 +0200
> > From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> > Cc: Kaushal Modi <kaushal.modi@gmail.com>, behdad@behdad.org,
> > 33729@debbugs.gnu.org, far.nasiri.m@gmail.com
> >
> > > Your data indicates that the m17n build performs character composition
> > > at buffer position 34, whereas the harfbuzz build does not. The
> > > question is why.
> >
> > See my earlier email, most likely the culprit is the broken Emacs to
> > HarfBuzz script code mapping that we discussed earlier. HarfBuzz needs
> > to know the correct script of the text to perform shaping, and it looks
> > like we are passing nonsense values for certain scripts (or rather for
> > certain scripts we are lucky that the mapping is not broken).
>
> Thanks.
>
> I don't yet have access to a GNU/Linux system with HarfBuzz installed,
> so I cannot myself debug it.
>
> I hope Mohammad will be able to look into this and either fix it or
> provide more focused and detailed analysis of what is wrong, so we
> could fix it. Or maybe you could point to the problematic code and
> tell more details.
>
> FWIW, I looked at ftfont.c:uni_script, and I cannot find a problem
> with it; in particular looking up in char-script-table each character
> of the Gujarati welcome in HELLO yields 'gujarati', so I couldn't see
> any evident Emacs issue. Or are you saying that hb_script_from_string
> doesn't DTRT? Or maybe Kaushal should upgrade to a newer version of
> HarfBuzz?
There is this FIXME:
/* FIXME: from_string wants an ISO 15924 script tag here. */
As we discussed earlier, hb_script_from_string() expects ISO 15924
script tags, but char_script_table does not provide such tags (I don’t
recall what it does provide exactly). We need a way to get ISO 15924
script tags from Emacs.
Regards,
Khaled
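One way to see why some scripts might survive the mismatch by luck is to
assume that HarfBuzz builds the tag from at most the first four characters
of the string it is given and normalises the case to "Xxxx". That is an
assumption about hb_script_from_string made for this sketch, not something
established in this thread. Under that assumption, an Emacs script name
works only when its first four letters happen to spell the ISO 15924 tag.
The Python sketch below checks this for the scripts mentioned in this bug.
The names ISO15924, tag_from_name and survives are hypothetical.

```python
# ISO 15924 tags, keyed by the Emacs script symbol names for the scripts
# discussed in this thread.
ISO15924 = {
    "devanagari": "Deva",  # Hindi
    "gujarati": "Gujr",
    "arabic": "Arab",
    "syriac": "Syrc",
    "mandaic": "Mand",
}


def tag_from_name(name):
    """ASSUMED behaviour: keep four characters, pad, normalise to Xxxx."""
    head = name[:4].ljust(4)
    return head[0].upper() + head[1:].lower()


def survives(name):
    """True if the truncated Emacs name equals the real ISO 15924 tag."""
    return tag_from_name(name) == ISO15924[name]


for name in ISO15924:
    print(f"{name:12} -> {tag_from_name(name)}  wanted {ISO15924[name]}  "
          f"{'ok' if survives(name) else 'MISMATCH'}")
```

If the assumption holds, this matches the table earlier in the thread:
devanagari and arabic come out right, gujarati becomes "Guja" rather than
"Gujr".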
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-14 11:03 ` Khaled Hosny
@ 2018-12-14 13:42 ` Eli Zaretskii
2018-12-14 15:25 ` Eli Zaretskii
2018-12-14 22:47 ` Khaled Hosny
2018-12-16 14:47 ` Benjamin Riefenstahl
1 sibling, 2 replies; 55+ messages in thread
From: Eli Zaretskii @ 2018-12-14 13:42 UTC (permalink / raw)
To: Khaled Hosny; +Cc: behdad, kaushal.modi, 33729, far.nasiri.m
> Date: Fri, 14 Dec 2018 13:03:16 +0200
> From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> Cc: kaushal.modi@gmail.com, behdad@behdad.org, 33729@debbugs.gnu.org,
> far.nasiri.m@gmail.com
>
> > FWIW, I looked at ftfont.c:uni_script, and I cannot find a problem
> > with it; in particular looking up in char-script-table each character
> > of the Gujarati welcome in HELLO yields 'gujarati', so I couldn't see
> > any evident Emacs issue. Or are you saying that hb_script_from_string
> > doesn't DTRT? Or maybe Kaushal should upgrade to a newer version of
> > HarfBuzz?
>
> There is this FIXME:
>
> /* FIXME: from_string wants an ISO 15924 script tag here. */
>
> As we discussed earlier, hb_script_from_string() expects ISO 15924
> script tags, but char_script_table does not provide such tags (I don’t
> recall what it does provide exactly). We need a way to get ISO 15924
> script tags from Emacs.
Right, I forgot about that.
So you are saying that we need to generate Gujr instead of gujarati,
is that right?
Mohammad, do you need help in coming up with a solution? There's
otf-script-alist (see fontset.el), but it goes in the opposite
direction. We could use rassq (Frassq in C) to find the OTF script
tag by its Emacs symbol (which is returned by indexing into
Vchar_script_table), by looking in otf-script-alist.
Or maybe you prefer a separate data structure, not limited to the OTF
tags?
Let me know if you need more help.
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-14 13:42 ` Eli Zaretskii
@ 2018-12-14 15:25 ` Eli Zaretskii
2018-12-17 0:30 ` Glenn Morris
2018-12-14 22:47 ` Khaled Hosny
1 sibling, 1 reply; 55+ messages in thread
From: Eli Zaretskii @ 2018-12-14 15:25 UTC (permalink / raw)
To: far.nasiri.m; +Cc: dr.khaled.hosny, behdad, 33729, kaushal.modi
> Date: Fri, 14 Dec 2018 15:42:49 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: behdad@behdad.org, kaushal.modi@gmail.com, 33729@debbugs.gnu.org,
> far.nasiri.m@gmail.com
>
> Mohammad, do you need help in coming up with a solution? There's
> otf-script-alist (see fontset.el), but it goes in the opposite
> direction. We could use rassq (Frassq in C) to find the OTF script
> tag by its Emacs symbol (which is returned by indexing into
> Vchar_script_table), by looking in otf-script-alist.
>
> Or maybe you prefer a separate data structure, not limited to the OTF
> tags?
After some thinking, my conclusion is that we should import the
ISO 15924 database from https://unicode.org/iso15924/, use a script
similar to admin/unidata/blocks.awk to generate an alist from it that
maps Emacs script names to ISO 15924 tags, and then access that alist
from uni_script to get the correct script information to Harfbuzz.
Patches implementing that are welcome.
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-14 13:42 ` Eli Zaretskii
2018-12-14 15:25 ` Eli Zaretskii
@ 2018-12-14 22:47 ` Khaled Hosny
1 sibling, 0 replies; 55+ messages in thread
From: Khaled Hosny @ 2018-12-14 22:47 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: behdad, kaushal.modi, 33729, far.nasiri.m
On Fri, Dec 14, 2018 at 03:42:49PM +0200, Eli Zaretskii wrote:
> > Date: Fri, 14 Dec 2018 13:03:16 +0200
> > From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> > Cc: kaushal.modi@gmail.com, behdad@behdad.org, 33729@debbugs.gnu.org,
> > far.nasiri.m@gmail.com
> >
> > > FWIW, I looked at ftfont.c:uni_script, and I cannot find a problem
> > > with it; in particular looking up in char-script-table each character
> > > of the Gujarati welcome in HELLO yields 'gujarati', so I couldn't see
> > > any evident Emacs issue. Or are you saying that hb_script_from_string
> > > doesn't DTRT? Or maybe Kaushal should upgrade to a newer version of
> > > HarfBuzz?
> >
> > There is this FIXME:
> >
> > /* FIXME: from_string wants an ISO 15924 script tag here. */
> >
> > As we discussed earlier, hb_script_from_string() expects ISO 15924
> > script tags, but char_script_table does not provide such tags (I don’t
> > recall what it does provide exactly). We need a way to get ISO 15924
> > script tags from Emacs.
>
> Right, I forgot about that.
>
> So you are saying that we need to generate Gujr instead of gujarati,
> is that right?
Yes (and the equivalent for all other scripts, of course).
Regards,
Khaled
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-14 11:03 ` Khaled Hosny
2018-12-14 13:42 ` Eli Zaretskii
@ 2018-12-16 14:47 ` Benjamin Riefenstahl
1 sibling, 0 replies; 55+ messages in thread
From: Benjamin Riefenstahl @ 2018-12-16 14:47 UTC (permalink / raw)
To: Khaled Hosny; +Cc: behdad, kaushal.modi, 33729, far.nasiri.m
Khaled Hosny writes:
> /* FIXME: from_string wants an ISO 15924 script tag here. */
>
> As we discussed earlier, hb_script_from_string() expects ISO 15924
> script tags, but char_script_table does not provide such tags (I don’t
> recall what it does provide exactly). We need a way to get ISO 15924
> script tags from Emacs.
The same mismatch also prevents Syriac text from actually shaping.
Syriac shaping works in m17n with the required setup in
composition-function-table and using the Meltho fonts. With Harfbuzz it
doesn't work, unless I change "syriac" to "syrc" in charscript.el, just
for testing of course.
As a success story OTOH, Mandaic, using the Noto font, works OOTB ;-)
benny
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-14 15:25 ` Eli Zaretskii
@ 2018-12-17 0:30 ` Glenn Morris
2018-12-17 15:55 ` Eli Zaretskii
0 siblings, 1 reply; 55+ messages in thread
From: Glenn Morris @ 2018-12-17 0:30 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: dr.khaled.hosny, behdad, 33729, far.nasiri.m, kaushal.modi
Eli Zaretskii wrote:
> After some thinking, my conclusion is that we should import the
> ISO 15924 database from https://unicode.org/iso15924/, use a script
> similar to admin/unidata/blocks.awk to generate an alist from it that
> maps Emacs script names to ISO 15924 tags, and then access that alist
> from uni_script to get the correct script information to Harfbuzz.
>
> Patches implementing that are welcome.
I live to write awk scripts. I'm not 100% sure what you want, but as a
first example, the following takes
http://www.unicode.org/Public/UCD/latest/ucd/PropertyValueAliases.txt
as input and outputs lines of the form "(gujr . gujarati)".
The aliases are so that the RHS matches charscript.el.
If this is not right, please clarify exactly what the inputs and output
should be.
#!/usr/bin/awk -f

## Map a Unicode script name to the name used in charscript.el.
function name2alias(name) {
    name = tolower(name)
    if (name ~ /arabic/) return "arabic"
    else if (name ~ /aramaic/) return "aramaic"
    else if (name ~ /cypriot/) return "cypriot-syllabary"
    else if (name ~ /katakana|hiragana/) return "kana"
    else if (name ~ /myanmar/) return "burmese"
    else if (name ~ /duployan|shorthand/) return "duployan-shorthand"
    else if (name ~ /signwriting/) return "sutton-sign-writing"
    sub(/^new_/, "", name)
    sub(/_(hieroglyphs|cursive)$/, "", name)
    gsub(/_/, "-", name)
    return name
}

## Script aliases look like "sc ; Gujr ; Gujarati".
$1 == "sc" {
    abbrev = tolower($3)
    alias = name2alias($5)
    ## Skip the pseudo-scripts.  The group anchors the whole alternation,
    ## not only "inherited".
    if (alias ~ /^(inherited|common|unknown)$/) next
    print "(" abbrev, ".", alias ")"
}
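To make the expected input and output concrete, here is a rough Python port
of the same idea, run on a handful of lines in the format used by
PropertyValueAliases.txt. It is only an illustration of what the awk script
is meant to produce, not a replacement for it. It splits fields on ";"
rather than on whitespace, and the function names and the SAMPLE text are
made up for this sketch.

```python
import re

# (pattern, alias) pairs mirroring the special cases in the awk script.
SPECIAL = (
    ("arabic", "arabic"),
    ("aramaic", "aramaic"),
    ("cypriot", "cypriot-syllabary"),
    ("katakana|hiragana", "kana"),
    ("myanmar", "burmese"),
    ("duployan|shorthand", "duployan-shorthand"),
    ("signwriting", "sutton-sign-writing"),
)


def name_to_alias(name):
    """Map a Unicode script name to the name used in charscript.el."""
    name = name.lower()
    for pattern, alias in SPECIAL:
        if re.search(pattern, name):
            return alias
    name = re.sub(r"^new_", "", name)
    name = re.sub(r"_(hieroglyphs|cursive)$", "", name)
    return name.replace("_", "-")


def script_pairs(lines):
    """Return "(tag . script)" strings for the "sc" lines in LINES."""
    pairs = []
    for line in lines:
        # Drop trailing comments, then split the ";"-separated fields.
        fields = [f.strip() for f in line.split("#")[0].split(";")]
        if len(fields) < 3 or fields[0] != "sc":
            continue
        alias = name_to_alias(fields[2])
        if alias in ("inherited", "common", "unknown"):
            continue
        pairs.append(f"({fields[1].lower()} . {alias})")
    return pairs


SAMPLE = """\
sc ; Gujr ; Gujarati
sc ; Mymr ; Myanmar
sc ; Zyyy ; Common
sc ; Talu ; New_Tai_Lue
gc ; Lo ; Other_Letter
"""

for pair in script_pairs(SAMPLE.splitlines()):
    print(pair)
```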
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-17 0:30 ` Glenn Morris
@ 2018-12-17 15:55 ` Eli Zaretskii
2018-12-20 18:58 ` Eli Zaretskii
2018-12-22 8:54 ` Khaled Hosny
0 siblings, 2 replies; 55+ messages in thread
From: Eli Zaretskii @ 2018-12-17 15:55 UTC (permalink / raw)
To: Glenn Morris; +Cc: dr.khaled.hosny, behdad, 33729, far.nasiri.m, kaushal.modi
> From: Glenn Morris <rgm@gnu.org>
> Cc: far.nasiri.m@gmail.com, dr.khaled.hosny@gmail.com, behdad@behdad.org, 33729@debbugs.gnu.org, kaushal.modi@gmail.com
> Date: Sun, 16 Dec 2018 19:30:00 -0500
>
> > After some thinking, my conclusion is that we should import the
> > ISO 15924 database from https://unicode.org/iso15924/, use a script
> > similar to admin/unidata/blocks.awk to generate an alist from it that
> > maps Emacs script names to ISO 15924 tags, and then access that alist
> > from uni_script to get the correct script information to Harfbuzz.
> >
> > Patches implementing that are welcome.
>
> I live to write awk scripts. I'm not 100% sure what you want, but as a
> first example, the following takes
> http://www.unicode.org/Public/UCD/latest/ucd/PropertyValueAliases.txt
> as input and outputs lines of the form "(gujr . gujarati)".
>
> The aliases are so that the RHS matches charscript.el.
>
> If this is not right, please clarify exactly what the inputs and output
> should be.
Thanks.
It turns out I didn't have this figured out completely, and your
proposal forced me to dig some more into the relevant parts of Unicode
and Emacs. I found a few additional issues and considerations; for at
least some of them I'd like to hear the opinions of the Harfbuzz
developers.
Here are the issues:
. Contrary to my original thoughts, I now tend to think that a
separate char-table, say char-iso15924-tag-table, that maps
character codepoints directly to the script tags, is a better
solution:
- it will allow a faster lookup, obviously
- the subdivision of characters into scripts, as shown in
Unicode's Scripts.txt, is slightly different from what
char-script-table does, so a simple mapping from Emacs scripts
to ISO 15924 script tag will not do. For example, many
characters Emacs puts into 'latin' or 'symbol' scripts are in
the Common script according to Scripts.txt, and similarly for
the Inherited script. I imagine this is important for
Harfbuzz.
. Whether to produce the character-to-script-tag mapping using the
UCD files, such as Scripts.txt and PropertyValueAliases.txt, or the
canonical ISO 15924 tags from https://unicode.org/iso15924/,
depends on whether the slight differences mentioned in
https://www.unicode.org/reports/tr24/#Relation_To_ISO15924 matter
for Harfbuzz. For example, ISO 15924 has separate tags for the
Fraktur and Gaelic varieties of the Latin script: does this
distinction matter for Harfbuzz?
. Does Harfbuzz handle the issues mentioned in
https://www.unicode.org/reports/tr24/#Script_Anomalies, and in
particular the use case of decomposed characters which yield a
different script than their precomposed variants? This use case is
quite common in handling of character compositions, so it's
important to understand its implications before we decide on the
implementation.
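To make the decomposed-versus-precomposed concern concrete, here is a minimal sketch of one common way a script itemizer copes with it: characters classified as Common or Inherited take the script of the nearest preceding character that has a real script. Everything in it is hypothetical (the enum, char_script and resolve_scripts are not Emacs or Harfbuzz names), and the classification is a hand-picked excerpt rather than data generated from Scripts.txt.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical script classes for this sketch only. */
enum script { SC_COMMON, SC_INHERITED, SC_LATIN, SC_GUJARATI };

/* Hand-picked excerpt, illustrative only; a real table would be
   generated from Scripts.txt.  */
static enum script
char_script (uint32_t c)
{
  if (c == 0x0301)                       /* COMBINING ACUTE ACCENT */
    return SC_INHERITED;
  if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == 0x00E9)
    return SC_LATIN;
  if (c >= 0x0A85 && c <= 0x0A8D)        /* a run of Gujarati letters */
    return SC_GUJARATI;
  return SC_COMMON;
}

/* Store one resolved script per character in OUT.  Common and
   Inherited characters take the script of the nearest preceding
   character with a real script; if there is none, they stay Common.  */
static void
resolve_scripts (const uint32_t *text, size_t n, enum script *out)
{
  enum script last = SC_COMMON;
  for (size_t i = 0; i < n; i++)
    {
      enum script s = char_script (text[i]);
      if (s == SC_COMMON || s == SC_INHERITED)
        s = last;
      else
        last = s;
      out[i] = s;
    }
}
```

With this resolution, the decomposed sequence 'e' followed by U+0301 ends up entirely in the Latin script, the same as the precomposed U+00E9, even though the raw per-character lookup says Latin followed by Inherited.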
To summarize, unless the Harfbuzz guys advise differently, I'd prefer
processing Scripts.txt and PropertyValueAliases.txt into a list
similar to the one we produce in charscript.el, then generate a
char-table from that list.
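As a rough illustration of the lookup being proposed, the sketch below maps a codepoint directly to a four-letter script tag. It is only a sketch under stated assumptions: a sorted range table with binary search stands in for an Emacs char-table, the ranges are a hand-picked excerpt rather than output generated from Scripts.txt, and SCRIPT_TAG, script_ranges and script_tag_for_char are hypothetical names. The tag packing follows the same byte layout as Harfbuzz's HB_TAG macro.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack four ASCII bytes into a 32-bit tag, first letter in the most
   significant byte (the layout HB_TAG uses).  */
#define SCRIPT_TAG(a, b, c, d)                                  \
  ((uint32_t) (((uint32_t) (a) << 24) | ((uint32_t) (b) << 16)  \
               | ((uint32_t) (c) << 8) | (uint32_t) (d)))

struct script_range
{
  uint32_t first, last;   /* inclusive codepoint range */
  uint32_t tag;           /* packed four-letter script tag */
};

/* Illustrative excerpt, sorted and non-overlapping.  A real table
   would be generated from Scripts.txt and PropertyValueAliases.txt.  */
static const struct script_range script_ranges[] = {
  { 0x0041, 0x005A, SCRIPT_TAG ('L', 'a', 't', 'n') },
  { 0x0061, 0x007A, SCRIPT_TAG ('L', 'a', 't', 'n') },
  { 0x0904, 0x0939, SCRIPT_TAG ('D', 'e', 'v', 'a') },
  { 0x0A85, 0x0A8D, SCRIPT_TAG ('G', 'u', 'j', 'r') },
};

/* Binary-search the range table.  Anything not listed falls back to
   Zyyy (Common) in this sketch; a generated table would also need
   entries for Zinh (Inherited) and Zzzz (Unknown).  */
static uint32_t
script_tag_for_char (uint32_t c)
{
  size_t lo = 0;
  size_t hi = sizeof script_ranges / sizeof script_ranges[0];
  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      if (c < script_ranges[mid].first)
        hi = mid;
      else if (c > script_ranges[mid].last)
        lo = mid + 1;
      else
        return script_ranges[mid].tag;
    }
  return SCRIPT_TAG ('Z', 'y', 'y', 'y');
}
```

For example, script_tag_for_char (0x0A85) yields the packed form of "Gujr", which is the kind of value the shaper needs for GUJARATI LETTER A, without going through an Emacs script name first.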
Thanks again for working on this.
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-17 15:55 ` Eli Zaretskii
@ 2018-12-20 18:58 ` Eli Zaretskii
2018-12-20 20:45 ` Behdad Esfahbod
2018-12-22 8:54 ` Khaled Hosny
1 sibling, 1 reply; 55+ messages in thread
From: Eli Zaretskii @ 2018-12-20 18:58 UTC (permalink / raw)
To: dr.khaled.hosny, behdad, far.nasiri.m; +Cc: 33729, kaushal.modi
Ping! Could someone on the Harfbuzz team please comment on the
thoughts below? Khaled, Mohammad, Behdad?
> Date: Mon, 17 Dec 2018 17:55:52 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: dr.khaled.hosny@gmail.com, behdad@behdad.org, 33729@debbugs.gnu.org,
> far.nasiri.m@gmail.com, kaushal.modi@gmail.com
>
> > From: Glenn Morris <rgm@gnu.org>
> > Cc: far.nasiri.m@gmail.com, dr.khaled.hosny@gmail.com, behdad@behdad.org, 33729@debbugs.gnu.org, kaushal.modi@gmail.com
> > Date: Sun, 16 Dec 2018 19:30:00 -0500
> >
> > > After some thinking, my conclusion is that we should import the
> > > ISO 15924 database from https://unicode.org/iso15924/, use a script
> > > similar to admin/unidata/blocks.awk to generate an alist from it that
> > > maps Emacs script names to ISO 15924 tags, and then access that alist
> > > from uni_script to get the correct script information to Harfbuzz.
> > >
> > > Patches implementing that are welcome.
> >
> > I live to write awk scripts. I'm not 100% sure what you want, but as a
> > first example, the following takes
> > http://www.unicode.org/Public/UCD/latest/ucd/PropertyValueAliases.txt
> > as input and outputs lines of the form "(gujr . gujarati)".
> >
> > The aliases are so that the RHS matches charscript.el.
> >
> > If this is not right, please clarify exactly what the inputs and output
> > should be.
>
> Thanks.
>
> It turns out I didn't have this figured out completely, and your
> proposal forced me to dig some more into the relevant parts of Unicode
> and Emacs. I found a few additional issues and considerations; for at
> least some of them I'd like to hear the opinions of the Harfbuzz
> developers.
>
> Here are the issues:
>
> . Contrary to my original thoughts, I now tend to think that a
> separate char-table, say char-iso15924-tag-table, that maps
> character codepoints directly to the script tags, is a better
> solution:
> - it will allow a faster look up, obviously
> - the subdivision of characters into scripts, as shown in
> Unicode's Scripts.txt, is slightly different from what
> char-script-table does, so a simple mapping from Emacs scripts
> to ISO 15924 script tag will not do. For example, many
> characters Emacs puts into 'latin' or 'symbol' scripts are in
> the Common script according to Scripts.txt, and similarly for
> the Inherited script. I imagine this is important for
> Harfbuzz.
>
> . Whether to produce the character-to-script-tag mapping using the
> UCD files, such as Scripts.txt and PropertyValueAliases.txt, or the
> canonical ISO 15924 tags from https://unicode.org/iso15924/,
> depends on whether the slight differences mentioned in
> https://www.unicode.org/reports/tr24/#Relation_To_ISO15924 matter
> for Harfbuzz. For example, ISO 15924 has separate tags for the
> Fraktur and Gaelic varieties of the Latin script: does this
> distinction matter for Harfbuzz?
>
> . Does Harfbuzz handle the issues mentioned in
> https://www.unicode.org/reports/tr24/#Script_Anomalies, and in
> particular the use case of decomposed characters which yield a
> different script than their precomposed variants? This use case is
> quite common in handling of character compositions, so it's
> important to understand its implications before we decide on the
> implementation.
>
> To summarize, unless the Harfbuzz guys advise differently, I'd prefer
> processing Scripts.txt and PropertyValueAliases.txt into a list
> similar to the one we produce in charscript.el, then generate a
> char-table from that list.
>
> Thanks again for working on this.
>
>
>
>
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-20 18:58 ` Eli Zaretskii
@ 2018-12-20 20:45 ` Behdad Esfahbod
0 siblings, 0 replies; 55+ messages in thread
From: Behdad Esfahbod @ 2018-12-20 20:45 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: Khaled Hosny, 33729, Mohammad Nasirifar, kaushal.modi
Sounds good to me.
On Thu, Dec 20, 2018 at 1:58 PM Eli Zaretskii <eliz@gnu.org> wrote:
> Ping! Could someone on the Harfbuzz team please comment on the
> thoughts below? Khaled, Mohammad, Behdad?
>
> > Date: Mon, 17 Dec 2018 17:55:52 +0200
> > From: Eli Zaretskii <eliz@gnu.org>
> > Cc: dr.khaled.hosny@gmail.com, behdad@behdad.org, 33729@debbugs.gnu.org,
> > far.nasiri.m@gmail.com, kaushal.modi@gmail.com
> >
> > > From: Glenn Morris <rgm@gnu.org>
> > > Cc: far.nasiri.m@gmail.com, dr.khaled.hosny@gmail.com,
> behdad@behdad.org, 33729@debbugs.gnu.org, kaushal.modi@gmail.com
> > > Date: Sun, 16 Dec 2018 19:30:00 -0500
> > >
> > > > After some thinking, my conclusion is that we should import the
> > > > ISO 15924 database from https://unicode.org/iso15924/, use a script
> > > > similar to admin/unidata/blocks.awk to generate an alist from it that
> > > > maps Emacs script names to ISO 15924 tags, and then access that alist
> > > > from uni_script to get the correct script information to Harfbuzz.
> > > >
> > > > Patches implementing that are welcome.
> > >
> > > I live to write awk scripts. I'm not 100% sure what you want, but as a
> > > first example, the following takes
> > > http://www.unicode.org/Public/UCD/latest/ucd/PropertyValueAliases.txt
> > > as input and outputs lines of the form "(gujr . gujarati)".
> > >
> > > The aliases are so that the RHS matches charscript.el.
> > >
> > > If this is not right, please clarify exactly what the inputs and output
> > > should be.
> >
> > Thanks.
> >
> > It turns out I didn't have this figured out completely, and your
> > proposal forced me to dig some more into the relevant parts of Unicode
> > and Emacs. I found a few additional issues and considerations; for at
> > least some of them I'd like to hear the opinions of the Harfbuzz
> > developers.
> >
> > Here are the issues:
> >
> > . Contrary to my original thoughts, I now tend to think that a
> > separate char-table, say char-iso15924-tag-table, that maps
> > character codepoints directly to the script tags, is a better
> > solution:
> > - it will allow a faster look up, obviously
> > - the subdivision of characters into scripts, as shown in
> > Unicode's Scripts.txt, is slightly different from what
> > char-script-table does, so a simple mapping from Emacs scripts
> > to ISO 15924 script tag will not do. For example, many
> > characters Emacs puts into 'latin' or 'symbol' scripts are in
> > the Common script according to Scripts.txt, and similarly for
> > the Inherited script. I imagine this is important for
> > Harfbuzz.
> >
> > . Whether to produce the character-to-script-tag mapping using the
> > UCD files, such as Scripts.txt and PropertyValueAliases.txt, or the
> > canonical ISO 15924 tags from https://unicode.org/iso15924/,
> > depends on whether the slight differences mentioned in
> > https://www.unicode.org/reports/tr24/#Relation_To_ISO15924 matter
> > for Harfbuzz. For example, ISO 15924 has separate tags for the
> > Fraktur and Gaelic varieties of the Latin script: does this
> > distinction matter for Harfbuzz?
> >
> > . Does Harfbuzz handle the issues mentioned in
> > https://www.unicode.org/reports/tr24/#Script_Anomalies, and in
> > particular the use case of decomposed characters which yield a
> > different script than their precomposed variants? This use case is
> > quite common in handling of character compositions, so it's
> > important to understand its implications before we decide on the
> > implementation.
> >
> > To summarize, unless the Harfbuzz guys advise differently, I'd prefer
> > processing Scripts.txt and PropertyValueAliases.txt into a list
> > similar to the one we produce in charscript.el, then generate a
> > char-table from that list.
> >
> > Thanks again for working on this.
> >
> >
> >
> >
>
--
behdad
http://behdad.org/
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-17 15:55 ` Eli Zaretskii
2018-12-20 18:58 ` Eli Zaretskii
@ 2018-12-22 8:54 ` Khaled Hosny
2018-12-22 9:06 ` Khaled Hosny
1 sibling, 1 reply; 55+ messages in thread
From: Khaled Hosny @ 2018-12-22 8:54 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: behdad, 33729, far.nasiri.m, kaushal.modi
On Mon, Dec 17, 2018 at 05:55:52PM +0200, Eli Zaretskii wrote:
> > From: Glenn Morris <rgm@gnu.org>
> > Cc: far.nasiri.m@gmail.com, dr.khaled.hosny@gmail.com, behdad@behdad.org, 33729@debbugs.gnu.org, kaushal.modi@gmail.com
> > Date: Sun, 16 Dec 2018 19:30:00 -0500
> >
> > > After some thinking, my conclusion is that we should import the
> > > ISO 15924 database from https://unicode.org/iso15924/, use a script
> > > similar to admin/unidata/blocks.awk to generate an alist from it that
> > > maps Emacs script names to ISO 15924 tags, and then access that alist
> > > from uni_script to get the correct script information to Harfbuzz.
> > >
> > > Patches implementing that are welcome.
> >
> > I live to write awk scripts. I'm not 100% sure what you want, but as a
> > first example, the following takes
> > http://www.unicode.org/Public/UCD/latest/ucd/PropertyValueAliases.txt
> > as input and outputs lines of the form "(gujr . gujarati)".
> >
> > The aliases are so that the RHS matches charscript.el.
> >
> > If this is not right, please clarify exactly what the inputs and output
> > should be.
>
> Thanks.
>
> It turns out I didn't have this figured out completely, and your
> proposal forced me to dig some more into the relevant parts of Unicode
> and Emacs. I found a few additional issues and considerations; for at
> least some of them I'd like to hear the opinions of the Harfbuzz
> developers.
>
> Here are the issues:
>
> . Contrary to my original thoughts, I now tend to think that a
> separate char-table, say char-iso15924-tag-table, that maps
> character codepoints directly to the script tags, is a better
> solution:
> - it will allow a faster look up, obviously
> - the subdivision of characters into scripts, as shown in
> Unicode's Scripts.txt, is slightly different from what
> char-script-table does, so a simple mapping from Emacs scripts
> to ISO 15924 script tag will not do. For example, many
> characters Emacs puts into 'latin' or 'symbol' scripts are in
> the Common script according to Scripts.txt, and similarly for
> the Inherited script. I imagine this is important for
> Harfbuzz.
Alternatively, we could just use HarfBuzz’s own built in ucdn-based
Unicode function for this. The only reason for overriding this in Emacs
was to keep HarfBuzz and Emacs Unicode support in sync, but if we are
going to duplicate the Unicode script data then better use what HarfBuzz
has.
I’m going to try this now.
> . Whether to produce the character-to-script-tag mapping using the
> UCD files, such as Scripts.txt and PropertyValueAliases.txt, or the
> canonical ISO 15924 tags from https://unicode.org/iso15924/,
> depends on whether the slight differences mentioned in
> https://www.unicode.org/reports/tr24/#Relation_To_ISO15924 matter
> for Harfbuzz. For example, ISO 15924 has separate tags for the
> Fraktur and Gaelic varieties of the Latin script: does this
> distinction matter for Harfbuzz?
We want the UCD data.
Regards,
Khaled
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-22 8:54 ` Khaled Hosny
@ 2018-12-22 9:06 ` Khaled Hosny
2018-12-22 10:11 ` Eli Zaretskii
2018-12-24 17:38 ` Benjamin Riefenstahl
0 siblings, 2 replies; 55+ messages in thread
From: Khaled Hosny @ 2018-12-22 9:06 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: behdad, 33729, far.nasiri.m, kaushal.modi
On Sat, Dec 22, 2018 at 10:54:48AM +0200, Khaled Hosny wrote:
> On Mon, Dec 17, 2018 at 05:55:52PM +0200, Eli Zaretskii wrote:
> > > From: Glenn Morris <rgm@gnu.org>
> > > Cc: far.nasiri.m@gmail.com, dr.khaled.hosny@gmail.com, behdad@behdad.org, 33729@debbugs.gnu.org, kaushal.modi@gmail.com
> > > Date: Sun, 16 Dec 2018 19:30:00 -0500
> > >
> > > > After some thinking, my conclusion is that we should import the
> > > > ISO 15924 database from https://unicode.org/iso15924/, use a script
> > > > similar to admin/unidata/blocks.awk to generate an alist from it that
> > > > maps Emacs script names to ISO 15924 tags, and then access that alist
> > > > from uni_script to get the correct script information to Harfbuzz.
> > > >
> > > > Patches implementing that are welcome.
> > >
> > > I live to write awk scripts. I'm not 100% sure what you want, but as a
> > > first example, the following takes
> > > http://www.unicode.org/Public/UCD/latest/ucd/PropertyValueAliases.txt
> > > as input and outputs lines of the form "(gujr . gujarati)".
> > >
> > > The aliases are so that the RHS matches charscript.el.
> > >
> > > If this is not right, please clarify exactly what the inputs and output
> > > should be.
> >
> > Thanks.
> >
> > It turns out I didn't have this figured out completely, and your
> > proposal forced me to dig some more into the relevant parts of Unicode
> > and Emacs. I found a few additional issues and considerations; for at
> > least some of them I'd like to hear the opinions of the Harfbuzz
> > developers.
> >
> > Here are the issues:
> >
> > . Contrary to my original thoughts, I now tend to think that a
> > separate char-table, say char-iso15924-tag-table, that maps
> > character codepoints directly to the script tags, is a better
> > solution:
> > - it will allow a faster look up, obviously
> > - the subdivision of characters into scripts, as shown in
> > Unicode's Scripts.txt, is slightly different from what
> > char-script-table does, so a simple mapping from Emacs scripts
> > to ISO 15924 script tag will not do. For example, many
> > characters Emacs puts into 'latin' or 'symbol' scripts are in
> > the Common script according to Scripts.txt, and similarly for
> > the Inherited script. I imagine this is important for
> > Harfbuzz.
>
> Alternatively, we could just use HarfBuzz’s own built in ucdn-based
> Unicode function for this. The only reason for overriding this in Emacs
> was to keep HarfBuzz and Emacs Unicode support in sync, but if we are
> going to duplicate the Unicode script data then better use what HarfBuzz
> has.
>
> I’m going to try this now.
I pushed a commit to harfbuzz branch that I think fixes this issue now.
Regards,
Khaled
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-22 9:06 ` Khaled Hosny
@ 2018-12-22 10:11 ` Eli Zaretskii
2018-12-22 15:15 ` Khaled Hosny
2018-12-24 17:38 ` Benjamin Riefenstahl
1 sibling, 1 reply; 55+ messages in thread
From: Eli Zaretskii @ 2018-12-22 10:11 UTC (permalink / raw)
To: Khaled Hosny; +Cc: behdad, 33729, far.nasiri.m, kaushal.modi
> Date: Sat, 22 Dec 2018 11:06:44 +0200
> From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> Cc: Glenn Morris <rgm@gnu.org>, far.nasiri.m@gmail.com, behdad@behdad.org,
> 33729@debbugs.gnu.org, kaushal.modi@gmail.com
>
> > Alternatively, we could just use HarfBuzz’s own built in ucdn-based
> > Unicode function for this. The only reason for overriding this in Emacs
> > was to keep HarfBuzz and Emacs Unicode support in sync, but if we are
> > going to duplicate the Unicode script data then better use what HarfBuzz
> > has.
> >
> > I’m going to try this now.
>
> I pushed a commit to harfbuzz branch that I think fixes this issue now.
Thanks.
There's a FIXME in the change you pushed (which I believe just
repeats what was already in the previous version). Can you tell more
about the problem we need to fix there?
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-22 10:11 ` Eli Zaretskii
@ 2018-12-22 15:15 ` Khaled Hosny
2018-12-22 15:27 ` Behdad Esfahbod
2018-12-22 15:42 ` Eli Zaretskii
0 siblings, 2 replies; 55+ messages in thread
From: Khaled Hosny @ 2018-12-22 15:15 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: behdad, 33729, far.nasiri.m, kaushal.modi
On Sat, Dec 22, 2018 at 12:11:15PM +0200, Eli Zaretskii wrote:
> > Date: Sat, 22 Dec 2018 11:06:44 +0200
> > From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> > Cc: Glenn Morris <rgm@gnu.org>, far.nasiri.m@gmail.com, behdad@behdad.org,
> > 33729@debbugs.gnu.org, kaushal.modi@gmail.com
> >
> > > Alternatively, we could just use HarfBuzz’s own built in ucdn-based
> > > Unicode function for this. The only reason for overriding this in Emacs
> > > was to keep HarfBuzz and Emacs Unicode support in sync, but if we are
> > > going to duplicate the Unicode script data then better use what HarfBuzz
> > > has.
> > >
> > > I’m going to try this now.
> >
> > I pushed a commit to harfbuzz branch that I think fixes this issue now.
>
> Thanks.
>
> There's a FIXME in the change you pushed (which I believe just
> repeats what was already in the previous version). Can you tell more
> about the problem we need to fix there?
We need a way to get Unicode composition and decomposition for a
given character (implementing the uni_compose and uni_decompose
functions I deleted). I recall you suggested something earlier that I
tried but couldn’t get to work, the exact detail escapes me.
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-22 15:15 ` Khaled Hosny
@ 2018-12-22 15:27 ` Behdad Esfahbod
2018-12-22 15:42 ` Khaled Hosny
2018-12-22 15:42 ` Eli Zaretskii
1 sibling, 1 reply; 55+ messages in thread
From: Behdad Esfahbod @ 2018-12-22 15:27 UTC (permalink / raw)
To: Khaled Hosny; +Cc: 33729, Mohammad Nasirifar, Kaushal Modi
I suggest you enable UCDN.
On Sat, Dec 22, 2018 at 10:15 AM Khaled Hosny <dr.khaled.hosny@gmail.com>
wrote:
> On Sat, Dec 22, 2018 at 12:11:15PM +0200, Eli Zaretskii wrote:
> > > Date: Sat, 22 Dec 2018 11:06:44 +0200
> > > From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> > > Cc: Glenn Morris <rgm@gnu.org>, far.nasiri.m@gmail.com,
> behdad@behdad.org,
> > > 33729@debbugs.gnu.org, kaushal.modi@gmail.com
> > >
> > > > Alternatively, we could just use HarfBuzz’s own built in ucdn-based
> > > > Unicode function for this. The only reason for overriding this in
> Emacs
> > > > was to keep HarfBuzz and Emacs Unicode support in sync, but if we are
> > > > going to duplicate the Unicode script data then better use what
> HarfBuzz
> > > > has.
> > > >
> > > > I’m going to try this now.
> > >
> > > I pushed a commit to harfbuzz branch that I think fixes this issue now.
> >
> > Thanks.
> >
> > There's a FIXME in the change you pushed (which I believe just
> > repeats what was already in the previous version). Can you tell more
> > about the problem we need to fix there?
>
> We need a way to get Unicode composition and decomposition for a
> given character (implementing the uni_compose and uni_decompose
> functions I deleted). I recall you suggested something earlier that I
> tried but couldn’t get to work, the exact detail escapes me.
>
--
behdad
http://behdad.org/
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-22 15:15 ` Khaled Hosny
2018-12-22 15:27 ` Behdad Esfahbod
@ 2018-12-22 15:42 ` Eli Zaretskii
2018-12-22 15:49 ` Khaled Hosny
1 sibling, 1 reply; 55+ messages in thread
From: Eli Zaretskii @ 2018-12-22 15:42 UTC (permalink / raw)
To: Khaled Hosny; +Cc: behdad, 33729, far.nasiri.m, kaushal.modi
> Date: Sat, 22 Dec 2018 17:15:09 +0200
> From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> Cc: rgm@gnu.org, far.nasiri.m@gmail.com, behdad@behdad.org,
> 33729@debbugs.gnu.org, kaushal.modi@gmail.com
>
> > There's a FIXME in the change you pushed (which I believe just
> > repeats what was already in the previous version). Can you tell more
> > about the problem we need to fix there?
>
> We need a way to get Unicode composition and decomposition for a
> given character (implementing the uni_compose and uni_decompose
> functions I deleted).
Yes, but what does that entail? Are these compositions and
decompositions defined by the Unicode UCD? And how does Harfbuzz use
the results for a given character?
> I recall you suggested something earlier that I tried but couldn’t
> get to work, the exact detail escapes me.
I probably suggested using the 'decomposition' property of a
character, and perhaps also the facilities in ucs-normalize.el.
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-22 15:27 ` Behdad Esfahbod
@ 2018-12-22 15:42 ` Khaled Hosny
0 siblings, 0 replies; 55+ messages in thread
From: Khaled Hosny @ 2018-12-22 15:42 UTC (permalink / raw)
To: Behdad Esfahbod; +Cc: 33729, Mohammad Nasirifar, Kaushal Modi
I’m sub-classing the default Unicode functions, so for any callback we
don’t implement, the default implementation is already used.
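In HarfBuzz itself this is done by creating the function table with hb_unicode_funcs_create, passing the default table as its parent. The sketch below only imitates that fallback pattern in plain C so it can be read without the library: struct unicode_funcs, the lookup helpers and the choice of which callback to override are all hypothetical, not the code on the branch.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*unicode_fn) (uint32_t c);

/* Hypothetical function table: a NULL slot means "not overridden,
   ask the parent".  */
struct unicode_funcs
{
  const struct unicode_funcs *parent;
  unicode_fn script;
  unicode_fn mirroring;
};

/* Stand-ins for the library's built-in behavior.  */
static uint32_t
default_script (uint32_t c)
{
  (void) c;
  return 1;   /* placeholder script id */
}

static uint32_t
default_mirroring (uint32_t c)
{
  return c;   /* placeholder: no mirroring */
}

/* Stand-in for an application override.  */
static uint32_t
app_mirroring (uint32_t c)
{
  if (c == '(')
    return ')';
  if (c == ')')
    return '(';
  return c;
}

static const struct unicode_funcs default_funcs
  = { NULL, default_script, default_mirroring };

/* The subclass overrides mirroring only; script is left to the parent.  */
static const struct unicode_funcs app_funcs
  = { &default_funcs, NULL, app_mirroring };

/* Walk up the parent chain until some table implements the callback.  */
static uint32_t
lookup_script (const struct unicode_funcs *f, uint32_t c)
{
  for (; f != NULL; f = f->parent)
    if (f->script != NULL)
      return f->script (c);
  return 0;
}

static uint32_t
lookup_mirroring (const struct unicode_funcs *f, uint32_t c)
{
  for (; f != NULL; f = f->parent)
    if (f->mirroring != NULL)
      return f->mirroring (c);
  return c;
}
```

Here lookup_script (&app_funcs, c) ends up in default_script, which is the effect described above: whatever the subclass leaves unset is answered by the default implementation.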
On Sat, Dec 22, 2018 at 10:27:06AM -0500, Behdad Esfahbod wrote:
> I suggest you enable UCDN.
>
> On Sat, Dec 22, 2018 at 10:15 AM Khaled Hosny <dr.khaled.hosny@gmail.com>
> wrote:
>
> > On Sat, Dec 22, 2018 at 12:11:15PM +0200, Eli Zaretskii wrote:
> > > > Date: Sat, 22 Dec 2018 11:06:44 +0200
> > > > From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> > > > Cc: Glenn Morris <rgm@gnu.org>, far.nasiri.m@gmail.com,
> > behdad@behdad.org,
> > > > 33729@debbugs.gnu.org, kaushal.modi@gmail.com
> > > >
> > > > > Alternatively, we could just use HarfBuzz’s own built in ucdn-based
> > > > > Unicode function for this. The only reason for overriding this in
> > Emacs
> > > > > was to keep HarfBuzz and Emacs Unicode support in sync, but if we are
> > > > > going to duplicate the Unicode script data then better use what
> > HarfBuzz
> > > > > has.
> > > > >
> > > > > I’m going to try this now.
> > > >
> > > > I pushed a commit to harfbuzz branch that I think fixes this issue now.
> > >
> > > Thanks.
> > >
> > > There's a FIXME in the change you pushed (which I believe just
> > > repeats what was already in the previous version). Can you tell more
> > > about the problem we need to fix there?
> >
> > We need a way to get Unicode composition and decomposition for a
> > given character (implementing the uni_compose and uni_decompose
> > functions I deleted). I recall you suggested something earlier that I
> > tried but couldn’t get to work, the exact detail escapes me.
> >
>
>
> --
> behdad
> http://behdad.org/
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-22 15:42 ` Eli Zaretskii
@ 2018-12-22 15:49 ` Khaled Hosny
2018-12-22 16:33 ` Eli Zaretskii
2018-12-22 19:38 ` Eli Zaretskii
0 siblings, 2 replies; 55+ messages in thread
From: Khaled Hosny @ 2018-12-22 15:49 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: behdad, 33729, far.nasiri.m, kaushal.modi
On Sat, Dec 22, 2018 at 05:42:43PM +0200, Eli Zaretskii wrote:
> > Date: Sat, 22 Dec 2018 17:15:09 +0200
> > From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> > Cc: rgm@gnu.org, far.nasiri.m@gmail.com, behdad@behdad.org,
> > 33729@debbugs.gnu.org, kaushal.modi@gmail.com
> >
> > > There's a FIXME in the change you pushed (which I believe just
> > > repeats what was already in the previous version). Can you tell more
> > > about the problem we need to fix there?
> >
> > We need a way to get Unicode composition and decomposition for a
> > given character (implementing the uni_compose and uni_decompose
> > functions I deleted).
>
> Yes, but what does that entail? Are these compositions and
> decompositions defined by the Unicode UCD? And how does Harfbuzz use
> the results for a given character?
Yes, the standard Unicode composition and decomposition. HarfBuzz uses
these during shaping (it prefers composed form for a given sequence if
supported by the font, and falls back to decomposed form otherwise).
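A minimal sketch of that preference, assuming a single base-plus-mark pair and a font that can only answer "do you have a glyph for this codepoint?". The names compose_pair, choose_form and the two sample fonts are hypothetical, and the one-entry composition table is illustrative; this is not HarfBuzz's normalizer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical font query: does the font have a glyph for C?  */
typedef bool (*has_glyph_fn) (uint32_t c);

/* Illustrative one-entry composition table:
   'e' + COMBINING ACUTE ACCENT -> U+00E9.  */
static bool
compose_pair (uint32_t a, uint32_t b, uint32_t *ab)
{
  if (a == 'e' && b == 0x0301)
    {
      *ab = 0x00E9;
      return true;
    }
  return false;
}

/* Prefer the composed form when the font supports it, otherwise keep
   the decomposed sequence.  Write the chosen codepoints to OUT and
   return how many were written (1 or 2).  */
static int
choose_form (uint32_t base, uint32_t mark, has_glyph_fn has_glyph,
             uint32_t out[2])
{
  uint32_t composed;
  if (compose_pair (base, mark, &composed) && has_glyph (composed))
    {
      out[0] = composed;
      return 1;
    }
  out[0] = base;
  out[1] = mark;
  return 2;
}

/* Two sample fonts for trying the function out.  */
static bool
font_with_eacute (uint32_t c)
{
  return c == 'e' || c == 0x0301 || c == 0x00E9;
}

static bool
font_without_eacute (uint32_t c)
{
  return c == 'e' || c == 0x0301;
}
```

With font_with_eacute the pair collapses to U+00E9; with font_without_eacute the shaper is left with 'e' and U+0301 and has to position the mark itself.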
> > I recall you suggested something earlier that I tried but couldn’t
> > get to work, the exact detail escapes me.
>
> I probably suggested using the 'decomposition' property of a
> character, and perhaps also the facilities in ucs-normalize.el.
How can this be done from C?
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-22 15:49 ` Khaled Hosny
@ 2018-12-22 16:33 ` Eli Zaretskii
2018-12-22 19:38 ` Eli Zaretskii
1 sibling, 0 replies; 55+ messages in thread
From: Eli Zaretskii @ 2018-12-22 16:33 UTC (permalink / raw)
To: Khaled Hosny; +Cc: behdad, 33729, far.nasiri.m, kaushal.modi
> Date: Sat, 22 Dec 2018 17:49:45 +0200
> From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> Cc: rgm@gnu.org, far.nasiri.m@gmail.com, behdad@behdad.org,
> 33729@debbugs.gnu.org, kaushal.modi@gmail.com
>
> Yes, the standard Unicode composition and decomposition. HarfBuzz uses
> these during shaping (it prefers composed form for a given sequence if
> supported by the font, and falls back to decomposed form otherwise).
>
> > > I recall you suggested something earlier that I tried but couldn’t
> > > get to work, the exact detail escapes me.
> >
> > I probably suggested using the 'decomposition' property of a
> > character, and perhaps also the facilities in ucs-normalize.el.
>
> How can this be done from C?
There are several examples in the sources of calling Lisp from C. As
just a random example:
if (STRINGP (curdir))
val = call1 (intern ("file-remote-p"), curdir);
This calls the Lisp function file-remote-p with one argument, curdir.
If you tell me more about the arguments and the expected effects of
calling uni_compose and uni_decompose, maybe I could propose a
specific implementation. The Harfbuzz documentation doesn't seem to
tell enough, or maybe I didn't find the right text.
Thanks.
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-22 15:49 ` Khaled Hosny
2018-12-22 16:33 ` Eli Zaretskii
@ 2018-12-22 19:38 ` Eli Zaretskii
2018-12-22 20:59 ` Khaled Hosny
1 sibling, 1 reply; 55+ messages in thread
From: Eli Zaretskii @ 2018-12-22 19:38 UTC (permalink / raw)
To: Khaled Hosny; +Cc: behdad, 33729, far.nasiri.m, kaushal.modi
> Date: Sat, 22 Dec 2018 17:49:45 +0200
> From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> Cc: rgm@gnu.org, far.nasiri.m@gmail.com, behdad@behdad.org,
> 33729@debbugs.gnu.org, kaushal.modi@gmail.com
>
> Yes, the standard Unicode composition and decomposition. HarfBuzz uses
> these during shaping (it prefers composed form for a given sequence if
> supported by the font, and falls back to decomposed form otherwise).
Btw, how is this problem solved in the other projects that use
Harfbuzz? Does every project need to provide this functionality, or
does Harfbuzz have it built-in, like with the script tags? If there's
built-in support for this, perhaps Emacs could just use that?
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-22 19:38 ` Eli Zaretskii
@ 2018-12-22 20:59 ` Khaled Hosny
2018-12-23 3:34 ` Eli Zaretskii
0 siblings, 1 reply; 55+ messages in thread
From: Khaled Hosny @ 2018-12-22 20:59 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: behdad, 33729, far.nasiri.m, kaushal.modi
On Sat, Dec 22, 2018 at 09:38:43PM +0200, Eli Zaretskii wrote:
> > Date: Sat, 22 Dec 2018 17:49:45 +0200
> > From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> > Cc: rgm@gnu.org, far.nasiri.m@gmail.com, behdad@behdad.org,
> > 33729@debbugs.gnu.org, kaushal.modi@gmail.com
> >
> > Yes, the standard Unicode composition and decomposition. HarfBuzz uses
> > these during shaping (it prefers composed form for a given sequence if
> > supported by the font, and falls back to decomposed form otherwise).
>
> Btw, how is this problem solved in the other projects that use
> Harfbuzz? Does every project need to provide this functionality, or
> does Harfbuzz have it built-in, like with the script tags? If there's
> built-in support for this, perhaps Emacs could just use that?
There is built-in support, and currently we are using that. I can just
remove the FIXME.
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-22 20:59 ` Khaled Hosny
@ 2018-12-23 3:34 ` Eli Zaretskii
2018-12-23 13:51 ` Khaled Hosny
0 siblings, 1 reply; 55+ messages in thread
From: Eli Zaretskii @ 2018-12-23 3:34 UTC (permalink / raw)
To: Khaled Hosny; +Cc: behdad, 33729, far.nasiri.m, kaushal.modi
> Date: Sat, 22 Dec 2018 22:59:48 +0200
> From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> Cc: rgm@gnu.org, far.nasiri.m@gmail.com, behdad@behdad.org,
> 33729@debbugs.gnu.org, kaushal.modi@gmail.com
>
> On Sat, Dec 22, 2018 at 09:38:43PM +0200, Eli Zaretskii wrote:
> > > Date: Sat, 22 Dec 2018 17:49:45 +0200
> > > From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> > > Cc: rgm@gnu.org, far.nasiri.m@gmail.com, behdad@behdad.org,
> > > 33729@debbugs.gnu.org, kaushal.modi@gmail.com
> > >
> > > Yes, the standard Unicode composition and decomposition. HarfBuzz uses
> > > these during shaping (it prefers composed form for a given sequence if
> > > supported by the font, and falls back to decomposed form otherwise).
> >
> > Btw, how is this problem solved in the other projects that use
> > HarfBuzz? Does every project need to provide this functionality, or
> > does HarfBuzz have it built-in, like with the script tags? If there's
> > built-in support for this, perhaps Emacs could just use that?
>
> There is built-in support, and currently we are using that. I can just
> remove the FIXME.
Are there any disadvantages in using the built-in support? I mean,
why did you envision an Emacs-specific implementation in the first
place?
Thanks.
^ permalink raw reply [flat|nested] 55+ messages in thread
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-23 3:34 ` Eli Zaretskii
@ 2018-12-23 13:51 ` Khaled Hosny
2018-12-23 16:00 ` Eli Zaretskii
0 siblings, 1 reply; 55+ messages in thread
From: Khaled Hosny @ 2018-12-23 13:51 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: behdad, 33729, far.nasiri.m, kaushal.modi
On Sun, Dec 23, 2018 at 05:34:04AM +0200, Eli Zaretskii wrote:
> > Date: Sat, 22 Dec 2018 22:59:48 +0200
> > From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> > Cc: rgm@gnu.org, far.nasiri.m@gmail.com, behdad@behdad.org,
> > 33729@debbugs.gnu.org, kaushal.modi@gmail.com
> >
> > On Sat, Dec 22, 2018 at 09:38:43PM +0200, Eli Zaretskii wrote:
> > > > Date: Sat, 22 Dec 2018 17:49:45 +0200
> > > > From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> > > > Cc: rgm@gnu.org, far.nasiri.m@gmail.com, behdad@behdad.org,
> > > > 33729@debbugs.gnu.org, kaushal.modi@gmail.com
> > > >
> > > > Yes, the standard Unicode composition and decomposition. HarfBuzz uses
> > > > these during shaping (it prefers composed form for a given sequence if
> > > > supported by the font, and falls back to decomposed form otherwise).
> > >
> > > Btw, how is this problem solved in the other projects that use
> > > HarfBuzz? Does every project need to provide this functionality, or
> > > does HarfBuzz have it built-in, like with the script tags? If there's
> > > built-in support for this, perhaps Emacs could just use that?
> >
> > There is built-in support, and currently we are using that. I can just
> > remove the FIXME.
>
> Are there any disadvantages in using the built-in support? I mean,
> why did you envision an Emacs-specific implementation in the first
> place?
I thought, but I might be mistaken, that Emacs allows changing these
character properties at runtime and someone might possibly want to use
that to change some character property (e.g. make some PUA character a
combining mark) and it would then be nice if HarfBuzz respected that. I
admit that is a very niche thing, if possible at all, and I’m more than
happy to let HarfBuzz use its default Unicode functions and simplify the
Emacs integration code.
Regards,
Khaled
^ permalink raw reply [flat|nested] 55+ messages in thread
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-23 13:51 ` Khaled Hosny
@ 2018-12-23 16:00 ` Eli Zaretskii
2018-12-24 2:08 ` Khaled Hosny
0 siblings, 1 reply; 55+ messages in thread
From: Eli Zaretskii @ 2018-12-23 16:00 UTC (permalink / raw)
To: Khaled Hosny; +Cc: behdad, 33729, far.nasiri.m, kaushal.modi
> Date: Sun, 23 Dec 2018 15:51:09 +0200
> From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> Cc: rgm@gnu.org, far.nasiri.m@gmail.com, behdad@behdad.org,
> 33729@debbugs.gnu.org, kaushal.modi@gmail.com
>
> > Are there any disadvantages in using the built-in support? I mean,
> > why did you envision an Emacs-specific implementation in the first
> > place?
>
> I thought, but I might be mistaken, that Emacs allows changing these
> character properties at runtime and someone might possibly want to use
> that to change some character property (e.g. make some PUA character a
> combining mark) and it would then be nice if HarfBuzz respected that. I
> admit that is a very niche thing, if possible at all, and I’m more than
> happy to let HarfBuzz use its default Unicode functions and simplify the
> Emacs integration code.
Right, I agree that we should for now leave that to HarfBuzz. It
could be added later as an optional feature. (I don't expect many
users to want to modify the Unicode character properties.)
Thanks.
^ permalink raw reply [flat|nested] 55+ messages in thread
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-23 16:00 ` Eli Zaretskii
@ 2018-12-24 2:08 ` Khaled Hosny
2018-12-24 4:12 ` Kaushal Modi
2018-12-24 16:10 ` Eli Zaretskii
0 siblings, 2 replies; 55+ messages in thread
From: Khaled Hosny @ 2018-12-24 2:08 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: behdad, 33729, far.nasiri.m, kaushal.modi
On Sun, Dec 23, 2018 at 06:00:58PM +0200, Eli Zaretskii wrote:
> > Date: Sun, 23 Dec 2018 15:51:09 +0200
> > From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> > Cc: rgm@gnu.org, far.nasiri.m@gmail.com, behdad@behdad.org,
> > 33729@debbugs.gnu.org, kaushal.modi@gmail.com
> >
> > > Are there any disadvantages in using the built-in support? I mean,
> > > why did you envision an Emacs-specific implementation in the first
> > > place?
> >
> > I thought, but I might be mistaken, that Emacs allows changing these
> > character properties at runtime and someone might possibly want to use
> > that to change some character property (e.g. make some PUA character a
> > combining mark) and it would then be nice if HarfBuzz respected that. I
> > admit that is a very niche thing, if possible at all, and I’m more than
> > happy to let HarfBuzz use its default Unicode functions and simplify the
> > Emacs integration code.
>
> Right, I agree that we should for now leave that to HarfBuzz. It
> could be added later as an optional feature. (I don't expect many
> users to want to modify the Unicode character properties.)
I think we are almost good now. There is only one serious FIXME left:
  /* FIXME: guess_segment_properties is BAD BAD BAD.
   * we need to get these properties with the LGSTRING. */
#if 1
  hb_buffer_guess_segment_properties (hb_buffer);
#else
  hb_buffer_set_direction (hb_buffer, XXX);
  hb_buffer_set_script (hb_buffer, XXX);
  hb_buffer_set_language (hb_buffer, XXX);
#endif
We need to know, for a given lgstring we are shaping:
* Its direction (from applying the bidi algorithm). Each lgstring we are
  shaping must be of a single direction.
* Its script, possibly after applying something like:
  http://unicode.org/reports/tr24/#Common
* Its language, if Emacs allows setting text language (my understanding is
  that it doesn’t). Some languages really need this for applying
  language-specific features (Urdu digits, Serbian alternate glyphs, etc.).
^ permalink raw reply [flat|nested] 55+ messages in thread
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-24 2:08 ` Khaled Hosny
@ 2018-12-24 4:12 ` Kaushal Modi
2018-12-24 16:10 ` Eli Zaretskii
1 sibling, 0 replies; 55+ messages in thread
From: Kaushal Modi @ 2018-12-24 4:12 UTC (permalink / raw)
To: Khaled Hosny; +Cc: Behdad Esfahbod, 33729, Mohammad Nasirifar
[-- Attachment #1: Type: text/plain, Size: 421 bytes --]
On Sun, Dec 23, 2018 at 9:08 PM Khaled Hosny <dr.khaled.hosny@gmail.com>
wrote:
> I think we are almost good now.
>
Thanks for working on this!
I confirm that this particular issue related to rendering compound glyphs
in Gujarati script is fixed.
Proof: https://i.imgtc.com/XYSM5fE.png
@Eli I see that discussion related to this fix is on-going. So feel free to
mark this bug as DONE when you see fit.
Thanks all!
[-- Attachment #2: Type: text/html, Size: 911 bytes --]
^ permalink raw reply [flat|nested] 55+ messages in thread
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-24 2:08 ` Khaled Hosny
2018-12-24 4:12 ` Kaushal Modi
@ 2018-12-24 16:10 ` Eli Zaretskii
2018-12-24 17:37 ` Khaled Hosny
1 sibling, 1 reply; 55+ messages in thread
From: Eli Zaretskii @ 2018-12-24 16:10 UTC (permalink / raw)
To: Khaled Hosny; +Cc: behdad, 33729, far.nasiri.m, kaushal.modi
> Date: Mon, 24 Dec 2018 04:08:47 +0200
> From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> Cc: rgm@gnu.org, far.nasiri.m@gmail.com, behdad@behdad.org,
> 33729@debbugs.gnu.org, kaushal.modi@gmail.com
>
> I think we are almost good now. There is only one serious FIXME left:
>
> /* FIXME: guess_segment_properties is BAD BAD BAD.
> * we need to get these properties with the LGSTRING. */
> #if 1
> hb_buffer_guess_segment_properties (hb_buffer);
> #else
> hb_buffer_set_direction (hb_buffer, XXX);
> hb_buffer_set_script (hb_buffer, XXX);
> hb_buffer_set_language (hb_buffer, XXX);
> #endif
>
> We need to know, for a given lgstring we are shaping:
> * Its direction (from applying bidi algorithm). Each lgstring we are
> shaping must be of a single direction.
Communicating this to ftfont_shape_by_hb will need changes in a couple
of interfaces (the existing shaping engines didn't need this
information). I will work on this soon.
> * Its script, possibly after applying something like:
> http://unicode.org/reports/tr24/#Common
Per previous discussions, we decided to use the Harfbuzz built-in
methods for determining the script, since Emacs doesn't have this
information, and adding it will just do the same as Harfbuzz does,
i.e. find the first character whose script is not Common etc., using
the UCD database. I think it was you who suggested to use the
Harfbuzz built-ins in this case.
> * Its language, if Emacs allows setting text language (my understanding is
> that it doesn’t). Some languages really need this for applying
> language-specific features (Urdu digits, Serbian alternate glyphs, etc.).
We don't currently have a language property for chunks of text, we
only have the current global language setting determined from the
locale (and there's a command to change that for Emacs, should the
user want it). This is not really appropriate for multilingual
buffers, but we will have to use that for now, and hope that in the
future, infrastructure will be added to allow more flexible
determination of the language of each run of text. (I see that
Harfbuzz already looks at the locale for its default language, but
since Emacs allows user control of this, however unlikely, I think
it's best to use the value Emacs uses.) I will work on this as well.
Thanks.
^ permalink raw reply [flat|nested] 55+ messages in thread
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-24 16:10 ` Eli Zaretskii
@ 2018-12-24 17:37 ` Khaled Hosny
2018-12-24 18:07 ` Eli Zaretskii
2018-12-29 14:49 ` Eli Zaretskii
0 siblings, 2 replies; 55+ messages in thread
From: Khaled Hosny @ 2018-12-24 17:37 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: behdad, 33729, far.nasiri.m, kaushal.modi
On Mon, Dec 24, 2018 at 06:10:49PM +0200, Eli Zaretskii wrote:
> > Date: Mon, 24 Dec 2018 04:08:47 +0200
> > From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> > Cc: rgm@gnu.org, far.nasiri.m@gmail.com, behdad@behdad.org,
> > 33729@debbugs.gnu.org, kaushal.modi@gmail.com
> >
> > I think we are almost good now. There is only one serious FIXME left:
> >
> > /* FIXME: guess_segment_properties is BAD BAD BAD.
> > * we need to get these properties with the LGSTRING. */
> > #if 1
> > hb_buffer_guess_segment_properties (hb_buffer);
> > #else
> > hb_buffer_set_direction (hb_buffer, XXX);
> > hb_buffer_set_script (hb_buffer, XXX);
> > hb_buffer_set_language (hb_buffer, XXX);
> > #endif
> >
> > We need to know, for a given lgstring we are shaping:
> > * Its direction (from applying bidi algorithm). Each lgstring we are
> > shaping must be of a single direction.
>
> Communicating this to ftfont_shape_by_hb will need changes in a couple
> of interfaces (the existing shaping engines didn't need this
> information). I will work on this soon.
Great.
> > * Its script, possibly after applying something like:
> > http://unicode.org/reports/tr24/#Common
>
> Per previous discussions, we decided to use the Harfbuzz built-in
> methods for determining the script, since Emacs doesn't have this
> information, and adding it will just do the same as Harfbuzz does,
> i.e. find the first character whose script is not Common etc., using
> the UCD database. I think it was you who suggested to use the
> Harfbuzz built-ins in this case.
The built-in HarfBuzz code is for getting the script for a given
character, but resolving characters with Common script is left to the
client. Suppose you have this string (upper case for RTL) ABC 123 DEF,
what HarfBuzz sees during shaping is three separate chunks of text ABC,
123, DEF. The 123 part is all Common script characters and thus
hb_buffer_guess_segment_properties won’t be able to guess anything (and
based on the font and the script, this can cause rendering differences).
Emacs will have to resolve the script of Common characters before
applying bidi algorithm and pass that down to HarfBuzz.
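To illustrate the point about Common characters, here is a minimal sketch of resolving them over the whole text before it is split into runs. It is only an illustration under stated assumptions: the script classifier is a toy covering a few ranges (real code would consult the Unicode Script property), the names toy_script, toy_char_script and resolve_common are hypothetical, and the "inherit from the preceding strong character" rule is a simplification of what UAX #24 discusses, not what Emacs or HarfBuzz actually implements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical script tags for this sketch; not HarfBuzz's hb_script_t.  */
typedef enum { SC_COMMON, SC_LATIN, SC_HEBREW, SC_ARABIC } toy_script;

/* Toy classifier: only a few letter ranges are recognized, and
   everything else (digits, spaces, punctuation) counts as Common.  */
toy_script
toy_char_script (uint32_t c)
{
  if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z'))
    return SC_LATIN;
  if (c >= 0x05D0 && c <= 0x05EA)   /* Hebrew letters alef..tav */
    return SC_HEBREW;
  if (c >= 0x0621 && c <= 0x063A)   /* Arabic letters hamza..ghain */
    return SC_ARABIC;
  return SC_COMMON;
}

/* Fill OUT[0..LEN-1] with a script for every character of TEXT.
   Common characters inherit the script of the nearest preceding
   non-Common character; leading Common characters take the script of
   the first non-Common character that follows them.  */
void
resolve_common (const uint32_t *text, size_t len, toy_script *out)
{
  toy_script last = SC_COMMON;
  size_t i;

  assert (len == 0 || (text != NULL && out != NULL));

  /* Forward pass: inherit from the left.  */
  for (i = 0; i < len; i++)
    {
      toy_script s = toy_char_script (text[i]);
      if (s != SC_COMMON)
        last = s;
      out[i] = last;
    }

  /* Backward pass: only a leading Common prefix is still unresolved.  */
  last = SC_COMMON;
  for (i = len; i-- > 0; )
    {
      if (out[i] != SC_COMMON)
        last = out[i];
      else
        out[i] = last;
    }
}
```

With this, the digits in a string like ABC 123 DEF (Hebrew letters around ASCII digits) come out tagged as Hebrew, so the run containing only 123 could be handed to the shaper with an explicit script instead of relying on guessing.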
> > * Its language, if Emacs allows setting text language (my understanding is
> > that it doesn’t). Some languages really need this for applying
> > language-specific features (Urdu digits, Serbian alternate glyphs, etc.).
>
> We don't currently have a language property for chunks of text, we
> only have the current global language setting determined from the
> locale (and there's a command to change that for Emacs, should the
> user want it). This is not really appropriate for multilingual
> buffers, but we will have to use that for now, and hope that in the
> future, infrastructure will be added to allow more flexible
> determination of the language of each run of text. (I see that
> Harfbuzz already looks at the locale for its default language, but
> since Emacs allows user control of this, however unlikely, I think
> it's best to use the value Emacs uses.) I will work on this as well.
Yes, better pass that from Emacs to HarfBuzz.
Regards,
Khaled
^ permalink raw reply [flat|nested] 55+ messages in thread
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-22 9:06 ` Khaled Hosny
2018-12-22 10:11 ` Eli Zaretskii
@ 2018-12-24 17:38 ` Benjamin Riefenstahl
1 sibling, 0 replies; 55+ messages in thread
From: Benjamin Riefenstahl @ 2018-12-24 17:38 UTC (permalink / raw)
To: Khaled Hosny; +Cc: far.nasiri.m, behdad, 33729, kaushal.modi
Khaled Hosny writes:
> I pushed a commit to harfbuzz branch that I think fixes this issue now.
I can confirm that this fixes the issue with Syriac.
Thanks,
benny
^ permalink raw reply [flat|nested] 55+ messages in thread
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-24 17:37 ` Khaled Hosny
@ 2018-12-24 18:07 ` Eli Zaretskii
2019-01-05 21:15 ` Khaled Hosny
2018-12-29 14:49 ` Eli Zaretskii
1 sibling, 1 reply; 55+ messages in thread
From: Eli Zaretskii @ 2018-12-24 18:07 UTC (permalink / raw)
To: Khaled Hosny; +Cc: behdad, 33729, far.nasiri.m, kaushal.modi
> Date: Mon, 24 Dec 2018 19:37:23 +0200
> From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> Cc: rgm@gnu.org, far.nasiri.m@gmail.com, behdad@behdad.org,
> 33729@debbugs.gnu.org, kaushal.modi@gmail.com
>
> > Per previous discussions, we decided to use the Harfbuzz built-in
> > methods for determining the script, since Emacs doesn't have this
> > information, and adding it will just do the same as Harfbuzz does,
> > i.e. find the first character whose script is not Common etc., using
> > the UCD database. I think it was you who suggested to use the
> > Harfbuzz built-ins in this case.
>
> The built-in HarfBuzz code is for getting the script for a given
> character, but resolving characters with Common script is left to the
> client. Suppose you have this string (upper case for RTL) ABC 123 DEF,
> what HarfBuzz sees during shaping is three separate chunks of text ABC,
> 123, DEF. The 123 part is all Common script characters and thus
> hb_buffer_guess_segment_properties won’t be able to guess anything (and
> based on the font and the script, this can cause rendering differences).
> Emacs will have to resolve the script of Common characters before
> applying bidi algorithm and pass that down to HarfBuzz.
I'm not sure I understand: why does HarfBuzz care that 123 was in the
middle of RTL text? Does it need to shape 123 specially in this case?
(In general, AFAIK simple characters like 123 will not even go through
HarfBuzz, as Emacs doesn't call the shaper for characters whose entry
in composition-function-table is nil. So I guess 123 here should
stand for some other characters, not for literal digits? IOW, I don't
think I understand the example very well.)
^ permalink raw reply [flat|nested] 55+ messages in thread
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-24 17:37 ` Khaled Hosny
2018-12-24 18:07 ` Eli Zaretskii
@ 2018-12-29 14:49 ` Eli Zaretskii
2019-01-05 20:53 ` Khaled Hosny
1 sibling, 1 reply; 55+ messages in thread
From: Eli Zaretskii @ 2018-12-29 14:49 UTC (permalink / raw)
To: Khaled Hosny; +Cc: behdad, far.nasiri.m, 33729
> Date: Mon, 24 Dec 2018 19:37:23 +0200
> From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> Cc: rgm@gnu.org, far.nasiri.m@gmail.com, behdad@behdad.org,
> 33729@debbugs.gnu.org, kaushal.modi@gmail.com
>
> On Mon, Dec 24, 2018 at 06:10:49PM +0200, Eli Zaretskii wrote:
> > > Date: Mon, 24 Dec 2018 04:08:47 +0200
> > > From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> > > Cc: rgm@gnu.org, far.nasiri.m@gmail.com, behdad@behdad.org,
> > > 33729@debbugs.gnu.org, kaushal.modi@gmail.com
> > >
> > > I think we are almost good now. There is only one serious FIXME left:
> > >
> > > /* FIXME: guess_segment_properties is BAD BAD BAD.
> > > * we need to get these properties with the LGSTRING. */
> > > #if 1
> > > hb_buffer_guess_segment_properties (hb_buffer);
> > > #else
> > > hb_buffer_set_direction (hb_buffer, XXX);
> > > hb_buffer_set_script (hb_buffer, XXX);
> > > hb_buffer_set_language (hb_buffer, XXX);
> > > #endif
> > >
> > > We need to know, for a given lgstring we are shaping:
> > > * Its direction (from applying bidi algorithm). Each lgstring we are
> > > shaping must be of a single direction.
> >
> > Communicating this to ftfont_shape_by_hb will need changes in a couple
> > of interfaces (the existing shaping engines didn't need this
> > information). I will work on this soon.
>
> Great.
Done. Please test. I made sure it compiles, but I couldn't actually
test the results, as I don't have access to a GNU/Linux system with
GUI display. So it could be that I misunderstood the Harfbuzz APIs,
as I was essentially flying blind, guided only by the Harfbuzz docs.
In particular, I hope I understood correctly how we should leave it
to Harfbuzz to guess the properties not explicitly provided by the Emacs
context, both for the direction of the text and its script.
> > > * Its script, possibly after applying something like:
> > > http://unicode.org/reports/tr24/#Common
> >
> > Per previous discussions, we decided to use the Harfbuzz built-in
> > methods for determining the script, since Emacs doesn't have this
> > information, and adding it will just do the same as Harfbuzz does,
> > i.e. find the first character whose script is not Common etc., using
> > the UCD database. I think it was you who suggested to use the
> > Harfbuzz built-ins in this case.
>
> The built-in HarfBuzz code is for getting the script for a given
> character, but resolving characters with Common script is left to the
> client. Suppose you have this string (upper case for RTL) ABC 123 DEF,
> what HarfBuzz sees during shaping is three separate chunks of text ABC,
> 123, DEF. The 123 part is all Common script characters and thus
> hb_buffer_guess_segment_properties won’t be able to guess anything (and
> based on the font and the script, this can cause rendering differences).
> Emacs will have to resolve the script of Common characters before
> applying bidi algorithm and pass that down to HarfBuzz.
See my followup questions about this. For now, I left this aspect to
HarfBuzz.
> > > * Its language, if Emacs allows setting text language (my understanding is
> > > that it doesn’t). Some languages really need this for applying
> > > language-specific features (Urdu digits, Serbian alternate glyphs, etc.).
> >
> > We don't currently have a language property for chunks of text, we
> > only have the current global language setting determined from the
> > locale (and there's a command to change that for Emacs, should the
> > user want it). This is not really appropriate for multilingual
> > buffers, but we will have to use that for now, and hope that in the
> > future, infrastructure will be added to allow more flexible
> > determination of the language of each run of text. (I see that
> > Harfbuzz already looks at the locale for its default language, but
> > since Emacs allows user control of this, however unlikely, I think
> > it's best to use the value Emacs uses.) I will work on this as well.
>
> Yes, better pass that from Emacs to HarfBuzz.
Done, but please see the FIXME I left behind. For testing purposes,
you can change the current language like this:
M-x set-locale-environment RET xx_YY.CODESET RET
For example:
M-x set-locale-environment RET sr_RS.UTF-8 RET
for the Cyrillic Serbian locale. This should change the value of
current-iso639-language to the symbol 'sr'.
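As a rough illustration of how a locale name of the form xx_YY.CODESET maps to a language code such as "sr", here is a small self-contained sketch. It is not the code Emacs uses to compute current-iso639-language; the function name locale_language is hypothetical, and it only cuts the string at the first '_', '.' or '@'.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the language part of a POSIX-style locale name
   ("xx_YY.CODESET") into OUT, which has room for OUTSIZE bytes.
   Return the length of the language part, or 0 if it is empty or
   does not fit (OUT is then set to the empty string when possible).  */
size_t
locale_language (const char *locale, char *out, size_t outsize)
{
  size_t len;

  assert (locale != NULL && out != NULL);

  /* The language ends at the territory, codeset, or modifier.  */
  len = strcspn (locale, "_.@");
  if (len == 0 || len >= outsize)
    {
      if (outsize > 0)
        out[0] = '\0';
      return 0;
    }
  memcpy (out, locale, len);
  out[len] = '\0';
  return len;
}
```

For "sr_RS.UTF-8" this yields "sr", which is the kind of string that could then be turned into a shaper language tag.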
Please tell if you encounter any difficulties with the code I added,
or if you need any further help.
Thanks.
^ permalink raw reply [flat|nested] 55+ messages in thread
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-29 14:49 ` Eli Zaretskii
@ 2019-01-05 20:53 ` Khaled Hosny
2019-01-05 21:04 ` Khaled Hosny
` (2 more replies)
0 siblings, 3 replies; 55+ messages in thread
From: Khaled Hosny @ 2019-01-05 20:53 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: behdad, far.nasiri.m, 33729
On Sat, Dec 29, 2018 at 04:49:23PM +0200, Eli Zaretskii wrote:
> > Date: Mon, 24 Dec 2018 19:37:23 +0200
> > From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> > Cc: rgm@gnu.org, far.nasiri.m@gmail.com, behdad@behdad.org,
> > 33729@debbugs.gnu.org, kaushal.modi@gmail.com
> >
> > > > We need to know, for a given lgstring we are shaping:
> > > > * Its direction (from applying bidi algorithm). Each lgstring we are
> > > > shaping must be of a single direction.
> > >
> > > Communicating this to ftfont_shape_by_hb will need changes in a couple
> > > of interfaces (the existing shaping engines didn't need this
> > > information). I will work on this soon.
> >
> > Great.
>
> Done. Please test. I made sure it compiles, but I couldn't actually
> test the results, as I don't have access to a GNU/Linux system with
> GUI display. So it could be that I misunderstood the Harfbuzz APIs,
> as I was essentially flying blind, guided only by the Harfbuzz docs.
It seems to work, but still not quite right. You seem to be passing the
paragraph direction, but what HarfBuzz needs is resolved direction of
the text (i.e. the bidi embedding level of the run). In other words, if
Emacs is going to draw this text from right to left, then HarfBuzz must
shape it in right to left direction. Both should use the same direction
all the time and HarfBuzz direction guessing should never be used (i.e.
always pass to it an explicit direction).
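A minimal sketch of what "resolved direction" means in terms of the bidi embedding level: in the Unicode Bidirectional Algorithm, even levels are left-to-right and odd levels are right-to-left. The enum and function names below are hypothetical stand-ins for this illustration, not the HarfBuzz or Emacs types.

```c
#include <assert.h>

/* Hypothetical stand-in for a shaper direction; not hb_direction_t.  */
typedef enum { RUN_DIR_LTR, RUN_DIR_RTL } run_direction;

/* Map a resolved bidi embedding level to the direction in which the
   run is drawn: even levels are LTR, odd levels are RTL.  */
run_direction
direction_from_level (int resolved_level)
{
  assert (resolved_level >= 0);
  return (resolved_level & 1) ? RUN_DIR_RTL : RUN_DIR_LTR;
}
```

So a run at level 1 (for example Arabic text in an LTR paragraph) would be shaped RTL, and a run at level 2 (for example a number or Latin word embedded inside that Arabic text) would be shaped LTR.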
Regards,
Khaled
^ permalink raw reply [flat|nested] 55+ messages in thread
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2019-01-05 20:53 ` Khaled Hosny
@ 2019-01-05 21:04 ` Khaled Hosny
2019-01-06 17:54 ` Eli Zaretskii
2019-01-06 15:50 ` Eli Zaretskii
2019-01-27 17:09 ` Eli Zaretskii
2 siblings, 1 reply; 55+ messages in thread
From: Khaled Hosny @ 2019-01-05 21:04 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: behdad, far.nasiri.m, 33729
On Sat, Jan 05, 2019 at 10:53:14PM +0200, Khaled Hosny wrote:
> On Sat, Dec 29, 2018 at 04:49:23PM +0200, Eli Zaretskii wrote:
> > > Date: Mon, 24 Dec 2018 19:37:23 +0200
> > > From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> > > Cc: rgm@gnu.org, far.nasiri.m@gmail.com, behdad@behdad.org,
> > > 33729@debbugs.gnu.org, kaushal.modi@gmail.com
> > >
> > > > > We need to know, for a given lgstring we are shaping:
> > > > > * Its direction (from applying bidi algorithm). Each lgstring we are
> > > > > shaping must be of a single direction.
> > > >
> > > > Communicating this to ftfont_shape_by_hb will need changes in a couple
> > > > of interfaces (the existing shaping engines didn't need this
> > > > information). I will work on this soon.
> > >
> > > Great.
> >
> > Done. Please test. I made sure it compiles, but I couldn't actually
> > test the results, as I don't have access to a GNU/Linux system with
> > GUI display. So it could be that I misunderstood the Harfbuzz APIs,
> > as I was essentially flying blind, guided only by the Harfbuzz docs.
>
> It seems to work, but still not quite right. You seem to be passing the
> paragraph direction, but what HarfBuzz needs is resolved direction of
> the text (i.e. the bidi embedding level of the run). In other words, if
> Emacs is going to draw this text from right to left, then HarfBuzz must
> shape it in right to left direction. Both should use the same direction
> all the time and HarfBuzz direction guessing should never be used (i.e.
> always pass to it an explicit direction).
I pushed a couple of commits that do this based on my limited
understanding of the Emacs code, please check.
Regards,
Khaled
^ permalink raw reply [flat|nested] 55+ messages in thread
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2018-12-24 18:07 ` Eli Zaretskii
@ 2019-01-05 21:15 ` Khaled Hosny
2019-01-06 16:03 ` Eli Zaretskii
0 siblings, 1 reply; 55+ messages in thread
From: Khaled Hosny @ 2019-01-05 21:15 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: behdad, 33729, far.nasiri.m, kaushal.modi
On Mon, Dec 24, 2018 at 08:07:04PM +0200, Eli Zaretskii wrote:
> > Date: Mon, 24 Dec 2018 19:37:23 +0200
> > From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> > Cc: rgm@gnu.org, far.nasiri.m@gmail.com, behdad@behdad.org,
> > 33729@debbugs.gnu.org, kaushal.modi@gmail.com
> >
> > > Per previous discussions, we decided to use the Harfbuzz built-in
> > > methods for determining the script, since Emacs doesn't have this
> > > information, and adding it will just do the same as Harfbuzz does,
> > > i.e. find the first character whose script is not Common etc., using
> > > the UCD database. I think it was you who suggested to use the
> > > Harfbuzz built-ins in this case.
> >
> > The built-in HarfBuzz code is for getting the script for a given
> > character, but resolving characters with Common script is left to the
> > client. Suppose you have this string (upper case for RTL) ABC 123 DEF,
> > what HarfBuzz sees during shaping is three separate chunks of text ABC,
> > 123, DEF. The 123 part is all Common script characters and thus
> > hb_buffer_guess_segment_properties won’t be able to guess anything (and
> > based on the font and the script, this can cause rendering differences).
> > Emacs will have to resolve the script of Common characters before
> > applying bidi algorithm and pass that down to HarfBuzz.
>
> I'm not sure I understand: why does HarfBuzz care that 123 was in the
> middle of RTL text?
It doesn’t. What it cares about here is the correct script. Because 123
are in the middle of RTL text they will be shaped separately, and thus
hb_buffer_guess_segment_properties() will only see 123 and won’t be
able to guess the correct script for them (Arabic, Hebrew, etc.,
whatever the script for the surrounding RTL text is).
The point I’m trying to make is that script detection, even in its
simplest form, needs to be done on the text as a whole not just the
portion being shaped, which makes hb_buffer_guess_segment_properties()
ill equipped for doing this as it only sees a small portion of the text
at a time.
> Does it need to shape 123 specially in this case?
Depending on the font, the digits might be shaped differently if the
script is, say Arabic, by e.g. applying script-specific substitutions to
forms more suitable for a given script.
> (In general, AFAIK simple characters like 123 will not even go through
> HarfBuzz, as Emacs doesn't call the shaper for characters whose entry
> in composition-function-table is nil. So I guess 123 here should
> stand for some other characters, not for literal digits? IOW, I don't
> think I understand the example very well.)
This is a bug then and needs to be fixed. All text should go through
HarfBuzz since even so-called “simple” characters often require shaping
depending on the text and the font. If this is done for optimization,
then it should be revised to see if shaping with HarfBuzz is actually
significantly slower and if it is, find more proper ways to optimize it.
Regards,
Khaled
^ permalink raw reply [flat|nested] 55+ messages in thread
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2019-01-05 20:53 ` Khaled Hosny
2019-01-05 21:04 ` Khaled Hosny
@ 2019-01-06 15:50 ` Eli Zaretskii
2019-01-29 22:29 ` Khaled Hosny
2019-01-27 17:09 ` Eli Zaretskii
2 siblings, 1 reply; 55+ messages in thread
From: Eli Zaretskii @ 2019-01-06 15:50 UTC (permalink / raw)
To: Khaled Hosny; +Cc: behdad, far.nasiri.m, 33729
> Date: Sat, 5 Jan 2019 22:53:14 +0200
> From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> Cc: far.nasiri.m@gmail.com, behdad@behdad.org, 33729@debbugs.gnu.org
>
> > Done. Please test. I made sure it compiles, but I couldn't actually
> > test the results, as I don't have access to a GNU/Linux system with
> > GUI display. So it could be that I misunderstood the Harfbuzz APIs,
> > as I was essentially flying blind, guided only by the Harfbuzz docs.
>
> It seems to work, but still not quite right. You seem to be passing the
> paragraph direction, but what HarfBuzz needs is resolved direction of
> the text (i.e. the bidi embedding level of the run).
It isn't the paragraph direction; at least it wasn't supposed to be
that. The code is (or was before your changes):
  if (charpos < endpos)
    {
      if (pdir == L2R)
        direction = QL2R;
      else if (pdir == R2L)
        direction = QR2L;
      [...]
      cmp_it->reversed_p = 0;
    }
  else
    {
      [...]
      cmp_it->reversed_p = 1;
      [...]
      if (pdir == L2R)
        direction = QR2L;
      else if (pdir == R2L)
        direction = QL2R;
      [...]
    }
So, as you see, when the paragraph direction is L2R, normal text gets
L2R direction, while text reversed for display gets R2L, and the other
way around when the paragraph direction is R2L. Which AFAIU is what
HarfBuzz needs, but maybe I'm missing something.
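The rule described above can be condensed into a small standalone function, shown here only as a sketch. The type and function names are hypothetical, and it models just the two explicit paragraph directions; the elided parts of the real code, and the case where the paragraph direction is neither L2R nor R2L, are not represented.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for this sketch; not the Emacs definitions.  */
typedef enum { PARA_L2R, PARA_R2L } para_dir;
typedef enum { RUN_L2R, RUN_R2L } run_dir;

/* Direction of a run given the paragraph direction and whether the
   run is reversed for display relative to the paragraph: non-reversed
   text follows the paragraph, reversed text gets the opposite.  */
run_dir
run_direction (para_dir pdir, bool reversed)
{
  assert (pdir == PARA_L2R || pdir == PARA_R2L);
  if (pdir == PARA_L2R)
    return reversed ? RUN_R2L : RUN_L2R;
  return reversed ? RUN_L2R : RUN_R2L;
}
```

The four combinations give L2R, R2L, R2L and L2R respectively for (L2R, normal), (L2R, reversed), (R2L, normal) and (R2L, reversed).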
Did you actually see incorrect display with the code I wrote? If so,
could you please show the recipes for reproducing that, preferably
with screenshots of correct and incorrect display? I'd like to look
into that, to understand what I missed.
> HarfBuzz direction guessing should never be used (i.e. always pass
> to it an explicit direction).
This is in general impossible (or at least very hard), since the
shaper is sometimes called from Lisp without any display context. See
the Lisp callers of the function font-shape-gstring. One use case is
when we want to display the composition information for a grapheme
cluster to the user, see descr-text.el (used by the "C-u C-x ="
command). In these cases, the UBA is not invoked, and so we don't
have the direction information.
I could provide the direction information in this case by using the
directionality of the base character of the grapheme cluster, but I
figured out that HarfBuzz already does this as part of its guessing.
Doesn't it?
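For reference, the HarfBuzz documentation describes the guessing roughly as follows: the script is taken from the first character whose script is not Common, Inherited or Unknown, and the direction is then the natural horizontal direction of that script, falling back to LTR. The sketch below models that behavior in plain C. It is not HarfBuzz code: toy_script_of covers only a handful of code points and all names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { SC_COMMON, SC_LATIN, SC_HEBREW, SC_ARABIC } toy_script;
typedef enum { GUESS_LTR, GUESS_RTL } guess_dir;

/* Toy classifier (hypothetical): a few letter ranges only; everything
   else, including digits and spaces, is treated as Common.  */
toy_script
toy_script_of (uint32_t c)
{
  if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z'))
    return SC_LATIN;
  if (c >= 0x05D0 && c <= 0x05EA)   /* Hebrew letters */
    return SC_HEBREW;
  if (c >= 0x0621 && c <= 0x063A)   /* a subset of Arabic letters */
    return SC_ARABIC;
  return SC_COMMON;
}

/* Direction of the first character with a non-Common script; LTR if
   the text contains nothing but Common characters.  */
guess_dir
guess_direction (const uint32_t *text, size_t len)
{
  for (size_t i = 0; i < len; i++)
    {
      toy_script s = toy_script_of (text[i]);
      if (s != SC_COMMON)
        return (s == SC_HEBREW || s == SC_ARABIC) ? GUESS_RTL : GUESS_LTR;
    }
  return GUESS_LTR;
}

/* Usage example.  */
void
guess_direction_examples (void)
{
  const uint32_t hebrew[] = { 0x05D0, 0x05D1 };
  const uint32_t digits[] = { '1', '2', '3' };
  const uint32_t mixed[] = { ' ', 0x0627, 'a' };

  assert (guess_direction (hebrew, 2) == GUESS_RTL);
  assert (guess_direction (digits, 3) == GUESS_LTR);  /* fallback */
  assert (guess_direction (mixed, 3) == GUESS_RTL);   /* first strong wins */
}
```

So for a single grapheme cluster the guess is based on the base character, provided that character has a script of its own; a cluster made only of Common characters gets the LTR fallback.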
> I pushed a couple of commits that do this based on my limited
> understanding of Emacs code, please check.
Thanks. Do you see any difference in the results? If so, can you
please show the text you used and the results of shaping it with both
versions. AFAIU, your code should produce exactly the same results,
unless I'm missing something. (I didn't want to use the
resolved_level attribute because it is ephemeral, and might not
provide the correct value where we are using it.)
Btw, did you test both paragraph directions (controlled by the
bidi-paragraph-direction variable), and also text inside directional
override which changes its natural direction?
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2019-01-05 21:15 ` Khaled Hosny
@ 2019-01-06 16:03 ` Eli Zaretskii
2019-01-27 17:12 ` Eli Zaretskii
0 siblings, 1 reply; 55+ messages in thread
From: Eli Zaretskii @ 2019-01-06 16:03 UTC (permalink / raw)
To: Khaled Hosny, Kenichi Handa; +Cc: behdad, far.nasiri.m, 33729
> Date: Sat, 5 Jan 2019 23:15:14 +0200
> From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> Cc: rgm@gnu.org, far.nasiri.m@gmail.com, behdad@behdad.org,
> 33729@debbugs.gnu.org, kaushal.modi@gmail.com
>
> > > The built-in HarfBuzz code is for getting the script for a given
> > > character, but resolving characters with Common script is left to the
> > > client. Suppose you have this string (upper case for RTL) ABC 123 DEF,
> > > what HarfBuzz sees during shaping is three separate chunks of text ABC,
> > > 123, DEF. The 123 part is all Common script characters and thus
> > > hb_buffer_guess_segment_properties won’t be able to guess anything (and
> > > based on the font and the script, this can cause rendering differences).
> > > Emacs will have to resolve the script of Common characters before
> > > applying bidi algorithm and pass that down to HarfBuzz.
> >
> > I'm not sure I understand: why does HarfBuzz care that 123 was in the
> > middle of RTL text?
>
> It doesn’t. What it cares about here is the correct script. Because 123
> are in the middle of RTL text they will be shaped separately, and thus
> hb_buffer_guess_segment_properties() will only see 123 and won’t be
> able to guess the correct script for them (Arabic, Hebrew, etc.,
> whatever the script for the surrounding RTL text is).
That's what I was asking: why it's important for HarfBuzz to know that
123 should be shaped for the Arabic script?
> Depending on the font, the digits might be shaped differently if the
> script is, say Arabic, by e.g. applying script-specific substitutions to
> forms more suitable for a given script.
I guess this is what I'm missing, then: these script-specific
substitutions. Can you elaborate on that, or point to some place
where these substitutions are described in detail?
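To make the "ABC 123 DEF" example concrete, here is one possible way a client could resolve Common characters before shaping, so that the digits are shaped with the script of the surrounding text. This is only a sketch under simplifying assumptions: script_tag and resolve_common are hypothetical names, and the rule used (a Common character takes the script of the nearest preceding non-Common character, and leading Common characters take the script that follows) is a simplification, not what Emacs or HarfBuzz implement.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { SC_COMMON, SC_LATIN, SC_ARABIC, SC_HEBREW } script_tag;

/* Replace SC_COMMON entries in place with a neighbouring script.  */
void
resolve_common (script_tag *scripts, size_t n)
{
  script_tag last = SC_COMMON;
  size_t i;

  /* Forward pass: Common inherits the preceding script, if any.  */
  for (i = 0; i < n; i++)
    {
      if (scripts[i] == SC_COMMON)
        scripts[i] = last;
      else
        last = scripts[i];
    }

  /* Backward pass: leading Common characters, which had nothing before
     them, take the script of the text that follows.  */
  last = SC_COMMON;
  for (i = n; i-- > 0; )
    {
      if (scripts[i] == SC_COMMON)
        scripts[i] = last;
      else
        last = scripts[i];
    }
}

/* Usage example: "ABC 123 DEF" with ABC and DEF standing for Arabic.  */
void
resolve_common_examples (void)
{
  script_tag run[] = {
    SC_ARABIC, SC_ARABIC, SC_ARABIC, SC_COMMON,   /* A B C space */
    SC_COMMON, SC_COMMON, SC_COMMON, SC_COMMON,   /* 1 2 3 space */
    SC_ARABIC, SC_ARABIC, SC_ARABIC               /* D E F */
  };
  size_t n = sizeof run / sizeof run[0];

  resolve_common (run, n);
  for (size_t i = 0; i < n; i++)
    assert (run[i] == SC_ARABIC);
}
```

With the scripts resolved this way, the "123" chunk can be passed to the shaper with an explicit Arabic script even though it is shaped separately from its neighbours.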
> > (In general, AFAIK simple characters like 123 will not even go through
> > HarfBuzz, as Emacs doesn't call the shaper for characters whose entry
> > in composition-function-table is nil. So I guess 123 here should
> > stand for some other characters, not for literal digits? IOW, I don't
> > think I understand the example very well.)
>
> This is a bug then and needs to be fixed. All text should go through
> HarfBuzz since even so-called “simple” characters often require shaping
> depending on the text and the font. If this is done for optimization,
> then it should be revised to see if shaping with HarfBuzz is actually
> significantly slower and if it is, find more proper ways to optimize it.
(Adding Handa-san to the discussion, in the hope that he could comment
on the issue.)
I think running all text through a shaper might be prohibitively
expensive, because the shaper is called through Lisp code (see
composite.el), and we decide which chunk of text to pass to the shaper
using regexp search. See the various files under lisp/language/ which
set up portions of composition-function-table as appropriate for each
language that needs it.
So I think we should identify all the cases where "simple" characters
surrounded by, or adjacent to, "non-simple" ones need to be passed to
a shaper, and add the necessary regular expressions to the data
structures in lisp/language/. Can you describe these cases, or point
me to a place where I can find the relevant info?
Thanks.
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2019-01-05 21:04 ` Khaled Hosny
@ 2019-01-06 17:54 ` Eli Zaretskii
2019-01-27 17:12 ` Eli Zaretskii
2019-01-29 22:33 ` Khaled Hosny
0 siblings, 2 replies; 55+ messages in thread
From: Eli Zaretskii @ 2019-01-06 17:54 UTC (permalink / raw)
To: Khaled Hosny; +Cc: behdad, far.nasiri.m, 33729
> Date: Sat, 5 Jan 2019 23:04:20 +0200
> From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> Cc: far.nasiri.m@gmail.com, behdad@behdad.org, 33729@debbugs.gnu.org
>
> I pushed a couple of commits that do this based on my limited
> understanding of Emacs code, please check.
Can you explain why you moved the call to
hb_buffer_guess_segment_properties _after_ the code which sets some of
the properties? I cannot find anything about that in the HarfBuzz
documentation. Is this because guessing the unset properties can
benefit from knowing the properties which _are_ set, such as the
direction?
I did it the other way around, because my mental model was: first set
the defaults, then override them where better info is available.
Thanks.
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2019-01-05 20:53 ` Khaled Hosny
2019-01-05 21:04 ` Khaled Hosny
2019-01-06 15:50 ` Eli Zaretskii
@ 2019-01-27 17:09 ` Eli Zaretskii
2 siblings, 0 replies; 55+ messages in thread
From: Eli Zaretskii @ 2019-01-27 17:09 UTC (permalink / raw)
To: Khaled Hosny; +Cc: behdad, far.nasiri.m, 33729
> Date: Sat, 5 Jan 2019 22:53:14 +0200
> From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> Cc: far.nasiri.m@gmail.com, behdad@behdad.org, 33729@debbugs.gnu.org
>
> > Done. Please test. I made sure it compiles, but I couldn't actually
> > test the results, as I don't have access to a GNU/Linux system with
> > GUI display. So it could be that I misunderstood the Harfbuzz APIs,
> > as I was essentially flying blind, guided only by the Harfbuzz docs.
>
> It seems to work, but still not quite right. You seem to be passing the
> paragraph direction, but what HarfBuzz needs is resolved direction of
> the text (i.e. the bidi embedding level of the run). In other words, if
> Emacs is going to draw this text from right to left, then HarfBuzz must
> shape it in right to left direction. Both should use the same direction
> all the time and HarfBuzz direction guessing should never be used (i.e.
> always pass to it an explicit direction).
In response to that, I wrote:
It isn't the paragraph direction; at least it wasn't supposed to be
that. The code is (or was before your changes):
  if (charpos < endpos)
    {
      if (pdir == L2R)
        direction = QL2R;
      else if (pdir == R2L)
        direction = QR2L;
      [...]
      cmp_it->reversed_p = 0;
    }
  else
    {
      [...]
      cmp_it->reversed_p = 1;
      [...]
      if (pdir == L2R)
        direction = QR2L;
      else if (pdir == R2L)
        direction = QL2R;
      [...]
    }
So, as you see, when the paragraph direction is L2R, normal text gets
L2R direction, while text reversed for display gets R2L, and the other
way around when the paragraph direction is R2L. Which AFAIU is what
HarfBuzz needs, but maybe I'm missing something.
Did you actually see incorrect display with the code I wrote? If so,
could you please show the recipes for reproducing that, preferably
with screenshots of correct and incorrect display? I'd like to look
into that, to understand what I missed.
> HarfBuzz direction guessing should never be used (i.e. always pass
> to it an explicit direction).
This is in general impossible (or at least very hard), since the
shaper is sometimes called from Lisp without any display context. See
the Lisp callers of the function font-shape-gstring. One use case is
when we want to display the composition information for a grapheme
cluster to the user, see descr-text.el (used by the "C-u C-x ="
command). In these cases, the UBA is not invoked, and so we don't
have the direction information.
I could provide the direction information in this case by using the
directionality of the base character of the grapheme cluster, but I
figured out that HarfBuzz already does this as part of its guessing.
Doesn't it?
> I pushed a couple of commits that do this based on my limited
> understanding of Emacs code, please check.
Thanks. Do you see any difference in the results? If so, can you
please show the text you used and the results of shaping it with both
versions. AFAIU, your code should produce exactly the same results,
unless I'm missing something. (I didn't want to use the
resolved_level attribute because it is ephemeral, and might not
provide the correct value where we are using it.)
Btw, did you test both paragraph directions (controlled by the
bidi-paragraph-direction variable), and also text inside directional
override which changes its natural direction?
Could you please respond and answer the few questions I asked? I'd
like us to continue working on the branch.
TIA
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2019-01-06 16:03 ` Eli Zaretskii
@ 2019-01-27 17:12 ` Eli Zaretskii
2019-01-29 22:25 ` Khaled Hosny
0 siblings, 1 reply; 55+ messages in thread
From: Eli Zaretskii @ 2019-01-27 17:12 UTC (permalink / raw)
To: dr.khaled.hosny; +Cc: behdad, 33729, far.nasiri.m
Could you please respond to the below as well?
> Date: Sun, 06 Jan 2019 18:03:55 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: behdad@behdad.org, far.nasiri.m@gmail.com, 33729@debbugs.gnu.org
>
> > Date: Sat, 5 Jan 2019 23:15:14 +0200
> > From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> > Cc: rgm@gnu.org, far.nasiri.m@gmail.com, behdad@behdad.org,
> > 33729@debbugs.gnu.org, kaushal.modi@gmail.com
> >
> > > > The built-in HarfBuzz code is for getting the script for a given
> > > > character, but resolving characters with Common script is left to the
> > > > client. Suppose you have this string (upper case for RTL) ABC 123 DEF,
> > > > what HarfBuzz sees during shaping is three separate chunks of text ABC,
> > > > 123, DEF. The 123 part is all Common script characters and thus
> > > > hb_buffer_guess_segment_properties won’t be able to guess anything (and
> > > > based on the font and the script, this can cause rendering differences).
> > > > Emacs will have to resolve the script of Common characters before
> > > > applying bidi algorithm and pass that down to HarfBuzz.
> > >
> > > I'm not sure I understand: why does HarfBuzz care that 123 was in the
> > > middle of RTL text?
> >
> > It doesn’t. What it cares about here is the correct script. Because 123
> > are in the middle of RTL text they will be shaped separately, and thus
> > hb_buffer_guess_segment_properties() will only see 123 and won’t be
> > able to guess the correct script for them (Arabic, Hebrew, etc.,
> > whatever the script for the surrounding RTL text is).
>
> That's what I was asking: why it's important for HarfBuzz to know that
> 123 should be shaped for the Arabic script?
>
> > Depending on the font, the digits might be shaped differently if the
> > script is, say Arabic, by e.g. applying script-specific substitutions to
> > forms more suitable for a given script.
>
> I guess this is what I'm missing, then: these script-specific
> substitutions. Can you elaborate on that, or point to some place
> where these substitutions are described in detail?
>
> > > (In general, AFAIK simple characters like 123 will not even go through
> > > HarfBuzz, as Emacs doesn't call the shaper for characters whose entry
> > > in composition-function-table is nil. So I guess 123 here should
> > > stand for some other characters, not for literal digits? IOW, I don't
> > > think I understand the example very well.)
> >
> > This is a bug then and needs to be fixed. All text should go through
> > HarfBuzz since even so-called “simple” characters often require shaping
> > depending on the text and the font. If this is done for optimization,
> > then it should be revised to see if shaping with HarfBuzz is actually
> > significantly slower and if it is, find more proper ways to optimize it.
>
> (Adding Handa-san to the discussion, in the hope that he could comment
> on the issue.)
>
> I think running all text through a shaper might be prohibitively
> expensive, because the shaper is called through Lisp code (see
> composite.el), and we decide which chunk of text to pass to the shaper
> using regexp search. See the various files under lisp/language/ which
> set up portions of composition-function-table as appropriate for each
> language that needs it.
>
> So I think we should identify all the cases where "simple" characters
> surrounded by, or adjacent to, "non-simple" ones need to be passed to
> a shaper, and add the necessary regular expressions to the data
> structures in lisp/language/. Can you describe these cases, or point
> me to a place where I can find the relevant info?
>
> Thanks.
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2019-01-06 17:54 ` Eli Zaretskii
@ 2019-01-27 17:12 ` Eli Zaretskii
2019-01-29 22:33 ` Khaled Hosny
1 sibling, 0 replies; 55+ messages in thread
From: Eli Zaretskii @ 2019-01-27 17:12 UTC (permalink / raw)
To: dr.khaled.hosny; +Cc: behdad, far.nasiri.m, 33729
> Date: Sun, 06 Jan 2019 19:54:24 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: behdad@behdad.org, far.nasiri.m@gmail.com, 33729@debbugs.gnu.org
>
> > Date: Sat, 5 Jan 2019 23:04:20 +0200
> > From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> > Cc: far.nasiri.m@gmail.com, behdad@behdad.org, 33729@debbugs.gnu.org
> >
> > I pushed a couple of commits that do this based on my limited
> > understanding of Emacs code, please check.
>
> Can you explain why you moved the call to
> hb_buffer_guess_segment_properties _after_ the code which sets some of
> the properties? I cannot find anything about that in the HarfBuzz
> documentation. Is this because guessing the unset properties can
> benefit from knowing the properties which _are_ set, such as the
> direction?
>
> I did it the other way around, because my mental model was: first set
> the defaults, then override them where better info is available.
>
> Thanks.
Please respond.
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2019-01-27 17:12 ` Eli Zaretskii
@ 2019-01-29 22:25 ` Khaled Hosny
0 siblings, 0 replies; 55+ messages in thread
From: Khaled Hosny @ 2019-01-29 22:25 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: behdad, 33729, far.nasiri.m
On Sun, Jan 27, 2019 at 07:12:04PM +0200, Eli Zaretskii wrote:
> Could you please respond to the below as well?
I have no time for answering these questions any more, sorry. Please feel
free to do what you find sensible.
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2019-01-06 15:50 ` Eli Zaretskii
@ 2019-01-29 22:29 ` Khaled Hosny
2022-04-29 12:47 ` Lars Ingebrigtsen
0 siblings, 1 reply; 55+ messages in thread
From: Khaled Hosny @ 2019-01-29 22:29 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: behdad, far.nasiri.m, 33729
On Sun, Jan 06, 2019 at 05:50:54PM +0200, Eli Zaretskii wrote:
> > Date: Sat, 5 Jan 2019 22:53:14 +0200
> > From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> > I pushed a couple of commits that do this based on my limited
> > understanding of Emacs code, please check.
>
> Thanks. Do you see any difference in the results?
Strings with forced direction (e.g. Arabic with LRO) showed difference.
Without my change they were shaped RTL but drawn LTR; with my change
shaping and drawing used LTR direction. Please feel free to revert that
change if you think it is incorrect.
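To illustrate why a forced direction matters: under the Unicode Bidirectional Algorithm, text at an even resolved embedding level is displayed left to right and text at an odd level right to left, so deriving the shaping direction from the level keeps shaping and drawing in agreement. The sketch below shows only that parity rule; shape_dir and direction_from_level are hypothetical names, not Emacs or HarfBuzz identifiers.

```c
#include <assert.h>

typedef enum { SHAPE_LTR, SHAPE_RTL } shape_dir;

/* Even resolved levels are left-to-right, odd ones right-to-left.  */
shape_dir
direction_from_level (int resolved_level)
{
  assert (resolved_level >= 0);
  return (resolved_level & 1) ? SHAPE_RTL : SHAPE_LTR;
}

/* Usage example, for an L2R paragraph (base level 0).  */
void
direction_from_level_examples (void)
{
  /* Plain Arabic text resolves to level 1: shape and draw RTL.  */
  assert (direction_from_level (1) == SHAPE_RTL);
  /* Arabic inside LRO...PDF is pushed to the next even level, 2, and
     its characters are treated as strong L: shape and draw LTR.  */
  assert (direction_from_level (2) == SHAPE_LTR);
  /* Plain Latin text stays at the base level 0.  */
  assert (direction_from_level (0) == SHAPE_LTR);
}
```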
Regards,
Khaled
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2019-01-06 17:54 ` Eli Zaretskii
2019-01-27 17:12 ` Eli Zaretskii
@ 2019-01-29 22:33 ` Khaled Hosny
1 sibling, 0 replies; 55+ messages in thread
From: Khaled Hosny @ 2019-01-29 22:33 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: behdad, far.nasiri.m, 33729
On Sun, Jan 06, 2019 at 07:54:24PM +0200, Eli Zaretskii wrote:
> > Date: Sat, 5 Jan 2019 23:04:20 +0200
> > From: Khaled Hosny <dr.khaled.hosny@gmail.com>
> > Cc: far.nasiri.m@gmail.com, behdad@behdad.org, 33729@debbugs.gnu.org
> >
> > I pushed a couple of commits that do this based on my limited
> > understanding of Emacs code, please check.
>
> Can you explain why you moved the call to
> hb_buffer_guess_segment_properties _after_ the code which sets some of
> the properties? I cannot find anything about that in the HarfBuzz
> documentation. Is this because guessing the unset properties can
> benefit from knowing the properties which _are_ set, such as the
> direction?
hb_buffer_guess_segment_properties() won’t guess set properties, so moving
it last was to avoid wasting time guessing properties that we will
override later anyway.
> I did it the other way around, because my mental model was: first set
> the defaults, then override them where better info is available.
hb_buffer_guess_segment_properties() is not for setting defaults (there
is no such thing as default buffer properties in HarfBuzz working
model), it is a kind of quick and dirty hack and production code should
not use it.
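The ordering argument can be shown with a toy model in which guessing fills in only what the caller has left unset. This is a sketch, not HarfBuzz code: seg_props, guess_unset and the guesses_made counter are invented for illustration, and only the direction property is modeled.

```c
#include <assert.h>

typedef enum { PROP_DIR_INVALID, PROP_DIR_LTR, PROP_DIR_RTL } prop_dir;

/* Hypothetical model of a shaping buffer's segment properties.  */
struct seg_props
{
  prop_dir direction;
  int guesses_made;   /* counts how often guessing actually did work */
};

/* Fill in the direction only if the caller has not set it.  */
void
guess_unset (struct seg_props *p, prop_dir guessed)
{
  if (p->direction == PROP_DIR_INVALID)
    {
      p->direction = guessed;
      p->guesses_made++;
    }
}

/* Usage example: both call orders end with the same direction.  */
void
guess_unset_examples (void)
{
  /* Explicit value first, guess last: the guess is skipped.  */
  struct seg_props a = { PROP_DIR_INVALID, 0 };
  a.direction = PROP_DIR_LTR;
  guess_unset (&a, PROP_DIR_RTL);
  assert (a.direction == PROP_DIR_LTR && a.guesses_made == 0);

  /* Guess first, override later: same result, but the guess was
     computed only to be thrown away.  */
  struct seg_props b = { PROP_DIR_INVALID, 0 };
  guess_unset (&b, PROP_DIR_RTL);
  b.direction = PROP_DIR_LTR;
  assert (b.direction == PROP_DIR_LTR && b.guesses_made == 1);
}
```

In this model the two orders differ only in wasted work, which matches the reason given above for calling the guessing function last.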
Regards,
Khaled
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2019-01-29 22:29 ` Khaled Hosny
@ 2022-04-29 12:47 ` Lars Ingebrigtsen
2022-04-29 13:24 ` Eli Zaretskii
0 siblings, 1 reply; 55+ messages in thread
From: Lars Ingebrigtsen @ 2022-04-29 12:47 UTC (permalink / raw)
To: Khaled Hosny; +Cc: behdad, far.nasiri.m, 33729
Khaled Hosny <dr.khaled.hosny@gmail.com> writes:
>> Thanks. Do you see any difference in the results?
>
> Strings with forced direction (e.g. Arabic with LRO) showed difference.
> Without my change they were shaped RTL but drawn LTR; with my change
> shaping and drawing used LTR direction. Please feel free to revert that
> change if you think it is incorrect.
(I'm going through old bug reports that unfortunately weren't resolved
at the time.)
Skimming this long bug report, it seems like the fixes Khaled pushed
fixed the reported issue (but I may well be misreading).
Eli, is there anything more to do here?
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
* bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
2022-04-29 12:47 ` Lars Ingebrigtsen
@ 2022-04-29 13:24 ` Eli Zaretskii
0 siblings, 0 replies; 55+ messages in thread
From: Eli Zaretskii @ 2022-04-29 13:24 UTC (permalink / raw)
To: Lars Ingebrigtsen; +Cc: dr.khaled.hosny, behdad, far.nasiri.m, 33729-done
> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: Eli Zaretskii <eliz@gnu.org>, behdad@behdad.org,
> far.nasiri.m@gmail.com, 33729@debbugs.gnu.org
> Date: Fri, 29 Apr 2022 14:47:55 +0200
>
> Khaled Hosny <dr.khaled.hosny@gmail.com> writes:
>
> >> Thanks. Do you see any difference in the results?
> >
> > Strings with forced direction (e.g. Arabic with LRO) showed difference.
> > Without my change they were shaped RTL but drawn LTR; with my change
> > shaping and drawing used LTR direction. Please feel free to revert that
> > change if you think it is incorrect.
>
> (I'm going through old bug reports that unfortunately weren't resolved
> at the time.)
>
> Skimming this long bug report, it seems like the fixes Khaled pushed
> fixed the reported issue (but I may well be misreading).
>
> Eli, is there anything more to do here?
No. The original problem was fixed, and a couple of followup issues
were also fixed.
So I'm closing this bug.
Thread overview: 55+ messages
2018-12-13 20:20 bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n) Kaushal Modi
2018-12-13 20:25 ` Kaushal Modi
2018-12-13 20:31 ` Khaled Hosny
2018-12-13 20:43 ` Kaushal Modi
2018-12-13 20:53 ` Khaled Hosny
2018-12-13 21:04 ` Kaushal Modi
2018-12-14 5:57 ` Eli Zaretskii
2018-12-14 7:48 ` Eli Zaretskii
2018-12-14 7:50 ` Khaled Hosny
2018-12-14 10:03 ` Eli Zaretskii
2018-12-14 11:03 ` Khaled Hosny
2018-12-14 13:42 ` Eli Zaretskii
2018-12-14 15:25 ` Eli Zaretskii
2018-12-17 0:30 ` Glenn Morris
2018-12-17 15:55 ` Eli Zaretskii
2018-12-20 18:58 ` Eli Zaretskii
2018-12-20 20:45 ` Behdad Esfahbod
2018-12-22 8:54 ` Khaled Hosny
2018-12-22 9:06 ` Khaled Hosny
2018-12-22 10:11 ` Eli Zaretskii
2018-12-22 15:15 ` Khaled Hosny
2018-12-22 15:27 ` Behdad Esfahbod
2018-12-22 15:42 ` Khaled Hosny
2018-12-22 15:42 ` Eli Zaretskii
2018-12-22 15:49 ` Khaled Hosny
2018-12-22 16:33 ` Eli Zaretskii
2018-12-22 19:38 ` Eli Zaretskii
2018-12-22 20:59 ` Khaled Hosny
2018-12-23 3:34 ` Eli Zaretskii
2018-12-23 13:51 ` Khaled Hosny
2018-12-23 16:00 ` Eli Zaretskii
2018-12-24 2:08 ` Khaled Hosny
2018-12-24 4:12 ` Kaushal Modi
2018-12-24 16:10 ` Eli Zaretskii
2018-12-24 17:37 ` Khaled Hosny
2018-12-24 18:07 ` Eli Zaretskii
2019-01-05 21:15 ` Khaled Hosny
2019-01-06 16:03 ` Eli Zaretskii
2019-01-27 17:12 ` Eli Zaretskii
2019-01-29 22:25 ` Khaled Hosny
2018-12-29 14:49 ` Eli Zaretskii
2019-01-05 20:53 ` Khaled Hosny
2019-01-05 21:04 ` Khaled Hosny
2019-01-06 17:54 ` Eli Zaretskii
2019-01-27 17:12 ` Eli Zaretskii
2019-01-29 22:33 ` Khaled Hosny
2019-01-06 15:50 ` Eli Zaretskii
2019-01-29 22:29 ` Khaled Hosny
2022-04-29 12:47 ` Lars Ingebrigtsen
2022-04-29 13:24 ` Eli Zaretskii
2019-01-27 17:09 ` Eli Zaretskii
2018-12-24 17:38 ` Benjamin Riefenstahl
2018-12-14 22:47 ` Khaled Hosny
2018-12-16 14:47 ` Benjamin Riefenstahl
2018-12-14 6:45 ` Paul Eggert
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/emacs.git