From: Kenichi Handa <handa@m17n.org>
To: Yair F <yair.f.lists@gmail.com>
Cc: emacs-devel@gnu.org
Subject: Re: Composing Hebrew diacriticals
Date: Thu, 01 Jul 2010 14:52:23 +0900
Message-ID: <tl7sk434ulk.fsf@m17n.org>
In-Reply-To: <AANLkTim3sQzyJ4YQkOzfRHCFhztgLG-CA2vlM84lbwoq@mail.gmail.com> (message from Yair F on Thu, 1 Jul 2010 00:28:36 +0300)
In article <AANLkTim3sQzyJ4YQkOzfRHCFhztgLG-CA2vlM84lbwoq@mail.gmail.com>, Yair F <yair.f.lists@gmail.com> writes:
> Sorry about that. Please find hebrew-sample2.txt, the source file.
> Arial-anottated.png is this file displayed using Emacs with the Arial font.
> The numbers in red refer to the following comments; the general flow is
> top-to-bottom, right-to-left:
> 1. Shin-Dot should be rendered near the right leg. Currently it is
> rendered above the centre leg; this is unreadable.
> 2. All points below should be horizontally centred relative to the
> base letter. Currently it seems that they are aligned to the left.
> The exception to this rule is letters that have a single leg downward,
> such as ו, ר, ד, ז: the points should be rendered directly under the
> leg for these letters.
> 3. The Shva point touches Qof's leg. The result is unreadable.
> 4. The Dagesh point is hidden within the Shin letter.
> 5. This is not Hebrew, but the combining dot above should be composed
> with the letter A.
> 6. The Holam point should be to the left of the leg, not to the right.
> The result is unreadable.
> 7. The Shuruq point should be to the left of the vav letter, not to the
> right. The result is unreadable.
All of those are glyph positioning problems, and they can be
improved by adding more code to hebrew-shape-gstring.
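To make item 2 above concrete, here is a minimal sketch in C of the
positioning rule it asks for: centre a below-mark under the base, except
for single-leg letters. It is not code from Emacs or Pango. The function
names and the plain-int ink-box parameters are hypothetical, and it
assumes the leg of those letters sits at the right edge of the ink box,
which is a simplification that will not hold for every font.

```c
/* Assumption: these bases have one downward leg, at the right edge
   of their ink box.  Real fonts may place the leg elsewhere. */
static int has_single_right_leg(unsigned int base)
{
    return base == 0x05D3   /* HEBREW LETTER DALET */
        || base == 0x05D5   /* HEBREW LETTER VAV */
        || base == 0x05D6   /* HEBREW LETTER ZAYIN */
        || base == 0x05E8;  /* HEBREW LETTER RESH */
}

/* Hypothetical helper: horizontal offset for a mark drawn below BASE.
   base_x/base_w and mark_x/mark_w are the left edge and width of the
   ink boxes of the base glyph and the mark glyph, in the same units. */
int below_mark_x_offset(unsigned int base,
                        int base_x, int base_w,
                        int mark_x, int mark_w)
{
    /* Narrow mark under a single-leg letter: align right edges. */
    if (has_single_right_leg(base) && mark_w < base_w)
        return base_x + base_w - mark_x - mark_w;

    /* Otherwise centre the mark's ink box under the base's ink box. */
    return base_x - mark_x + base_w / 2 - mark_w / 2;
}
```

For a base whose ink box starts at 50 and is 400 wide, a 100-wide mark
is shifted by 200 under alef (centred) and by 350 under resh (right
aligned).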
> > Anyway, for fonts that don't have OpenType tables for Hebrew
> > script, we can do nothing other than artificially adjusting
> > glyph position.  Have you seen any other application
> > rendering Hebrew well with that Arial font?
> Openoffice and Firefox correctly render Hebrew points.
??? When I open your hebrew-sample2.txt with oowriter and
specify the Arial font, the rendering is almost (exactly?) the
same as that of Emacs (see the attached image).
I confirmed that Firefox (and all applications using
Pango/HarfBuzz, e.g. gedit) do render Hebrew better with
Arial. Reading the Pango code, I found that it has a fallback
shaping engine that is used for fonts without Hebrew GPOS
OpenType tables. The excerpt below is from
pango/module/hebrew-shaper.c. You'll see that it checks
various character combinations and adjusts glyph offsets
accordingly. But the code has many magic numbers
(e.g. 3.5, 0.7, 0.5, 1/3, 3/5, ...). I think it's a dirty,
ad-hoc hack.
Theoretically, it is possible to do the same thing in the
function hebrew-shape-gstring. But is it really worth
doing? Isn't it enough to tell Hebrew users to use
properly designed OpenType fonts?
============================================================
void
hebrew_shaper_get_cluster_kerning(gunichar *cluster,
                                  gint cluster_length,
                                  PangoRectangle ink_rect[],
                                  /* input and output */
                                  gint width[],
                                  gint x_offset[],
                                  gint y_offset[])
{
  int i;
  int base_ink_x_offset, base_ink_y_offset, base_ink_width, base_ink_height;
  gunichar base_char = cluster[0];

  x_offset[0] = 0;
  y_offset[0] = 0;

  if (cluster_length == 1)
    {
      /* Make lone 'vav dot' have zero width */
      if (base_char == UNI_SHIN_DOT
          || base_char == UNI_SIN_DOT
          || base_char == UNI_HOLAM
          ) {
        x_offset[0] = -ink_rect[0].x - ink_rect[0].width;
        width[0] = 0;
      }
      return;
    }

  base_ink_x_offset = ink_rect[0].x;
  base_ink_y_offset = ink_rect[0].y;
  base_ink_width = ink_rect[0].width;
  base_ink_height = ink_rect[0].height;

  /* Do heuristics */
  for (i=1; i<cluster_length; i++)
    {
      int gl = cluster[i];
      x_offset[i] = 0;
      y_offset[i] = 0;

      /* Check if it is a point */
      if (gl < 0x5B0 || gl >= 0x05D0)
        continue;

      /* Center dot of VAV */
      if (gl == UNI_MAPIQ && base_char == UNI_VAV)
        {
          x_offset[i] = base_ink_x_offset - ink_rect[i].x;

          /* If VAV is a vertical bar without a roof, then we
             need to make room for the dot by increasing the
             cluster width. But how can I check if that is the
             case??
          */
          /* This is wild, but it does the job of differentiating
             between two M$ fonts... Base the decision on the
             aspect ratio of the vav...
          */
          if (base_ink_height > base_ink_width * 3.5)
            {
              int j;
              double space = 0.7;
              double kern = 0.5;

              /* Shift all characters to make place for the mapiq */
              for (j=0; j<i; j++)
                x_offset[j] += ink_rect[i].width*(1+space-kern);

              width[cluster_length-1] += ink_rect[i].width*(1+space-kern);
              x_offset[i] -= ink_rect[i].width*(kern);
            }
        }

      /* Dot over SHIN */
      else if (gl == UNI_SHIN_DOT && base_char == UNI_SHIN)
        {
          x_offset[i] = base_ink_x_offset + base_ink_width
            - ink_rect[i].x - ink_rect[i].width;
        }

      /* Dot over SIN */
      else if (gl == UNI_SIN_DOT && base_char == UNI_SHIN)
        {
          x_offset[i] = base_ink_x_offset - ink_rect[i].x;
        }

      /* VOWEL DOT above to any other character than
         SHIN or VAV should stick out a bit to the left. */
      else if ((gl == UNI_SIN_DOT || gl == UNI_HOLAM)
               && base_char != UNI_SHIN && base_char != UNI_VAV)
        {
          x_offset[i] = base_ink_x_offset - ink_rect[i].x - ink_rect[i].width * 3 / 2;
        }

      /* VOWELS under resh or vav are right aligned, if they are
         narrower than the characters. Otherwise they are centered.
      */
      else if ((base_char == UNI_VAV
                || base_char == UNI_RESH
                || base_char == UNI_YOD
                || base_char == UNI_DALED
                )
               && ((gl >= UNI_SHEVA && gl <= UNI_QAMATS) ||
                   gl == UNI_QUBUTS)
               && ink_rect[i].width < base_ink_width
               )
        {
          x_offset[i] = base_ink_x_offset + base_ink_width
            - ink_rect[i].x - ink_rect[i].width;
        }

      /* VOWELS under FINAL KAF are offset centered and offset in
         y */
      else if ((base_char == UNI_FINAL_KAF
                )
               && ((gl >= UNI_SHEVA && gl <= UNI_QAMATS) ||
                   gl == UNI_QUBUTS))
        {
          /* x are at 1/3 to take into account the stem */
          x_offset[i] = base_ink_x_offset - ink_rect[i].x
            + base_ink_width * 1/3 - ink_rect[i].width/2;

          /* Center in y */
          y_offset[i] = base_ink_y_offset - ink_rect[i].y
            + base_ink_height * 1/2 - ink_rect[i].height/2;
        }

      /* MAPIQ in PE or FINAL PE */
      else if (gl == UNI_MAPIQ
               && (base_char == UNI_PE || base_char == UNI_FINAL_PE))
        {
          x_offset[i]= base_ink_x_offset - ink_rect[i].x
            + base_ink_width * 2/3 - ink_rect[i].width/2;

          /* Another option is to offset the MAPIQ in y...
             glyphs->glyphs[cluster_start_idx+i].geometry.y_offset
             -= base_ink_height/5; */
        }

      /* MAPIQ in SHIN should be moved a bit to the right */
      else if (gl == UNI_MAPIQ
               && base_char == UNI_SHIN)
        {
          x_offset[i]= base_ink_x_offset - ink_rect[i].x
            + base_ink_width * 3/5 - ink_rect[i].width/2;
        }

      /* MAPIQ in YUD is right aligned */
      else if (gl == UNI_MAPIQ
               && base_char == UNI_YOD)
        {
          x_offset[i]= base_ink_x_offset - ink_rect[i].x;

          /* Lower left in y */
          y_offset[i] = base_ink_y_offset - ink_rect[i].y
            + base_ink_height - ink_rect[i].height*1.75;

          if (base_ink_height > base_ink_width * 2)
            {
              int j;
              double space = 0.7;
              double kern = 0.5;

              /* Shift all cluster characters to make space for mapiq */
              for (j=0; j<i; j++)
                x_offset[j] += ink_rect[i].width*(1+space-kern);

              width[cluster_length-1] += ink_rect[i].width*(1+space-kern);
            }
        }

      /* VOWEL DOT next to any other character */
      else if ((gl == UNI_SIN_DOT || gl == UNI_HOLAM)
               && (base_char != UNI_VAV))
        {
          x_offset[i] = base_ink_x_offset - ink_rect[i].x;
        }

      /* Move nikud of taf a bit ... */
      else if (base_char == UNI_TAV && gl == UNI_MAPIQ)
        {
          x_offset[i] = base_ink_x_offset - ink_rect[i].x
            + base_ink_width * 5/8 - ink_rect[i].width/2;
        }

      /* Move center dot of characters with a right stem and no
         left stem. */
      else if (gl == UNI_MAPIQ &&
               (base_char == UNI_BET
                || base_char == UNI_DALED
                || base_char == UNI_KAF
                || base_char == UNI_GIMMEL
                ))
        {
          x_offset[i] = base_ink_x_offset - ink_rect[i].x
            + base_ink_width * 3/8 - ink_rect[i].width/2;
        }

      /* Right align wide nikud under QOF */
      else if (base_char == UNI_QOF &&
               ( (gl >= UNI_HATAF_SEGOL
                  && gl <= UNI_HATAF_QAMATZ)
                 || (gl >= UNI_TSERE
                     && gl <= UNI_QAMATS)
                 || (gl == UNI_QUBUTS)))
        {
          x_offset[i] = base_ink_x_offset + base_ink_width
            - ink_rect[i].x - ink_rect[i].width;
        }

      /* Center by default */
      else
        {
          x_offset[i] = base_ink_x_offset - ink_rect[i].x
            + base_ink_width/2 - ink_rect[i].width/2;
        }
    }
}
============================================================
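One way to make the magic numbers above less scattered, sketched here
only as an illustration, is to collect the per-letter fractions for the
centre dot (dagesh/mapiq) into a table. The fractions are the ones that
appear in the excerpt (3/8, 2/3, 3/5, 5/8, and 1/2 as the default). The
struct, table, and function names are hypothetical, the code points are
the standard Unicode values rather than Pango's UNI_* macros, and the
special handling of VAV and YOD in the excerpt is deliberately left out.

```c
#include <stddef.h>

/* Hypothetical table entry: the centre of the dot is placed at
   num/den of the base glyph's ink width, measured from its left edge. */
struct dot_anchor {
    unsigned int base;
    int num, den;
};

static const struct dot_anchor dot_anchors[] = {
    { 0x05D1, 3, 8 },  /* BET */
    { 0x05D2, 3, 8 },  /* GIMEL */
    { 0x05D3, 3, 8 },  /* DALET */
    { 0x05DB, 3, 8 },  /* KAF */
    { 0x05E3, 2, 3 },  /* FINAL PE */
    { 0x05E4, 2, 3 },  /* PE */
    { 0x05E9, 3, 5 },  /* SHIN */
    { 0x05EA, 5, 8 },  /* TAV */
};

/* Horizontal offset for a centre dot inside BASE.  base_x/base_w and
   dot_x/dot_w are the left edge and width of the two ink boxes.
   Uses the same integer arithmetic as the excerpt. */
int center_dot_x_offset(unsigned int base,
                        int base_x, int base_w,
                        int dot_x, int dot_w)
{
    int num = 1, den = 2;  /* default: centre of the base */
    size_t i;

    for (i = 0; i < sizeof dot_anchors / sizeof dot_anchors[0]; i++)
        if (dot_anchors[i].base == base)
            {
                num = dot_anchors[i].num;
                den = dot_anchors[i].den;
                break;
            }

    return base_x - dot_x + base_w * num / den - dot_w / 2;
}
```

The numbers are still font-independent guesses, so this only tidies the
heuristic; it does not make it any less ad hoc.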
> The poetry site
> you mentioned, http://www.zemer.co.il/song.asp?id=393, uses David and
> is rendered correctly.
> Kate (using pango?) also renders better with Arial and David-CLM. It has
> some other issues though, but the result is mostly readable.
As Kate is a KDE application, I think it's not using Pango.
But if it renders Hebrew well with Arial, it (or the rendering
module of KDE/Qt) must have similar ad-hoc code.
---
Kenichi Handa
handa@m17n.org
[-- Attachment #2: oowriter-arial.png --]
[-- Type: image/png, Size: 79797 bytes --]