all messages for Emacs-related lists mirrored at yhetil.org
From: Richard Wordingham <richard.wordingham@ntlworld.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: 20173@debbugs.gnu.org
Subject: bug#20173: 24.4; Rendering misallocates combining marks on ligatures
Date: Tue, 24 Mar 2015 08:28:28 +0000
Message-ID: <20150324082828.6bad0649@JRWUBU2>
In-Reply-To: <837fu7qcx1.fsf@gnu.org>

On Tue, 24 Mar 2015 05:42:18 +0200
Eli Zaretskii <eliz@gnu.org> wrote:

> If the setting of composition
> rules for Arabic is not the culprit, then what is?  AFAIK, there are
> no rules that guide Emacs's shaping except what's in
> composition-function-table.  Beyond that, the only other factor is the
> font backend and how it shapes glyphs given the chunks of text Emacs
> presents to it.

The font backend on Unixy systems consists of three components - m17n
(shaping control), libotf (OTL look-up implementation) and FreeType
(glyph rendering).  The glue between them is in Emacs,
most relevantly in function ftfont_drive_otf() in ftfont.c.

My analysis of the problem, which could quite easily be wrong, is as
follows.  To control the positioning of marks for the mark2ligature
lookup, it is necessary to record in some fashion which component of
the ligature a mark applies to.  I cannot see this information being
stored.  The information should be generated and used by libotf, but
needs to be stored between callbacks of ftfont_drive_otf() by m17n.
(The initial settings are implicit in the sequence of codepoints.)
Storing this information would, so far as I can see, require a change to
ftfont_drive_otf().
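
To make the missing bookkeeping concrete, here is a minimal sketch in
Python of what "recording which component a mark applies to" could look
like.  It is not the libotf, m17n or Emacs code; the Glyph fields
(lig_id, lig_comp), the function names and the glyph names are all
hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    name: str
    is_mark: bool = False
    lig_id: int = 0    # hypothetical: 0 means "not attached to a ligature"
    lig_comp: int = 0  # hypothetical: 1-based component index, 0 if unset


def form_ligature(glyphs, start, components, ligature, lig_id):
    """Replace the base glyphs named in `components` by one ligature glyph.

    Marks found between or after the components are kept, and each one is
    tagged with the index of the component it followed.
    """
    remaining = list(components)
    marks = []
    comp = 0
    i = start
    while i < len(glyphs):
        g = glyphs[i]
        if g.is_mark and comp > 0:
            # Remember which component this mark belongs to.
            marks.append(Glyph(g.name, True, lig_id, comp))
        elif remaining and g.name == remaining[0]:
            remaining.pop(0)
            comp += 1
        else:
            break
        i += 1
    if remaining:
        return list(glyphs)  # components not all present: no substitution
    lig = Glyph(ligature, False, lig_id, 0)
    return list(glyphs[:start]) + [lig] + marks + list(glyphs[i:])


def pick_anchor(component_anchors, mark):
    """Choose the ligature anchor for a mark from its recorded component."""
    return component_anchors[mark.lig_comp - 1]
```

With the input na, m1, aa, m2 and the components na and aa, the result
is the ligature followed by m1 tagged with component 1 and m2 tagged
with component 2, so a mark2ligature lookup can pick a different anchor
for each.  Without the recorded index, both marks would have to be
placed against the same component.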

I may be able to change my font to work round this bug; I can certainly
change it to hide the symptom I observed.  The solution will be to
categorise the ligature NAA <U+1A36, U+1A63> as a base glyph rather
than as a ligature glyph.

There are other places where the HarfBuzz rendering system, which aims
to be compatible with Windows, uses this information.  In particular,
marks applied to a ligature are only allowed to ligate if they apply to
the same component of a ligature, and mark2mark positioning only
applies if the two marks apply to the same component.  This logic is
described as 'the most tricky part of the OpenType specification'.
Part of the trickiness may be that it seems not to have been
published externally (possibly not even internally) by Microsoft.  The
guiding principle seems to be that one should do the right things to the
marks on a ligature of Arabic consonants.
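
As a rough illustration of the "same component" rule just described,
the following Python sketch filters adjacent marks for mark2mark
positioning.  It is a simplified model of the rule as I have stated it,
not HarfBuzz's actual code, and the Mark type and its fields are
hypothetical.

```python
from typing import NamedTuple


class Mark(NamedTuple):
    name: str
    lig_id: int    # hypothetical: identifies the ligature the mark sits on
    lig_comp: int  # hypothetical: 1-based component of that ligature


def same_component(a, b):
    """True if both marks apply to the same component of the same ligature."""
    return a.lig_id == b.lig_id and a.lig_comp == b.lig_comp


def eligible_mark_pairs(marks):
    """Return the adjacent mark pairs to which mark2mark may be applied."""
    return [(a, b) for a, b in zip(marks, marks[1:]) if same_component(a, b)]
```

Under this model, a mark on the first component of a ligature never
interacts with a mark on the second, which is exactly what goes wrong
for a ligature whose components' marks do need to interact.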

I have become well-acquainted with this logic because the 'same
component logic' seems to be applied by HarfBuzz regardless of whether
the marks are preceded by a base glyph or a ligature glyph.  The
Windows logic seems similar, but is subtly different.  I hit problems
with the Tai Tham NAA ligature, because the marks above on its two
components do interact.  The marks below should probably also interact,
but combinations where I would expect them to have to interact seem not
to occur in natural text.

> > As to what needs fixing in the Arabic section of misc-lang.el:

> Thanks, I will look into these.

You might want to first check whether composed Arabic is
usable.  Doesn't making each word a grapheme cluster make editing
unpleasant?  It might be worth restricting the clustering to
cursively connected sequences of letters within a word.
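
A sketch of the kind of splitting I mean, in Python rather than as
composition rules for Emacs.  The RIGHT_JOINING set is a deliberately
incomplete, illustrative list of letters that do not connect to the
following letter; a real implementation would take the joining types
from the Unicode Character Database instead.

```python
import unicodedata

# Illustrative subset only: alef, dal, thal, reh, zain, waw.
RIGHT_JOINING = set("\u0627\u062f\u0630\u0631\u0632\u0648")


def connected_runs(word):
    """Split a word into cursively connected runs of letters.

    A run ends after a letter that does not join to its successor.
    Nonspacing marks stay with the letter they follow.
    """
    runs = []
    current = ""
    prev_breaks = False
    for ch in word:
        is_mark = unicodedata.category(ch) == "Mn"
        if current and prev_breaks and not is_mark:
            runs.append(current)
            current = ""
        current += ch
        if not is_mark:
            prev_breaks = ch in RIGHT_JOINING
    if current:
        runs.append(current)
    return runs
```

For the four-letter word kaf, teh, alef, beh this gives two runs, the
first three letters and then the final beh, so the cursor could stop
between them.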

Richard.





