unofficial mirror of bug-gnu-emacs@gnu.org 
From: Richard Wordingham via "Bug reports for GNU Emacs, the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org>
To: Eli Zaretskii <eliz@gnu.org>
Cc: 20140@debbugs.gnu.org, larsi@gnus.org
Subject: bug#20140: 24.4; M17n shaper output rejected
Date: Mon, 14 Feb 2022 23:26:23 +0000
Message-ID: <20220214232623.30534d5a@JRWUBU2>
In-Reply-To: <83leydpok0.fsf@gnu.org>

On Mon, 14 Feb 2022 15:26:07 +0200
Eli Zaretskii <eliz@gnu.org> wrote:

> > Date: Sun, 13 Feb 2022 21:11:52 +0000
> > From: Richard Wordingham <richard.wordingham@ntlworld.com>
> > Cc: larsi@gnus.org, 20140@debbugs.gnu.org

> No, that's not true.  I'm not aware of any such limitation; AFAIK
> Arabic shaping works correctly in Emacs, certainly with HarfBuzz and
> Emacs 27 or later.
> 
> Or maybe I misunderstand what you mean by "typewriter-like" fonts?
> Can you give an example of a non-typewriter-like font for Arabic that
> I can find on MS-Windows and try?

Not off the top of my head, but compare لحج with the presentation form
‎ﳊ U+FCCA ARABIC LIGATURE LAM WITH HAH INITIAL FORM for the first two
letters.  The lam part is a vertical line in the middle of the glyph;
the 'hah' part forms the lower part of the glyph.
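
To see which letters the presentation form stands for, here is a small
Python sketch using only the standard unicodedata module.  The WORD
constant is my assumption about the code points of the three-letter
example above (lam, hah, jeem); the names WORD, LIGATURE and describe
are just illustrative.

```python
import unicodedata

# Assumed code points of the example word: lam, hah, jeem.
WORD = "\u0644\u062D\u062C"
# The presentation form mentioned above.
LIGATURE = "\uFCCA"


def describe(text):
    """Return (code point, character name) pairs for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


# The compatibility decomposition of the ligature yields the plain letters
# that a shaper would receive when the word is typed normally.
decomposed = unicodedata.normalize("NFKC", LIGATURE)

for code, name in describe(LIGATURE) + describe(decomposed):
    print(code, name)
```

This only shows the character-level relationship; whether a given font
actually draws the first two letters in the stacked shape is up to the
font and the shaper.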

> > There would be a similar problem with the use of Tai Khuen or other
> > tunnelling fonts for Northern Thai if you used the current mechanism
> > for advancing character by character.  Tunnelling fonts write parts
> > of one cluster under the next.  The Tai Khuen fonts I've seen do
> > this by relying on characteristics of Tai Khuen spelling.  The
> > rules don't hold for Northern Thai, and consequently the subscript
> > portions of successive orthographic syllables can overwrite one
> > another.  A sophisticated font could check for clashes, but that
> > needs the orthographic syllables to be passed to the shaper
> > together.  
> 
> I'm not sure I understand.  Does HarfBuzz know about these advancement
> features?  We rely on HarfBuzz to give us back as many grapheme
> clusters as it sees fit for a given chunk of text, and we expect each
> grapheme cluster to include glyphs with relative offsets as needed by
> the script and the font.

No, the fonts rely on the grammar of Tai Khuen.  If an orthographic
syllable contains U+1A6C TAI THAM VOWEL SIGN OA BELOW, there will be a
following orthographic syllable in the same phonetic syllable, and
it will consist of a single consonant with no tail and possibly some
marks above.  The font designers therefore do not worry about the
effect on the advance width; there will be room for U+1A6C below the
next orthographic syllable.  If you want to see details now, enter
ᩉ᩠ᨾᩬᩁ ᩉ᩠ᨾᩳᨶᩥ᩠ᨯ ᩉ᩠ᨾᩬᩴᨶᩥ᩠ᨯ in the 'Play Area' text box of
https://wrdingham.co.uk/lanna/renderer_test.htm.  The first word is
spelt the same in Northern Thai and Tai Khuen.  As you switch the font
from Lamphun to A Tai Tham KH (with ccmp enabled if you are using IE
11), the glyphs at the bottom of the word spread out to use the
available space.  The next two words are 'Dr Nit' written in Tai Khuen
and Northern Thai.  The word for 'Dr', /mɔː/, is spelt quite
differently in the two languages, though the consonants are the same.
Both have a vowel above, but the Northern Thai also has U+1A6C below,
as in the first word.  When A Tai Tham KH is selected as the font, the
U+1A6C clashes badly with the bottom of the second syllable, 'Nit'.

This phenomenon of a vowel below expanding below the next consonant
also occurs in Northern Thai, but I don't know of any Northern Thai
font that is clever enough to do this, because checking for space below
the next consonant is fiddly.
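
As a rough illustration of the check involved, here is a toy Python
model.  It is not how any font or HarfBuzz does it; the Cluster type,
the find_clashes function and all the numbers are hypothetical, chosen
only to mimic the two cases described above.  Each orthographic
syllable has an advance width and an optional horizontal extent of ink
below the baseline, which may run past the advance (the tunnelling).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Cluster:
    name: str
    advance: int  # horizontal advance, in invented font units
    # (left, right) extent of below-base ink relative to the cluster
    # origin; right may exceed advance when the ink tunnels onward.
    below_ink: Optional[Tuple[int, int]] = None


def find_clashes(clusters: List[Cluster]) -> List[Tuple[str, str]]:
    """Return pairs of adjacent clusters whose below-base ink overlaps."""
    clashes = []
    origin = 0
    previous = None  # (name, absolute left, absolute right) or None
    for cluster in clusters:
        current = None
        if cluster.below_ink is not None:
            left, right = cluster.below_ink
            current = (cluster.name, origin + left, origin + right)
            if (previous is not None
                    and current[1] < previous[2]
                    and previous[1] < current[2]):
                clashes.append((previous[0], cluster.name))
        previous = current
        origin += cluster.advance
    return clashes


# Tai Khuen-like case: the next syllable has nothing below, so the
# tunnelled vowel has room.
khuen_like = [Cluster("hmoa", 10, (2, 16)), Cluster("ra", 8)]
# Northern Thai-like case: the next syllable has its own subscript.
northern_like = [Cluster("moa", 10, (2, 16)), Cluster("nit", 10, (1, 9))]

print(find_clashes(khuen_like))
print(find_clashes(northern_like))
```

The point of the sketch is that the check needs both syllables at
once: given only one cluster at a time, there is nothing to compare
against.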

> IOW, this job is delegated to the shaping engine, such as HarfBuzz;
> Emacs just takes the glyphs and offsets HarfBuzz gives us and blindly
> obeys them.

The first problem is that font writers tend to make assumptions about
the language their font will be used for.  The second is that with a good
tunnelling font, HarfBuzz needs to know what comes in the next
syllable.  At present, using a tunnelling font for Tai Tham risks
clashes when used with Emacs.  The Tai Khuen fonts look good, but are
not suitable for writing Northern Thai.

Richard.




