From: Richard Wordingham via "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Newsgroups: gmane.emacs.bugs
Subject: bug#20140: 24.4; M17n shaper output rejected
Date: Mon, 14 Feb 2022 23:26:23 +0000
Message-ID: <20220214232623.30534d5a@JRWUBU2>
In-Reply-To: <83leydpok0.fsf@gnu.org>
To: Eli Zaretskii
Cc: 20140@debbugs.gnu.org, larsi@gnus.org

On Mon, 14 Feb 2022 15:26:07 +0200
Eli Zaretskii wrote:

> > Date: Sun, 13 Feb 2022 21:11:52 +0000
> > From: Richard Wordingham
> > Cc: larsi@gnus.org, 20140@debbugs.gnu.org

> No, that's not true.  I'm not aware of any such limitation; AFAIK
> Arabic shaping works correctly in Emacs, certainly with HarfBuzz and
> Emacs 27 or later.
>
> Or maybe I misunderstand what you mean by "typewriter-like" fonts?
> Can you give an example of a non-typewriter-like font for Arabic that
> I can find on MS-Windows and try?

Not off the top of my head, but compare لحج with the presentation
form ﳊ U+FCCA ARABIC LIGATURE LAM WITH HAH INITIAL FORM for the first
two letters.  The lam part is a vertical line in the middle of the
glyph; the 'hah' part forms the lower part of the glyph.

> > There would be a similar problem with the use of Tai Khuen or other
> > tunnelling fonts for Northern Thai if you used the current mechanism
> > for advancing character by character.  Tunnelling fonts write parts
> > of one cluster under the next.  The Tai Khuen fonts I've seen do
> > this by relying on characteristics of Tai Khuen spelling.  The
> > rules don't hold for Northern Thai, and consequently the subscript
> > portions of successive orthographic syllables can overwrite one
> > another.  A sophisticated font could check for clashes, but that
> > needs the orthographic syllables to be passed to the shaper
> > together.
>
> I'm not sure I understand.  Does HarfBuzz know about these advancement
> features?  We rely on HarfBuzz to give us back as many grapheme
> clusters as it sees fit for a given chunk of text, and we expect each
> grapheme cluster to include glyphs with relative offsets as needed by
> the script and the font.

No, the fonts rely on the grammar of Tai Khuen.  If an orthographic
syllable contains U+1A6C TAI THAM VOWEL SIGN OA BELOW, there will be a
following orthographic syllable in the same phonetic syllable, and it
will consist of a single consonant with no tail and possibly some marks
above.  The font designers therefore do not worry about the effect on
the advance width; there will be room for U+1A6C below the next
orthographic syllable.

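The check that such a font leaves out could be sketched roughly as
follows.  This is only an illustration under stated assumptions: the
function name, the hand-made split into orthographic syllables and the
short list of "below" characters are all hypothetical, and code points
alone cannot tell whether a consonant glyph has a tail.

```python
OA_BELOW = "\u1A6C"  # TAI THAM VOWEL SIGN OA BELOW
SAKOT = "\u1A60"     # TAI THAM SIGN SAKOT; subjoins the next consonant

# Assumption: a deliberately incomplete set of characters that put ink
# below the base consonant.  A real check would need glyph metrics.
OCCUPIES_BELOW = {SAKOT, "\u1A69", "\u1A6A", OA_BELOW}


def clashing_syllables(syllables):
    """Return indices of syllables whose U+1A6C may hit the next one.

    `syllables` is a list of orthographic syllables, already segmented
    by the caller (segmentation itself is not attempted here).
    """
    risky = []
    for i, syl in enumerate(syllables[:-1]):
        if OA_BELOW not in syl:
            continue
        nxt = syllables[i + 1]
        # Tai Khuen spelling promises a bare consonant here; anything
        # written below it breaks the font's assumption.
        if any(ch in OCCUPIES_BELOW for ch in nxt):
            risky.append(i)
    return risky


# Northern Thai spelling: U+1A6C followed by a syllable with a subscript.
northern_thai = ["\u1A49\u1A60\u1A3E\u1A6C\u1A74", "\u1A36\u1A65\u1A60\u1A2F"]
# Spelling shared by both languages: U+1A6C followed by a bare consonant.
shared_word = ["\u1A49\u1A60\u1A3E\u1A6C", "\u1A41"]

print(clashing_syllables(northern_thai))
print(clashing_syllables(shared_word))
```

The point of the sketch is only that the test needs both syllables at
once, which is why they have to reach the shaper together.
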
If you want to see details now, enter ᩉ᩠ᨾᩬᩁ ᩉ᩠ᨾᩳᨶᩥ᩠ᨯ ᩉ᩠ᨾᩬᩴᨶᩥ᩠ᨯ in the
'Play Area' text box of
https://wrdingham.co.uk/lanna/renderer_test.htm.

The first word is spelt the same in Northern Thai and Tai Khuen.  As
you switch the font from Lamphun to A Tai Tham KH (with ccmp enabled if
you are using IE 11), the glyphs at the bottom of the word spread out
to use the available space.

The next two words are 'Dr Nit' written in Tai Khuen and Northern
Thai.  The word for 'Dr', /mɔː/, is spelt quite differently in the two
languages, though the consonants are the same.  Both have a vowel
above, but the Northern Thai also has U+1A6C below, as in the first
word.  When A Tai Tham KH is selected as the font, the U+1A6C clashes
badly with the bottom of the second syllable, 'Nit'.

This phenomenon of a vowel below expanding below the next consonant
also occurs in Northern Thai, but I don't know of any Northern Thai
font that is clever enough to do this, because checking for space below
the next consonant is fiddly.

> IOW, this job is delegated to the shaping engine, such as HarfBuzz;
> Emacs just takes the glyphs and offsets HarfBuzz gives us and blindly
> obeys them.

The first problem is that font writers tend to make assumptions about
the language their font will be used for.  The second is that with a
good tunnelling font, HarfBuzz needs to know what comes in the next
syllable.  At present, using a tunnelling font for Tai Tham in Emacs
risks clashes.  The Tai Khuen fonts look good, but are not suitable for
writing Northern Thai.

Richard.