From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#20140: 24.4; M17n shaper output rejected
Date: Mon, 14 Feb 2022 15:19:36 +0200
Message-ID: <83mtitpouv.fsf@gnu.org>
References: <20150318222040.4066e6e9@JRWUBU2> <87r18jk5nr.fsf@gnus.org> <83v8xv2icg.fsf@gnu.org> <20220205225251.08a0faab@JRWUBU2> <831r06rbwk.fsf@gnu.org> <20220213205310.0b8a715c@JRWUBU2>
In-Reply-To: <20220213205310.0b8a715c@JRWUBU2> (message from Richard Wordingham on Sun, 13 Feb 2022 20:53:10 +0000)
To: Richard Wordingham
Cc: 20140@debbugs.gnu.org, larsi@gnus.org

> Date: Sun, 13 Feb 2022 20:53:10 +0000
> From: Richard Wordingham
> Cc: larsi@gnus.org, 20140@debbugs.gnu.org
>
> On Sun, 13 Feb 2022 18:04:11 +0200
> Eli Zaretskii wrote:
>
> > But that didn't seem to work well enough: e.g., some marks in your
> > "sample text" didn't combine with letters, as I think they should.
>
> Which ones?

Sorry, that was my faulty testing: I tested a half-baked change.  Your
rules do work correctly, AFAICT.

But I have 2 questions:

 1) Why do we need this part of the composition rules:

      (vector "." 0 'font-shape-gstring)

    This matches just one character, so what do we want to accomplish
    by this rule?  A single character cannot "self-compose", can it?

 2) Since tai-tham-composable-pattern always starts with what you
    denote as "C", how about setting up only entries of
    composition-function-table that correspond to those characters,
    i.e.:

      (let ((elt (list (vector tai-tham-composable-pattern
                               0
                               'font-shape-gstring))))
        (set-char-table-range composition-function-table '(#x1A20 . #x1A54) elt)
        (set-char-table-range composition-function-table '(#x1A80 . #x1A89) elt)
        (set-char-table-range composition-function-table '(#x1A90 . #x1A99) elt)
        (set-char-table-range composition-function-table '(#x1AA0 . #x1AAD) elt))

    Do you see any problems with that?

> I did suspect the problem was writing '\u1A7C' instead of
> '\u1a7c', but I'm no longer so sure.

No, that's not a problem.
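
To illustrate the two points above, here is a minimal sketch that can be evaluated in *scratch*.  The first form shows that the Lisp reader treats hex digits in \u escapes case-insensitively.  The second applies the narrower setup from question 2 to a scratch copy of composition-function-table, so the live table is not modified.  The regexp used here is only a hypothetical stand-in, since the real tai-tham-composable-pattern is not shown in this message.

```elisp
;; Hex digits in \u escapes are case-insensitive to the Lisp reader,
;; so both spellings denote the same character.
(string= "\u1A7C" "\u1a7c")             ; evaluates to t

;; Try the narrower setup on a scratch copy of the table.
(let ((table (copy-sequence composition-function-table))
      ;; HYPOTHETICAL placeholder pattern, NOT the real
      ;; tai-tham-composable-pattern.
      (elt (list (vector "[\u1A20-\u1A54][\u1A55-\u1A7F]*"
                         0
                         'font-shape-gstring))))
  (dolist (range '((#x1A20 . #x1A54) (#x1A80 . #x1A89)
                   (#x1A90 . #x1A99) (#x1AA0 . #x1AAD)))
    (set-char-table-range table range elt))
  ;; A character inside the ranges gets the new entry; a mark such as
  ;; U+1A7C, which is outside them, does not.
  (list (eq (aref table #x1A20) elt)
        (eq (aref table #x1A7C) elt)))  ; evaluates to (t nil)
```

With such a setup, a composition would be attempted only when one of the characters in the four ranges is encountered; whether that covers every case the original rules handle is exactly what question 2 asks.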

> You should also add CGJ and ZWNJ, and some people may appreciate ZWJ -
> the Khottabun font has ligatures involving ZWJ, though it may just be
> an experimental feature - and ultimately WJ, for when someone writes a
> Tai Tham word breaker.

How should I add CGJ and ZWNJ?  What are the rules?

> Oh, and Thai and Lao mai t(r)i and mai chat(t)awa and U+0324
> COMBINING DIAERESIS BELOW turn up occasionally - U+0324 is supported
> in Thep's Khottabun font, and my Da Lekh series supports Thai mai
> tri and mai chattawa.  These characters seem to work with HarfBuzz.

Not sure I understand: what patterns/rules should be added for these?

> If using the native Windows renderer is an option with Emacs, then 'A
> Tai Tham KH New' works better than 'A Tai Tham KH New V3'.

We still support Uniscribe, but prefer HarfBuzz, because MS deprecated
Uniscribe.  We cannot support DirectWrite, because its APIs are
C++-only, and no one has shown whether and how to call them from C.

> > Btw, is there a way to get all the examples from your
> > https://wrdingham.co.uk/lanna/renderer_test.htm as a UTF-8 encoded
> > text file?  I'd like to test the Emacs rendering with all of the
> > examples, but copy-pasting each example separately from the browser is
> > not my idea of useful time investment.  So if you could provide the
> > examples as a downloadable text file, I'd appreciate.
>
> As buried (you're not the only one to have overlooked it) in the
> penultimate paragraph of 'Content and Layout' section, "The test words
> may, in principle, be extracted quite simply from this web page.  Each
> test 'word' is the content of the first cell in each row whose class is
> tst1.  For convenience*, I have extracted the first two cells in such
> rows, along with titles, to a CSV file."  The file is rt.csv in the
> same directory.

Thanks, I will use that.