From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.io!.POSTED.ciao.gmane.io!not-for-mail
From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY))
Date: Wed, 27 May 2020 22:19:05 +0300
Message-ID: <83o8q9rxsm.fsf@gnu.org>
References: <20200517165953.000044d2@web.de> <83lflqblp0.fsf@gnu.org>
 <83ftbybio3.fsf@gnu.org> <83zha69xs2.fsf@gnu.org> <83367x9qeq.fsf@gnu.org>
 <0ccae2a4-533b-d15c-2884-c2f00b067776@gmail.com> <83wo5987mk.fsf@gnu.org>
 <99d4beed-88ae-b5cd-3ecb-a44325c8a1dc@gmail.com>
 <20200518215908.GA57594@breton.holly.idiocy.org> <83mu6481v3.fsf@gnu.org>
 <75a90563-51b4-d3b8-4832-fc0e2542af0d@gmail.com> <83blmi7hys.fsf@gnu.org>
 <837dx55qff.fsf@gnu.org> <834ks95cmz.fsf@gnu.org>
 <4faa291f-f2df-36d1-73d5-332b93a9b6d8@gmail.com> <83wo544hx5.fsf@gnu.org>
 <831rnc43ih.fsf@gnu.org> <83ftbs2jr5.fsf@gnu.org> <83lflj16jn.fsf@gnu.org>
 <83eerb145r.fsf@gnu.org> <831rnb0zld.fsf@gnu.org> <83mu5yzquj.fsf@gnu.org>
 <838shizk35.fsf@gnu.org> <831rn9xs98.fsf@gnu.org> <83mu5utr7j.fsf@gnu.org>
 <83tv01s3lr.fsf@gnu.org>
Cc: emacs-devel@gnu.org
To: Pip Cet
In-Reply-To: (message from Pip Cet on Wed, 27 May 2020 18:42:07 +0000)
List-Id: "Emacs development discussions."
Xref: news.gmane.io gmane.emacs.devel:251518

> From: Pip Cet
> Date: Wed, 27 May 2020 18:42:07 +0000
> Cc: emacs-devel@gnu.org
>
> > What did you compare, exactly? On the one hand, the code you posted
> > here, which took 128 characters around each character to be displayed?
>
> No. Not anything like that code.

Then your numbers cannot be meaningfully reasoned about, because no
one knows what you did.

> There's no reason to believe the composite.c regexp design will
> perform adequately. It doesn't.

I guess in your eyes only your code performs adequately. Sorry, this
means any further discussion with you on these matters is futile. I
regret to have wasted so much time trying to explain how this stuff
works. I will try to be smarter next time when you ask some question.

> (It's not enough. Open emacs -Q etc/HELLO, place point on the lam in
> "aleikum", and hit control-space. The shape changes to something
> incorrect.)

A known limitation of our handling of faces in conjunction with
character composition. Finding the reason is left as an exercise.

> > > - "entering" glyphs, instead of treating them as atomic
> >
> > Why is that needed? A ligature is a single display entity, that's why
> > fonts ligate.
>
> "ffi" is not. When I enter "official" C-a C-f C-f, point MUST be on
> the second f.

That doesn't require producing separate glyphs.

> > It doesn't and it shouldn't! Text of display strings and overlay
> > strings is completely isolated from buffer text, and is even
> > bidi-reordered independently. This is by design.
>
> Unacceptable design for my use case, then.

This is the design of the Emacs display engine. If it doesn't fit
your case, your case cannot be had in Emacs without rewriting the
display code.

> No, it wouldn't be. If two letters appear with no intervening space,
> they need to be kerned and ligated if appropriate, no matter where
> they come from. If people want a ZWNJ, that's perfectly available to
> them.

That's not what display and overlay strings are for in Emacs.

> > Fixed limits and fixed strings are two different things. You can
> > match strings of many different lengths up to a limit.
>
> Which effectively means you can match strings of that limited length.

Except that there's no limit, of course.

> > The 3 previous characters are rarely needed, certainly not for English
> > ligatures, because you can detect the sequence by the first character.
>
> Precisely the same argument applies to my 16-character limit. A script
> in which a glyph depends on something happening 16 codepoints onwards,
> or back, is extremely unlikely.

You are wrong. Please read these:

  https://lists.freedesktop.org/archives/harfbuzz/2020-May/007517.html
  https://lists.freedesktop.org/archives/harfbuzz/2020-May/007521.html

This is what is needed for doing ligatures The Right Way. Collecting
an arbitrary number of codepoints doesn't cut it.

And in any case, I was talking about the need to look _backward_, i.e.
when the character that triggers the composition is not the first one
in the sequence of the characters to be composed. This is usually
needed as an optimization: if you have 2-character sequences where the
second character is one of a much smaller set than the first, then
using the second character as an anchor will use up less memory when
you set up composition-function-table. A case in point is a base
character and a diacritic. How many characters you need _forward_ is
an entirely different issue.

> It needs to be modified, significantly, to support entering glyphs, to
> support kerning, and to support things like ligating across a buffer
> text / display string boundary.

Two of these are not needed or are outright wrong, and the third
doesn't need anything, the shaper already does that with any text you
pass through it.

> But, seriously, you're still willing to argue that point shouldn't be
> able to enter the "ffi" glyph? Not even if the user wants it? Because
> if so, I suggest we interrupt the discussion here.

See above. I indeed see no reason to continue this discussion, as
evidently any progress here is impossible with your attitude in place.
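
The anchor-character optimization described above (base character plus
diacritic) can be illustrated with a toy model. This is only a sketch
in Python, not Emacs code: the table layout (an anchor character mapped
to a list of (pattern, prev_chars) rules) merely imitates the idea of
looking backward from the triggering character, and the names
find_compositions, forward and backward are made up for illustration.

```python
import re

def find_compositions(text, table):
    """Return (start, end) spans that some rule in `table` composes.

    `table` maps an anchor character to a list of (pattern, prev_chars)
    rules. A rule fires when the anchor is seen; its pattern is matched
    starting `prev_chars` characters before the anchor (looking backward).
    """
    spans = []
    for i, ch in enumerate(text):
        for pattern, prev_chars in table.get(ch, ()):
            start = i - prev_chars
            if start < 0:
                continue
            m = re.compile(pattern).match(text, start)
            if m and m.end() > i:  # the match must cover the anchor
                spans.append((start, m.end()))
                break
    return spans

bases = [chr(c) for c in range(ord("a"), ord("z") + 1)]
diacritics = ["\u0300", "\u0301", "\u0308"]  # grave, acute, diaeresis

# Anchor on the first character: one table entry per base letter.
forward = {b: [(re.escape(b) + "[\u0300\u0301\u0308]", 0)] for b in bases}

# Anchor on the second character: one entry per diacritic, each looking
# one character backward.
backward = {d: [("[a-z]" + d, 1)] for d in diacritics}

text = "re\u0301sume\u0301"
print(len(forward), len(backward))        # 26 3
print(find_compositions(text, forward))   # [(1, 3), (6, 8)]
print(find_compositions(text, backward))  # [(1, 3), (6, 8)]
```

Both tables find the same two sequences, but the one anchored on the
diacritic needs 3 entries instead of 26; with a realistic set of base
characters the gap would be far larger.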