From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY))
Date: Thu, 21 May 2020 17:11:00 +0300
Message-ID: <837dx55qff.fsf@gnu.org>
Cc: cpitclaudel@gmail.com, alan@idiocy.org, emacs-devel@gnu.org
To: Pip Cet
In-Reply-To: (message from Pip Cet on Thu, 21 May 2020 10:01:03 +0000)

> From: Pip Cet
> Date: Thu, 21 May 2020 10:01:03 +0000
> Cc: cpitclaudel@gmail.com, alan@idiocy.org, emacs-devel@gnu.org
>
> > If we only want this feature for ASCII ligatures, then it sounds like
> > a limitation to me (and frankly, somewhat unclean as features go),
>
> Not "only for ASCII ligatures", but not "any conceivable combination
> of codepoints into glyphs" either. Just those supported by the font
> and Harfbuzz.
>
> > but
> > if we really want this only for these limited cases, we will need to
> > somehow indicate to the display engine which ligatures are to be
> > handled like this and which aren't.
>
> Well, we now know that fonts can provide information about how a
> ligature is to be split into one-dimensional slices;

The question is: do we want to show those carets for all the character
compositions, even if the information is provided? If not, we will
have to indicate somehow whether they should or shouldn't be shown for
each particular grapheme cluster.

> Of course that means that Emacs behavior would depend on the font
> tables in ways it currently doesn't. That's a problem.

It isn't a problem to depend on that if most fonts provide this
information. Then we could simply say this is not supported when the
information is not in the font.
But if many fonts that support ligatures don't provide this
information, we will need to have some fallback, such as assuming that
every codepoint has the same share of the ligature's width. The fact
that other applications use a simplistic heuristic and not the
information in the fonts suggests that either the information is not
readily available or there are some other problems with using it.

> > Right, the actual implementation will have to be different. In
> > particular, I think that if ligatures will use automatic compositions,
> > the information you need is already stored in the composition table
> > and reachable from the glyph string, so you don't need to invoke the
> > shaper again.
>
> Well, I'm sorry to bring up a different (though somewhat related
> issue), but kerning is also an issue: we need a shaper to get that
> right, not just a composition table, right?

Automatic compositions already use the shaper, see autocmp_chars.

> > I see you implemented this for static compositions, which are
> > semi-obsolete.
>
> I'm sorry, I'm afraid I don't understand. This should handle any
> composition the shaper does, and only those, but slices up everything
> horizontally by default.

I'm talking about the changes in gui_produce_glyphs. Its high-level
structure is basically

  if (it->what == IT_CHARACTER)
    {
      ...
      /* handles character glyphs */
    }
  else if (it->what == IT_COMPOSITION && it->cmp_it.ch < 0)
    {
      ...
      /* A static composition. */
    }
  else if (it->what == IT_COMPOSITION)
    {
      /* A dynamic (automatic) composition. */
    }
  [...]

You made changes only in the "static compositions" part. That code
handles compositions created by compose-region. The "modern" way of
composing text in Emacs uses automatic compositions, which are
controlled by data in composition-function-table. This is where we
call the shaping engine to produce the glyphs according to rules
stored in the font.
I don't see in your patch any changes that affect ligatures created by
automatic compositions; did I miss something?

If you use the automatic compositions route, then the information you
need, i.e. the number of clusters in the shaped text and the overall
width of the ligature, is already produced by the shaper and stored in
the "gstring" object in the composition table, see the description of
that object in the doc string of composition-get-gstring. So there
should be no need to invoke the shaper inside gui_produce_glyphs and
elsewhere. (If we want to use the carets information from the font, we
will probably need to extend the gstring object to store that as well,
and extend the shape method to extract this information when
available.)

> > Also, I don't see the code which moves point inside
> > the ligature; Emacs will not allow doing that by default. In
> > particular, how did you tell the display code to show the cursor on
> > the middle 'f', not on the first one? Did I miss something?
>
> I produce three "struct glyph"s for "ffi": each has width one third of
> the actual font glyph, and stores, in convoluted form, information
> about which slice of the font glyph is to be actually drawn.

Ah, okay, I missed that. But producing 3 glyphs instead of just one is
not necessarily the best idea, I think. As you point out, one problem
will be with splitting the ligature across lines. Another problem is
more expensive display. And we won't be able to display the ligature
as a single glyph, for those who want that, at least not easily.

> > And finally, you said you intended to do this via row->clip, but this
> > patch does something very different. What changed your mind?
>
> I was surprised this no longer seemed to be strictly necessary: as far
> as the display code is concerned, we're dealing with three separate
> glyphs with overhang areas, and those are already handled by the
> cursor-drawing code.

Yes.
But if we return to a single glyph, then we'd need to do some clipping.

> On the other hand, it deals with kerning as well as ligatures.

You mean, kerning of simple characters, for which we don't produce
ligatures? Or kerning within ligatures? If the latter, then I don't
see why we'd need that: font designers already design the ligatures to
have the optimal kerning, no?
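
For illustration only, here is a minimal sketch of the equal-share
fallback mentioned earlier in this message, i.e. giving every codepoint
the same share of the ligature's width. It assumes left-to-right text
and integer pixel widths. The names 'struct slice' and
'equal_share_slice' are hypothetical and do not exist in Emacs; the
sketch does not touch the gstring object or the display code. The same
horizontal extent could also serve as the clipping range for the cursor
if the ligature is kept as a single glyph.

```c
#include <assert.h>

/* Hypothetical type, not part of Emacs: the horizontal extent, in
   pixels, of one character's share of a ligature glyph.  */
struct slice
{
  int x;      /* offset of the slice from the glyph's left edge */
  int width;  /* width of the slice */
};

/* Return the slice for character INDEX (0-based) of a ligature that
   is WIDTH pixels wide and stands for NCHARS characters.  Every
   character gets an equal share; when WIDTH is not a multiple of
   NCHARS, the leftover pixels go to the leftmost slices, one each,
   so the slices always tile the glyph exactly.  Assumes
   left-to-right layout.  */
static struct slice
equal_share_slice (int width, int nchars, int index)
{
  assert (width >= 0 && nchars > 0);
  assert (index >= 0 && index < nchars);

  int base = width / nchars;   /* share every character receives */
  int extra = width % nchars;  /* leftover pixels to hand out */
  struct slice s;

  /* Slices to the left of INDEX that received one extra pixel.  */
  int wider_before = index < extra ? index : extra;

  s.x = index * base + wider_before;
  s.width = base + (index < extra ? 1 : 0);
  return s;
}
```

With a 20-pixel "ffi" ligature this yields slices of 7, 7 and 6 pixels
at offsets 0, 7 and 14. If the font does provide caret positions, those
would presumably replace the computed offsets.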