From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY))
Date: Tue, 26 May 2020 22:46:08 +0300
Message-ID: <83mu5utr7j.fsf@gnu.org>
To: Pip Cet
Cc: emacs-devel@gnu.org

> From: Pip Cet
> Date: Tue, 26 May 2020 18:13:55 +0000
> Cc: cpitclaudel@gmail.com, alan@idiocy.org, emacs-devel@gnu.org
>
> > Assuming that the alternative for selecting the "context" is found,
> > and composite.c is augmented to apply it instead of the regexps, why
> > not use the rest of the automatic composition code to produce the
> > glyphs and display them?
>
> I chose not to do that for a patch which I have stated repeatedly was
> not in any way a finalized design, and I don't see any good reason to
> do it for a real patch, either, so far.

Why not? How about trying to do that before giving up?

> (I'll be honest: I strongly suspect that the code is too slow, we know
> it to be buggy, and it's simply too different from what I actually
> want to benefit from sharing the code).
>
> > The code which does that exists and works,
>
> (I suspect: slowly)

Any measurements to back that up? E.g., is scrolling through
etc/HELLO especially slow, once all the fonts were loaded (i.e. each
character in the file was displayed at least once)?

> > and is tested by years of use.
>
> It's unusable for me in Emacs 26.3.

How so? What doesn't work? (And why are you using Emacs 26 and not
Emacs 27, where we support HarfBuzz and made several improvements and
bugfixes in the character composition area?)

> > It already solves the problems of look-ahead,
>
> If it does so efficiently, I'll certainly try reusing that code. But I
> strongly suspect it doesn't.

Why suspect? Why not try and see what does and doesn't work, what is
and isn't efficient?

> > of wrapping long lines,
>
> Very poorly, for my purposes.

How so? What doesn't wrap correctly, and why?

> > and others, including (but not limited to) the dreaded bidi thing.
>
> Looking for "bidi" in composite.c, the only relevant thing I see is a FIXME.

That's because you are looking in the wrong place. Once again, try
looking at etc/HELLO: there are portions of it that need both bidi and
compositions. I can explain how it works (the code is spread over
several files), but please believe me that it does. It passed the
scrutiny of the HarfBuzz developers, most of whom are native Arabic
and Farsi speakers and wouldn't allow us to display Arabic script
incorrectly.

The whole point of using the existing code is that you don't _need_ to
understand how exactly we handle the bidi reordering when character
compositions are required. It just works, for all you care. It did
take several iterations to get right at the time; why would you want
to repeat all that, when the code is there to use and extend?

> > Why reinvent that wheel when we already have it, and it works well?
>
> First, because it doesn't work that well for my purposes;

What doesn't work? Please be specific.

> second, precisely because it works well for the purposes of others,
> and I'd like to have as little impact as possible on existing use
> cases. They should just continue working, and so far they do.

You are thinking of breaking those other cases by your changes?
But we haven't yet established that changes are needed, let alone
which changes. How do you know you will break anything at all?

> > > Ligatures and kerning (right now, for LTR text). Is that a small
> > > problem because of the lack of RTL support?
> >
> > Yes, of course.
>
> Why?

Because the features you are talking about should "just work" in
Emacs. Not only for some use cases and some scripts -- that is not
how we develop features. Features that work only for some cases are
broken and will draw bug reports. They make Emacs look unclean and
unprofessional. And there's no need to add such half-broken features,
because code that supports a much broader class of use cases already
exists; you just need to use it and maybe extend and augment it a bit.

> The code shouldn't break horribly for RTL text (it doesn't).

It _will_ break for RTL text; you just haven't seen it yet because you
only tested it in simple use cases. UAX#9 defines a lot of optional
features, including multi-level directional overrides and embeddings;
it isn't just right-to-left vs left-to-right. Again, there's no need
for you to reinvent this wheel, we already have it figured out.

> > What's more, we already have the code which implements all
> > that, so I don't understand why you want to bypass it.
>
> We have something that superficially results in a similar screen
> layout to what I want, but that actually represents display elements
> in a way that makes them unusable for my purposes.

Then please describe what doesn't fit your purpose, and let's focus on
extending the existing code to do what's missing. Throwing everything
away and starting anew is not the right way; it's a huge waste of
energy and time to implement something that we already have. It is
also a maintenance burden in the long run.

Please note: I'm not talking about the regexp part -- that part you
will need to decide how to extend or augment anyway.
I'm telling you right here and now that blindly taking a fixed amount
of surrounding text will not be acceptable. You can either come up
with some smarter regexp (and you are wrong: the regexps in
composition-function-table do NOT have to match only fixed strings,
you can see that they don't in the part of the table we set up for the
Arabic script); or you can decide on something more complex, like a
function. Either way, the amount of text that this will pick up and
pass to the shaper should be reasonable and should be determined by
some understandable rules. And those rules must be controllable from
Lisp. But that is a separate part of the problem that you will need
to solve, and you will need to solve it whether or not you use
character compositions.

What I _am_ saying is that the rest of the machinery that implements
automatic compositions does exactly what you need: it calls the
shaper, handling LTR and RTL text as needed, then lays out the glyphs
the shaper returns in a way that handles all the usual stuff our users
expect, such as line wrapping and truncation. It is silly to
disregard that code, so please don't.