* Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY) @ 2020-05-17 10:41 Julius Pfrommer 2020-05-17 14:09 ` Arthur Miller 2020-05-17 14:35 ` Eli Zaretskii 0 siblings, 2 replies; 145+ messages in thread From: Julius Pfrommer @ 2020-05-17 10:41 UTC (permalink / raw) To: emacs-devel Hi all, during the recent discussion on "Emacs being too square", I recalled a few projects that use OpenGL for terminal emulators [1,2], with good performance, smooth scrolling and the possibility to add more visual *bling*. I had a good look at Emacs' code-base to see if similar approaches could be used. As you can imagine, I got lost in a forest of #ifdefs for different platforms and GUI toolkits. The code looks scary to touch. If you don't have access to *all supported platforms*, it is likely that changes break a platform you could not test locally. To make the code-base less scary, there should be more code-sharing across GUI platforms. And this is indeed possible! The GTK-based Emacs GUI can use Cairo for rendering. Cairo + FreeType + HarfBuzz (calling it CFH for simplicity) is available for the other supported platforms as well (besides pure TTY): - GNUstep [http://wiki.gnustep.org/index.php/Backend] - Raw Xlib [https://www.cairographics.org/Xlib/] - Windows+macOS [https://www.cairographics.org/download/] Big portions of the platform-specific GUI code could be unified based on the CFH libraries. Is a hard dependency on the CFH libraries imaginable? Maybe one of the platforms is a "low-hanging fruit" to get things going. As with every major refactoring, there should be a series of small steps in order to keep things stable. Thank you for the hard work put into this amazing piece of software! Regards, Julius [1] https://sw.kovidgoyal.net/kitty/ [2] https://github.com/alacritty/alacritty ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY) 2020-05-17 10:41 Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY) Julius Pfrommer @ 2020-05-17 14:09 ` Arthur Miller 2020-05-17 14:30 ` Eli Zaretskii 2020-05-17 14:35 ` Eli Zaretskii 1 sibling, 1 reply; 145+ messages in thread From: Arthur Miller @ 2020-05-17 14:09 UTC (permalink / raw) To: Julius Pfrommer; +Cc: emacs-devel Julius Pfrommer <julius.pfrommer@web.de> writes: > Hi all, > > during the recent discussion on "Emacs being too square", I recalled a > few projects that use OpenGL for terminal emulators [1,2]. With good > performance, smooth scrolling and the possibility to add more visual > *bling*. > > I had a good look at Emacs' code-base to see if similar approaches > could be used. As you can imagine, I got lost in a forest of #ifdef for > different platforms and GUI toolkits. The code looks scary to touch. If > you don't have access to *all supported platform*, it is likely that > changes break a platform you could not test locally. I have been looking into the same thing, some time ago and again recently, and I ran into the same problem: a forest of cases, all coded in the same place in giant files of 5K+ lines :-). > To make the code-base less scary, there should be more code-sharing > across GUI platforms. And this is indeed possible! Emacs and its src tree could definitely benefit from some modularization and refactoring. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY) 2020-05-17 14:09 ` Arthur Miller @ 2020-05-17 14:30 ` Eli Zaretskii 2020-05-17 15:06 ` Arthur Miller 0 siblings, 1 reply; 145+ messages in thread From: Eli Zaretskii @ 2020-05-17 14:30 UTC (permalink / raw) To: emacs-devel, Arthur Miller, Julius Pfrommer On May 17, 2020 5:09:08 PM GMT+03:00, Arthur Miller <arthur.miller@live.com> wrote: > Julius Pfrommer <julius.pfrommer@web.de> writes: > > > Hi all, > > > > during the recent discussion on "Emacs being too square", I recalled > a > > few projects that use OpenGL for terminal emulators [1,2]. With good > > performance, smooth scrolling and the possibility to add more visual > > *bling*. > > > > I had a good look at Emacs' code-base to see if similar approaches > > could be used. As you can imagine, I got lost in a forest of #ifdef > for > > different platforms and GUI toolkits. The code looks scary to touch. > If > > you don't have access to *all supported platform*, it is likely that > > changes break a platform you could not test locally. > > I have been looking into same, some time ago and recently, and I > experience same problem. A forest of cases, all coded into same place > in > giant files of 5K+ lines :-). > > > To make the code-base less scary, there should be more code-sharing > > across GUI platforms. And this is indeed possible! > Emacs and Emacs src could benefit of some modularization and > refactoring > definitely. I suggest to go through the archives and the Git logs to see how many such efforts have been made and are already in the codebase. It isn't like the advantages of this are unclear to the development team, or that nothing is being done in that direction. Far from it. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY) 2020-05-17 14:30 ` Eli Zaretskii @ 2020-05-17 15:06 ` Arthur Miller 2020-05-17 15:56 ` Eli Zaretskii 0 siblings, 1 reply; 145+ messages in thread From: Arthur Miller @ 2020-05-17 15:06 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Julius Pfrommer, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: > On May 17, 2020 5:09:08 PM GMT+03:00, Arthur Miller <arthur.miller@live.com> wrote: >> Julius Pfrommer <julius.pfrommer@web.de> writes: >> >> > Hi all, >> > >> > during the recent discussion on "Emacs being too square", I recalled >> a >> > few projects that use OpenGL for terminal emulators [1,2]. With good >> > performance, smooth scrolling and the possibility to add more visual >> > *bling*. >> > >> > I had a good look at Emacs' code-base to see if similar approaches >> > could be used. As you can imagine, I got lost in a forest of #ifdef >> for >> > different platforms and GUI toolkits. The code looks scary to touch. >> If >> > you don't have access to *all supported platform*, it is likely that >> > changes break a platform you could not test locally. >> >> I have been looking into same, some time ago and recently, and I >> experience same problem. A forest of cases, all coded into same place >> in >> giant files of 5K+ lines :-). >> >> > To make the code-base less scary, there should be more code-sharing >> > across GUI platforms. And this is indeed possible! >> Emacs and Emacs src could benefit of some modularization and >> refactoring >> definitely. > > I suggest to go through the archives and the Git logs to see how many such > efforts have been made and are already in the codebase. It isn't like the > advantages of this are unclear to the development team, or that nothing is being > done in that direction. Far from it. I understand that, and I am conscious that you devs are aware of it and that you would probably do something about it if it were less work than it probably is.
I believe you that it is not easy, considering the long history of Emacs. I am just reflecting on how I feel every time I peek into the sources. It feels like I am looking into the SQLite amalgamation :-). ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY) 2020-05-17 15:06 ` Arthur Miller @ 2020-05-17 15:56 ` Eli Zaretskii 2020-05-17 16:50 ` Arthur Miller 0 siblings, 1 reply; 145+ messages in thread From: Eli Zaretskii @ 2020-05-17 15:56 UTC (permalink / raw) To: Arthur Miller; +Cc: julius.pfrommer, emacs-devel > From: Arthur Miller <arthur.miller@live.com> > Cc: emacs-devel@gnu.org, Julius Pfrommer <julius.pfrommer@web.de> > Date: Sun, 17 May 2020 17:06:35 +0200 > > I am just reflecting over how I feel every time I peek > into sources. It feels like I am looking into sqlite amalgamation :-). It was worse just a year ago. It will be better a year from now. Patches are welcome. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY) 2020-05-17 15:56 ` Eli Zaretskii @ 2020-05-17 16:50 ` Arthur Miller 2020-05-17 17:06 ` Eli Zaretskii 0 siblings, 1 reply; 145+ messages in thread From: Arthur Miller @ 2020-05-17 16:50 UTC (permalink / raw) To: Eli Zaretskii; +Cc: julius.pfrommer, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: >> From: Arthur Miller <arthur.miller@live.com> >> Cc: emacs-devel@gnu.org, Julius Pfrommer <julius.pfrommer@web.de> >> Date: Sun, 17 May 2020 17:06:35 +0200 >> >> I am just reflecting over how I feel every time I peek >> into sources. It feels like I am looking into sqlite amalgamation :-). > > It was worse just a year ago. It will be better a year from now. > patches are welcome. Are there any guidelines if one would like to restructure something? For example, I have been looking at image.c a lot. I was playing with line drawing on an image the other day, and I would love not to have to look at the NS and GDI code while working with X11 and Cairo only. It is so easy to miss whether a single line is actually outside of some platform #ifdef. It is so messy, at least for a n00b like me :-). ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY) 2020-05-17 16:50 ` Arthur Miller @ 2020-05-17 17:06 ` Eli Zaretskii 0 siblings, 0 replies; 145+ messages in thread From: Eli Zaretskii @ 2020-05-17 17:06 UTC (permalink / raw) To: Arthur Miller; +Cc: julius.pfrommer, emacs-devel > From: Arthur Miller <arthur.miller@live.com> > Cc: emacs-devel@gnu.org, julius.pfrommer@web.de > Date: Sun, 17 May 2020 18:50:04 +0200 > > Are there any guidelines if one would like to restructure something? The guideline is to factor any GUI code into common part and platform-specific part, and define interfaces for the latter whose implementation is in the corresponding *term.[cm] or *fns.[cm]. ^ permalink raw reply [flat|nested] 145+ messages in thread
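The guideline above (a common part, plus an interface whose implementations live in per-platform files) can be illustrated with a small language-neutral model. This is only a sketch under stated assumptions: the names `TerminalHooks`, `RecordingBackend`, and `redisplay_line` are hypothetical and are not Emacs identifiers, and the real code is C and Objective-C, not Python.

```python
from abc import ABC, abstractmethod
from typing import List, Tuple


class TerminalHooks(ABC):
    """Platform-specific half (hypothetical): one subclass per window system."""

    @abstractmethod
    def clear_area(self, x: int, y: int, width: int, height: int) -> None:
        ...

    @abstractmethod
    def draw_glyphs(self, x: int, y: int, text: str) -> None:
        ...


class RecordingBackend(TerminalHooks):
    """Stand-in backend that only records calls, so the common code can be
    exercised without access to the real platform."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.calls: List[Tuple] = []

    def clear_area(self, x: int, y: int, width: int, height: int) -> None:
        self.calls.append(("clear_area", x, y, width, height))

    def draw_glyphs(self, x: int, y: int, text: str) -> None:
        self.calls.append(("draw_glyphs", x, y, text))


def redisplay_line(hooks: TerminalHooks, y: int, text: str,
                   line_width: int, line_height: int) -> None:
    """Common half: contains no platform conditionals, only interface calls."""
    hooks.clear_area(0, y, line_width, line_height)
    hooks.draw_glyphs(0, y, text)


if __name__ == "__main__":
    backend = RecordingBackend("x11")
    redisplay_line(backend, 16, "hello", 80, 16)
    for call in backend.calls:
        print(call)
```

The point of the split is that someone working on one backend only has to read that backend's implementation of the interface, while the shared caller stays free of platform conditionals.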
* Re: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY) 2020-05-17 10:41 Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY) Julius Pfrommer 2020-05-17 14:09 ` Arthur Miller @ 2020-05-17 14:35 ` Eli Zaretskii 2020-05-17 14:59 ` Julius Pfrommer 1 sibling, 1 reply; 145+ messages in thread From: Eli Zaretskii @ 2020-05-17 14:35 UTC (permalink / raw) To: emacs-devel, Julius Pfrommer On May 17, 2020 1:41:25 PM GMT+03:00, Julius Pfrommer <julius.pfrommer@web.de> wrote: > Hi all, > > during the recent discussion on "Emacs being too square", I recalled a > few projects that use OpenGL for terminal emulators [1,2]. With good > performance, smooth scrolling and the possibility to add more visual > *bling*. > > I had a good look at Emacs' code-base to see if similar approaches > could be used. As you can imagine, I got lost in a forest of #ifdef > for > different platforms and GUI toolkits. The code looks scary to touch. > If > you don't have access to *all supported platform*, it is likely that > changes break a platform you could not test locally. > > To make the code-base less scary, there should be more code-sharing > across GUI platforms. And this is indeed possible! > > The GTK-based Emacs GUI can use Cairo for rendering. Cairo + FreeType > + > HarfBuzz (calling it CFH for simplicity) is available for the other > supported platforms as well (besides pure TTY): > > - GnuSTEP [http://wiki.gnustep.org/index.php/Backend] > - Raw Xlib [https://www.cairographics.org/Xlib/] > - Windows+MacOS [https://www.cairographics.org/download/] > > Big portions of the platform-specific GUI code could be unified based > on > the CFH libraries. Is a hard dependency on the CFH libraries > imaginable? > > Maybe one of the platforms is a "low-hanging fruit" to get things > going. > As every major refactoring, there should be a series of small steps in > order to keep things stable. > > Thank you for the hard work put into this amazing piece of software! 
> > Regards, Julius > > [1] https://sw.kovidgoyal.net/kitty/ > [2] https://github.com/alacritty/alacritty Any work in this direction is and always has been welcome. The practical problem with that is that you need to have access to all the supported platforms to make sure the refactoring works. FWIW, I'm not sure I share your optimism regarding the Cairo way; I think it requires something from the system as well, so it might not be so easy. And the GUI toolkits are AFAIU a separate issue, not directly related to how we draw to the glass. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY) 2020-05-17 14:35 ` Eli Zaretskii @ 2020-05-17 14:59 ` Julius Pfrommer 2020-05-17 15:55 ` Eli Zaretskii 0 siblings, 1 reply; 145+ messages in thread From: Julius Pfrommer @ 2020-05-17 14:59 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel Eli, > Any work in this direction is and always has been welcome. The > practical problem with that is that you need to have access to all > the supported platforms to make sure the refactoring works. > > FWIW, I'm not sure I share your optimism regarding the Cairo way, I > think it requires something from the system as well, so it might be > not so easy. > > And the GUI toolkits are AFAIU a separate issue, not directly related > to how we draw to the glass. I am well aware of the effort to keep the many different platforms alive. Let me phrase the question differently: Would it be okay to have a hard dependency on the Cairo+FreeType+Harfbuzz (CFH) libraries, as they are available everywhere? It would be a pity to invest time into a direction that is infeasible from the outset. Even on Linux, this would unlock quite a few simplifications. I count at least three font handling "backends" here. Regards, Julius ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY) 2020-05-17 14:59 ` Julius Pfrommer @ 2020-05-17 15:55 ` Eli Zaretskii 2020-05-17 16:28 ` Pip Cet 2020-05-17 18:28 ` Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY) Julius Pfrommer 0 siblings, 2 replies; 145+ messages in thread From: Eli Zaretskii @ 2020-05-17 15:55 UTC (permalink / raw) To: Julius Pfrommer; +Cc: emacs-devel > Date: Sun, 17 May 2020 16:59:53 +0200 > From: Julius Pfrommer <julius.pfrommer@web.de> > Cc: emacs-devel@gnu.org > > Let me phrase the question differently: Would it be okay to have a hard > dependency on the Cairo+FreeType+Harfbuzz (CFH) libraries, as they are > available everywhere? First, we need to establish that this is a solution, and for what problem(s). It is important to realize that the GUI backends we use handle much more than just drawing text, they need to be able to display GUI widgets, frame and window decorations (menu bar, tool bar, scroll bars, the frame title, etc.), and much more. Is the configuration you propose capable of doing all that? I don't think the answer will be full and definitive until "Someone" walks through all the APIs we implement in x/w32/ns/fns.c and x/w32/ns/term.c, and makes sure they all can be covered. Next, please be aware that we already made the decision to use HarfBuzz as our main text-shaping engine. X and w32 already use it; for NS someone has to write the code (and they are not very likely to do so because macOS users consider the native text shaping more feature-rich). Dropping the other font backends is a matter of time, but it could take a long time. In any case, the font backend is not the main issue here; in particular, the likes of FreeType are hardly even seen except on very low level of the code. It's the other aspects of GUI code that bothers me much more. > Even on Linux, this would unlock quite a few simplifications. I count > at least three font handling "backends" here. 
Down to 2 and one deprecated one on master. But again, font backends are a relatively easy problem, and they are being dealt with. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY) 2020-05-17 15:55 ` Eli Zaretskii @ 2020-05-17 16:28 ` Pip Cet 2020-05-17 17:00 ` Eli Zaretskii 2020-05-17 18:28 ` Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY) Julius Pfrommer 1 sibling, 1 reply; 145+ messages in thread From: Pip Cet @ 2020-05-17 16:28 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel, Julius Pfrommer On Sun, May 17, 2020 at 3:56 PM Eli Zaretskii <eliz@gnu.org> wrote: > > Date: Sun, 17 May 2020 16:59:53 +0200 > > From: Julius Pfrommer <julius.pfrommer@web.de> > > Cc: emacs-devel@gnu.org > Next, please be aware that we already made the decision to use > HarfBuzz as our main text-shaping engine. That's a decision that, having just played with HarfBuzz, I find puzzling. It appears to have no practical support for treating ligatures as anything but monolithic glyphs: is there a documented way of getting HarfBuzz to tell you which part of the "ffi" ligature is the middle "f"? I suspect the answer is "there are some languages whose scripts don't allow for the equivalent operation, so we won't support it at all, as a matter of principle". I'm not sure PangoCairo does better, but whatever Libreoffice uses appears to get the job done, so at least one display engine out there solves this problem. (This is assuming we want kerning, ligatures, and subpixel rendering for English text. "Real" text shaping, composition, reordrant glyphs, and bidi concerns are something that I can't really comment on, beyond admitting that, of course, supporting the world's major languages at all is more important than supporting English with the typographic finesse we currently lack). Years ago, I ran a WebAssembly version of Emacs in a web browser. (Back then, I used a terminal emulator written in JavaScript.) I'd certainly like to do that again some day, and I think a hard dependency on Cairo and FreeType would make that even harder. 
^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY) 2020-05-17 16:28 ` Pip Cet @ 2020-05-17 17:00 ` Eli Zaretskii 2020-05-17 18:50 ` Pip Cet 0 siblings, 1 reply; 145+ messages in thread From: Eli Zaretskii @ 2020-05-17 17:00 UTC (permalink / raw) To: Pip Cet; +Cc: emacs-devel, julius.pfrommer > From: Pip Cet <pipcet@gmail.com> > Date: Sun, 17 May 2020 16:28:30 +0000 > Cc: Julius Pfrommer <julius.pfrommer@web.de>, emacs-devel@gnu.org > > On Sun, May 17, 2020 at 3:56 PM Eli Zaretskii <eliz@gnu.org> wrote: > > > Date: Sun, 17 May 2020 16:59:53 +0200 > > > From: Julius Pfrommer <julius.pfrommer@web.de> > > > Cc: emacs-devel@gnu.org > > Next, please be aware that we already made the decision to use > > HarfBuzz as our main text-shaping engine. > > That's a decision that, having just played with HarfBuzz, I find > puzzling. It appears to have no practical support for treating > ligatures as anything but monolithic glyphs: is there a documented way > of getting HarfBuzz to tell you which part of the "ffi" ligature is > the middle "f"? You are accusing HarfBuzz of crimes it didn't commit ;-) What you see is not produced by HarfBuzz, it's produced by Emacs. HarfBuzz (and any other text-shaping engine we ever used) has a very simple job: Emacs hands it a string of codepoints, and HarfBuzz returns a series of font glyphs to be used to display that string. That's all. All the rest is the Emacs display engine. And yes, the current design is that a ligature (like any other "grapheme cluster" produced by character composition) is a single "display element": you move across all of it with a single C-f/C-b. The only exception to this rule is that we allow DEL (but not C-d or Delete) to erase individual codepoints going back from the end of the grapheme cluster -- to facilitate editing ligatures and other composed characters. This is the minimum "editing" capability that the user must have, and I don't think I've heard complaints that it wasn't enough. 
But if required, we could easily add special forward and backward movements that could "enter" the composed character, we just need to figure out how to display the result in order to give the user some visual feedback. (Without visual feedback, I think you can have it today if you customize global-disable-point-adjustment to a non-nil value.) In any case, the question "which part of the ligature corresponds to some codepoint" is meaningless in the context of ligation and complex text shaping: a sequence of N codepoints in general produces M font glyphs, where M can be smaller, equal, or greater than N. The relation between the N codepoints and M glyphs is many-to-many. > I'm not sure PangoCairo does better, but whatever Libreoffice uses > appears to get the job done What job is that? > (This is assuming we want kerning, ligatures, and subpixel rendering > for English text. "Real" text shaping, composition, reordrant glyphs, > and bidi concerns are something that I can't really comment on, beyond > admitting that, of course, supporting the world's major languages at > all is more important than supporting English with the typographic > finesse we currently lack). The truth is that "we" the Emacs project don't want to know anything about ligatures, we want to delegate that job to the shaper. That's the shaper's job, and HarfBuzz does its job very well and stays on top of the relevant technological advances. > Years ago, I ran a WebAssembly version of Emacs in a web browser. > (Back then, I used a terminal emulator written in JavaScript.) I'd > certainly like to do that again some day, and I think a hard > dependency on Cairo and FreeType would make that even harder. I think there's some measure of confusion here: AFAIR we don't use Cairo for text shaping, only for its display. IOW, we tell Cairo to display this and that glyphs, after HarfBuzz returned them. ^ permalink raw reply [flat|nested] 145+ messages in thread
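The N-codepoints-to-M-glyphs relation described above can be made concrete with a toy model. This is a sketch, not HarfBuzz itself: the glyph lists below are written by hand for illustration and are not real shaper output, and `ShapedGlyph` and `group_by_cluster` are hypothetical names. The one grounded assumption is that each returned glyph carries a cluster value identifying the input position it came from, so that glyphs sharing a value belong to the same cluster.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ShapedGlyph:
    name: str     # glyph name in the font (illustrative only)
    cluster: int  # index of the first codepoint of the cluster it belongs to


def group_by_cluster(text: str,
                     glyphs: List[ShapedGlyph]) -> List[Tuple[str, List[str]]]:
    """Return (codepoints, glyph names) pairs, one per cluster, in text order.

    A cluster runs from its own start index up to the start of the next one.
    """
    starts = sorted({g.cluster for g in glyphs})
    result = []
    for i, start in enumerate(starts):
        end = starts[i + 1] if i + 1 < len(starts) else len(text)
        names = [g.name for g in glyphs if g.cluster == start]
        result.append((text[start:end], names))
    return result


if __name__ == "__main__":
    # 8 codepoints shaped into 6 glyphs: "ffi" becomes one ligature glyph.
    official = [ShapedGlyph("o", 0), ShapedGlyph("f_f_i", 1),
                ShapedGlyph("c", 4), ShapedGlyph("i", 5),
                ShapedGlyph("a", 6), ShapedGlyph("l", 7)]
    print(group_by_cluster("official", official))

    # 1 codepoint shaped into 2 glyphs: a base letter plus a combining mark.
    accented = [ShapedGlyph("e", 0), ShapedGlyph("acutecomb", 0)]
    print(group_by_cluster("\u00e9", accented))
```

In both cases the cluster is the smallest unit for which the mapping between codepoints and glyphs is known, which is why a single ligature glyph behaves as one display element.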
* Re: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY) 2020-05-17 17:00 ` Eli Zaretskii @ 2020-05-17 18:50 ` Pip Cet 2020-05-17 19:17 ` Eli Zaretskii 0 siblings, 1 reply; 145+ messages in thread From: Pip Cet @ 2020-05-17 18:50 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel, julius.pfrommer On Sun, May 17, 2020 at 5:00 PM Eli Zaretskii <eliz@gnu.org> wrote: > > From: Pip Cet <pipcet@gmail.com> > > Date: Sun, 17 May 2020 16:28:30 +0000 > > Cc: Julius Pfrommer <julius.pfrommer@web.de>, emacs-devel@gnu.org > > > > On Sun, May 17, 2020 at 3:56 PM Eli Zaretskii <eliz@gnu.org> wrote: > > > > Date: Sun, 17 May 2020 16:59:53 +0200 > > > > From: Julius Pfrommer <julius.pfrommer@web.de> > > > > Cc: emacs-devel@gnu.org > > > Next, please be aware that we already made the decision to use > > > HarfBuzz as our main text-shaping engine. > > > > That's a decision that, having just played with HarfBuzz, I find > > puzzling. It appears to have no practical support for treating > > ligatures as anything but monolithic glyphs: is there a documented way > > of getting HarfBuzz to tell you which part of the "ffi" ligature is > > the middle "f"? > > You are accusing HarfBuzz of crimes it didn't commit ;-) What you see > is not produced by HarfBuzz, it's produced by Emacs. I don't think that's true. > HarfBuzz (and any other text-shaping engine we ever used) has a very > simple job: Emacs hands it a string of codepoints, and HarfBuzz > returns a series of font glyphs to be used to display that string. > That's all. All the rest is the Emacs display engine. HarfBuzz also tells you which codepoints are used for which glyphs. It should also, for languages where it can do so, tell you which codepoints are used for which subglyphs. It fails to do the latter. (I'm aware of what the Emacs display engine does; I'm, obviously, not accusing HarfBuzz of failing to present ligatures, because that's easily fixable. 
What isn't easily fixable is going back from the ligature glyph to its subglyphs. LibreOffice does it, and I wonder how, because the alternative is jumping back and forth between ligatures and individual characters depending on where PT is, and that looks horrible.) > And yes, the current design is that a ligature (like any other > "grapheme cluster" produced by character composition) is a single > "display element": you move across all of it with a single C-f/C-b. I'm using a different design :-) That one is simply unworkable for English and its limited traditional set of ligatures. > In any case, the question "which part of the ligature corresponds to > some codepoint" is meaningless in the context of ligation and complex > text shaping: No, it's not. It's meaningless for some languages, but not for English and its limited set of traditional ligatures. That a problem cannot be solved in general is no excuse to refuse to solve it in the specific cases where it can be. > > I'm not sure PangoCairo does better, but whatever Libreoffice uses > > appears to get the job done > > What job is that? LibreOffice highlights sub-glyphs of ligatures correctly. I enter "official", and it renders <o> <ffi> <c> <i> <a> <l>. I move the cursor right twice, and it highlights precisely what it should, the middle "f" of the ligature glyph. > > (This is assuming we want kerning, ligatures, and subpixel rendering > > for English text. "Real" text shaping, composition, reordrant glyphs, > > and bidi concerns are something that I can't really comment on, beyond > > admitting that, of course, supporting the world's major languages at > > all is more important than supporting English with the typographic > > finesse we currently lack). > > The truth is that "we" the Emacs project don't want to know anything > about ligatures, we want to delegate that job to the shaper. I don't see how that's true. Treating a ligature as a single character for entry purposes is simply unworkable for English. 
It might be okay for other languages, but for English, we really need to display "ffi" correctly and still allow it to be edited as three characters. > That's > the shaper's job, and HarfBuzz does its job very well and stays on top > of the relevant technological advances. I don't see any evidence for that positive statement about HarfBuzz: out of the box, Emacs fails miserably to do anything about English ligatures. Trying to find a way to fix it, I ran into HarfBuzz limitations that appear to make it impossible to use it to deal with English ligatures. It might deal very well with other languages and their ligatures, but for English text, it fails to do what TeX did since its inception. > > Years ago, I ran a WebAssembly version of Emacs in a web browser. > > (Back then, I used a terminal emulator written in JavaScript.) I'd > > certainly like to do that again some day, and I think a hard > > dependency on Cairo and FreeType would make that even harder. > > I think there's some measure of confusion here: AFAIR we don't use > Cairo for text shaping, only for its display. IOW, we tell Cairo to > display this and that glyphs, after HarfBuzz returned them. Yes, that's correct. Which means that a WebAssembly version of Emacs would need to bundle Cairo, even though it would prefer to simply render things in the browser using HTML 5 canvases or something similar. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY) 2020-05-17 18:50 ` Pip Cet @ 2020-05-17 19:17 ` Eli Zaretskii 2020-05-18 16:08 ` Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) Eli Zaretskii 0 siblings, 1 reply; 145+ messages in thread From: Eli Zaretskii @ 2020-05-17 19:17 UTC (permalink / raw) To: Pip Cet; +Cc: emacs-devel, julius.pfrommer > From: Pip Cet <pipcet@gmail.com> > Date: Sun, 17 May 2020 18:50:19 +0000 > Cc: julius.pfrommer@web.de, emacs-devel@gnu.org > > HarfBuzz also tells you which codepoints are used for which glyphs. It > should also, for languages where it can do so, tell you which > codepoints are used for which subglyphs. It fails to do the latter. No, it doesn't fail. You can see what it tells us in the display of the composition produced by "C-u C-x =". > That one is simply unworkable for English and its limited traditional > set of ligatures. The main reason we want ligatures in Emacs is for displaying program source. Latin ligatures are not the main reason. But I see no reason we couldn't do what you want, it's just the question of someone who'd need to write the code. The information is there. > LibreOffice highlights sub-glyphs of ligatures correctly. I enter > "official", and it renders <o> <ffi> <c> <i> <a> <l>. I move the > cursor right twice, and it highlights precisely what it should, the > middle "f" of the ligature glyph. We can do that in Emacs as well. The information is there, we just need to use it. For Latin ligatures that information will allow the display you describe. Doing that for other scripts would be harder, and the results will be less one-to-one. > > The truth is that "we" the Emacs project don't want to know anything > > about ligatures, we want to delegate that job to the shaper. > > I don't see how that's true. Treating a ligature as a single character > for entry purposes is simply unworkable for English. 
I didn't say we must treat ligatures as a single character, I just said we do that now. But that has nothing to do with the fact that we want all the information about the ligature to come from the shaper. > out of the box, Emacs fails miserably to do anything about English > ligatures. Trying to find a way to fix it, I ran into HarfBuzz > limitations that appear to make it impossible to use it to deal with > English ligatures. It might deal very well with other languages and > their ligatures, but for English text, it fails to do what TeX did > since its inception. I don't think this is right, but since you haven't shown any code, or what you tried to do, or which HarfBuzz limitations you allude to, it is hard to be more specific. I can only suggest, again, to look at the output of "C-u C-x =" -- that information comes directly from HarfBuzz. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-17 19:17 ` Eli Zaretskii @ 2020-05-18 16:08 ` Eli Zaretskii 2020-05-18 16:45 ` tomas ` (3 more replies) 0 siblings, 4 replies; 145+ messages in thread From: Eli Zaretskii @ 2020-05-18 16:08 UTC (permalink / raw) To: pipcet; +Cc: emacs-devel > Date: Sun, 17 May 2020 22:17:17 +0300 > From: Eli Zaretskii <eliz@gnu.org> > Cc: emacs-devel@gnu.org, julius.pfrommer@web.de > > > LibreOffice highlights sub-glyphs of ligatures correctly. I enter > > "official", and it renders <o> <ffi> <c> <i> <a> <l>. I move the > > cursor right twice, and it highlights precisely what it should, the > > middle "f" of the ligature glyph. > > We can do that in Emacs as well. The information is there, we just > need to use it. For Latin ligatures that information will allow the > display you describe. Doing that for other scripts would be harder, > and the results will be less one-to-one. On second thought, I think I misunderstood you. If the font that is used shows "ffi" as a _single_ glyph ﬃ, and LibreOffice indeed highlights parts of this glyph, then I'd like to know how it does that, and how far does this capability extend. I mean, what does it do with ligatures like ae, displayed as æ -- does it highlight the common vertical stroke for both parts? And what about "st", displayed as ﬆ -- this has a curved "hand" connecting s and t -- to which of the 2 does it belong for the purposes of highlighting? There's also "hv" displayed as ƕ, let alone "fs" displayed as ẞ and "fz" displayed as ß. IOW, I really don't think I understand how this could work even for what you call "English ligatures". Do you know how they do it? The information I said we get from HarfBuzz is returned when HarfBuzz produces a grapheme cluster from several font glyphs. 
When the result is a single font glyph, that information just says which of the original codepoints are to be displayed as that single glyph; it doesn't provide any sub-glyph information. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-18 16:08 ` Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) Eli Zaretskii @ 2020-05-18 16:45 ` tomas 2020-05-18 16:49 ` Eli Zaretskii 2020-05-18 17:05 ` Ligatures Stefan Monnier ` (2 subsequent siblings) 3 siblings, 1 reply; 145+ messages in thread From: tomas @ 2020-05-18 16:45 UTC (permalink / raw) To: Eli Zaretskii; +Cc: pipcet, emacs-devel [-- Attachment #1: Type: text/plain, Size: 866 bytes --] On Mon, May 18, 2020 at 07:08:45PM +0300, Eli Zaretskii wrote: > > Date: Sun, 17 May 2020 22:17:17 +0300 > > From: Eli Zaretskii <eliz@gnu.org> > > Cc: emacs-devel@gnu.org, julius.pfrommer@web.de > > > > > LibreOffice highlights sub-glyphs of ligatures correctly. I enter > > > "official", and it renders <o> <ffi> <c> <i> <a> <l>. I move the > > > cursor right twice, and it highlights precisely what it should, the > > > middle "f" of the ligature glyph. [...] > On second thought, I think I misunderstood you. If the font that is > used shows "ffi" as a _single_ glyph ffi, and LibreOffice indeed > highlights parts of this glyph, then I'd like to know how it does > that [...] Didn't work for me [1]. It treated the whole ligature as one "character". Cheers [1] LibreOffice 6.1.5.2 10(Build:2), Debian GNU/Linux (buster). -- tomás [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-18 16:45 ` tomas @ 2020-05-18 16:49 ` Eli Zaretskii 0 siblings, 0 replies; 145+ messages in thread From: Eli Zaretskii @ 2020-05-18 16:49 UTC (permalink / raw) To: tomas; +Cc: pipcet, emacs-devel > Date: Mon, 18 May 2020 18:45:43 +0200 > Cc: pipcet@gmail.com, emacs-devel@gnu.org > From: <tomas@tuxteam.de> > > > On second thought, I think I misunderstood you. If the font that is > > used shows "ffi" as a _single_ glyph ffi, and LibreOffice indeed > > highlights parts of this glyph, then I'd like to know how it does > > that [...] > > Didn't work for me [1]. It treated the whole ligature as one "character". That's what I'd expect. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-18 16:08 ` Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) Eli Zaretskii 2020-05-18 16:45 ` tomas @ 2020-05-18 17:05 ` Stefan Monnier 2020-05-18 17:18 ` Ligatures Eli Zaretskii 2020-05-18 17:24 ` Ligatures tomas 2020-05-18 17:31 ` Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) Clément Pit-Claudel 2020-05-19 5:43 ` Ligatures ASSI 3 siblings, 2 replies; 145+ messages in thread From: Stefan Monnier @ 2020-05-18 17:05 UTC (permalink / raw) To: Eli Zaretskii; +Cc: pipcet, emacs-devel [ I know nothing about the underlying APIs and such, so speaking here only as a random user. ] > On second thought, I think I misunderstood you. If the font that is > used shows "ffi" as a _single_ glyph ffi, and LibreOffice indeed > highlights parts of this glyph, then I'd like to know how it does > that, and how far does this capability extend. I mean, what does it > do with ligatures like ae, displayed as æ -- does it highlight the > common vertical stroke for both parts? And what about "st", displayed > as st -- this has a curved "hand" connecting s and t -- to which of the > 2 does it belong for the purposes of highlighting? As a mere user I wouldn't care very much about this detail: I'd just want the cursor to have 2 different positions depending on whether I'm on the "s" or on the "t", and hopefully those two positions are sufficiently self-evident that I don't have to read a manual to understand which is which. So, maybe we don't need very much info: all we need is a boolean which tells us whether the glyph should be treated atomically or not. When not treating it atomically, we would (somewhat arbitrarily) divide the glyph horizontally into N equal sized "subglyphs" and draw the cursor on the corresponding subglyph. If Harfbuzz could tell us more precisely how to divide the glyph into subglyphs, we could do a better job, of course. 
Stefan ^ permalink raw reply [flat|nested] 145+ messages in thread
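The equal-width subdivision suggested in the message above can be sketched in a few lines of Emacs Lisp. This is only an illustration under stated assumptions: `my-subglyph-span` is a hypothetical helper, not an Emacs or HarfBuzz function, and it assumes a purely horizontal ligature whose total advance width in pixels is already known.

```elisp
;; Hypothetical helper (not part of Emacs): split a ligature glyph that
;; stands for N characters into N equal horizontal slices.
(defun my-subglyph-span (width n index)
  "Return (START . END) pixel offsets of slice INDEX (0-based) out of N.
WIDTH is the advance width of the whole ligature glyph, in pixels."
  (when (or (< n 1) (< index 0) (>= index n))
    (error "Slice %d out of range for %d characters" index n))
  ;; Integer arithmetic: multiplying before dividing makes the end of one
  ;; slice equal the start of the next, and the last slice end at WIDTH.
  (cons (/ (* width index) n)
        (/ (* width (1+ index)) n)))
```

For a 30-pixel "ffi" glyph, `(my-subglyph-span 30 3 1)` evaluates to `(10 . 20)`, the slice that would be highlighted for the middle "f". Clusters composed with 2D offsets are outside what this arithmetic can express.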
* Re: Ligatures 2020-05-18 17:05 ` Ligatures Stefan Monnier @ 2020-05-18 17:18 ` Eli Zaretskii 2020-05-18 19:19 ` Ligatures Pip Cet 2020-05-18 17:24 ` Ligatures tomas 1 sibling, 1 reply; 145+ messages in thread From: Eli Zaretskii @ 2020-05-18 17:18 UTC (permalink / raw) To: Stefan Monnier; +Cc: pipcet, emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: pipcet@gmail.com, emacs-devel@gnu.org > Date: Mon, 18 May 2020 13:05:53 -0400 > > So, maybe we don't need very much info: all we need is a boolean which > tells us whether the glyph should be treated atomically or not. > When not treating it atomically, we would (somewhat arbitrarily) divide > the glyph horizontally into N equal sized "subglyphs" and draw the > cursor on the corresponding subglyph. That strikes me as not a very user-friendly UX. Especially if you keep in mind that glyphs can be composed into a grapheme cluster using 2D offsets, not just left-right one-dimensional offsets. An alternative which might be nicer is to "split" the composition: display it as if a ZWNJ character was inserted at point. Thus, moving forward one buffer position into the ffi would show f followed by a thin bar cursor followed by the fi; moving forward one more buffer position would show ff followed by a thin bar cursor followed by i. Etc. > If Harfbuzz could tell us more precisely how to divide the glyph into > subglyphs, we could do a better job, of course. I don't think it's possible because AFAIK fonts don't store this information. It should be possible, of course, to have a private database of such offsets, but I don't really see how it could work in general. Maybe I'm missing something, though. If someone wants to have a definitive answer, I suggest to ask on the HarfBuzz mailing list. ^ permalink raw reply [flat|nested] 145+ messages in thread
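As a string-level illustration of the "split the composition" idea in the message above, the following sketch inserts U+200C ZERO WIDTH NON-JOINER at the offset where point would be. It is not how the display engine would do it: the proposal is to display the text as if a ZWNJ were there, without changing the buffer. The names `my-zwnj` and `my-split-cluster-at` are hypothetical, and whether a ZWNJ actually suppresses a given ligature depends on the font and the shaper.

```elisp
;; Hypothetical names; this only manipulates a string, it does not touch
;; the display engine or any buffer.
(defconst my-zwnj (string #x200C)
  "ZERO WIDTH NON-JOINER (U+200C) as a one-character string.")

(defun my-split-cluster-at (text offset)
  "Return TEXT with a ZWNJ inserted after its first OFFSET characters.
If OFFSET is at or beyond either end of TEXT, return TEXT unchanged,
because no cluster is being split there."
  (if (or (<= offset 0) (>= offset (length text)))
      text
    (concat (substring text 0 offset)
            my-zwnj
            (substring text offset))))
```

With point one position into "ffi", `(my-split-cluster-at "ffi" 1)` yields "f", the ZWNJ, then "fi", which is the text a shaper would then be asked to render.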
* Re: Ligatures 2020-05-18 17:18 ` Ligatures Eli Zaretskii @ 2020-05-18 19:19 ` Pip Cet 2020-05-18 19:25 ` Ligatures tomas ` (2 more replies) 0 siblings, 3 replies; 145+ messages in thread From: Pip Cet @ 2020-05-18 19:19 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Stefan Monnier, emacs-devel On Mon, May 18, 2020 at 5:18 PM Eli Zaretskii <eliz@gnu.org> wrote: > > So, maybe we don't need very much info: all we need is a boolean which > > tells us whether the glyph should be treated atomically or not. > > When not treating it atomically, we would (somewhat arbitrarily) divide > > the glyph horizontally into N equal sized "subglyphs" and draw the > > cursor on the corresponding subglyph. > > That strikes me as not a very user-friendly UX. Especially if you > keep in mind that glyphs can be composed into a grapheme cluster using > 2D offsets, not just left-right one-dimensional offsets. So such clusters would be marked as atomic? I like Stefan's proposal, and maybe it's what LibreOffice actually does: at large font sizes, the horizontal division of "subglyphs" seems off. > An alternative which might be nicer is to "split" the composition: > display it as if a ZWNJ character was inserted at point. Thus, moving > forward one buffer position into the ffi would show f followed by a thin bar > cursor followed by the fi; moving forward one more buffer position > would show ff followed by a thin bar cursor followed by i. Etc. I tried something like that (with a variable-pitch font), and the effect is nauseating because the rest of the line shifts as the width of the word at point changes. What I tried was to use Harfbuzz to shape entire words when PT is not in them, then split them up into individual characters (the way it's done now) when PT enters them. Of course, people might still like it. > > If Harfbuzz could tell us more precisely how to divide the glyph into > > subglyphs, we could do a better job, of course. 
> > I don't think it's possible because AFAIK fonts don't store this > information. Well, they should! > It should be possible, of course, to have a private > database of such offsets, but I don't really see how it could work in > general. And this is where it gets back to "let's not hardcode the dependency on Harfbuzz and FreeType, because other backends might actually give us the information we need". ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-18 19:19 ` Ligatures Pip Cet @ 2020-05-18 19:25 ` tomas 2020-05-18 19:41 ` Ligatures Pip Cet 2020-05-18 19:33 ` Ligatures Eli Zaretskii 2020-05-18 19:38 ` Ligatures Clément Pit-Claudel 2 siblings, 1 reply; 145+ messages in thread From: tomas @ 2020-05-18 19:25 UTC (permalink / raw) To: emacs-devel [-- Attachment #1: Type: text/plain, Size: 406 bytes --] On Mon, May 18, 2020 at 07:19:19PM +0000, Pip Cet wrote: [...] > And this is where it gets back to "let's not hardcode the dependency > on Harfbuzz and FreeType, because other backends might actually give > us the information we need". But how should a backend guess where the subparts of a cluster are without the font providing it? And in the latter case, HarfBuzz does give us the info. Cheers -- t [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-18 19:25 ` Ligatures tomas @ 2020-05-18 19:41 ` Pip Cet 2020-05-18 20:20 ` Ligatures tomas 0 siblings, 1 reply; 145+ messages in thread From: Pip Cet @ 2020-05-18 19:41 UTC (permalink / raw) To: tomas; +Cc: emacs-devel On Mon, May 18, 2020 at 7:27 PM <tomas@tuxteam.de> wrote: > > And this is where it gets back to "let's not hardcode the dependency > > on Harfbuzz and FreeType, because other backends might actually give > > us the information we need". > > But how should a backend guess where the subparts of a cluster are > without the font providing it? Well, of course it shouldn't. It should return the information that is available, and then we can decide, based on a user setting, what we want to do about it: the options are, at least, to treat the ligature as atomic (the right thing to do for ligatures like %, &, and ß), guess (possibly the right thing to do for ffi?), or refuse to use the ligature in question and fall back to individual characters (which isn't always possible, but it is what we do right now for ASCII ligatures). > And in the latter case, HarfBuzz > does give us the info. How so? I honestly don't think it does, because it would treat the ligature as one glyph. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-18 19:41 ` Ligatures Pip Cet @ 2020-05-18 20:20 ` tomas 0 siblings, 0 replies; 145+ messages in thread From: tomas @ 2020-05-18 20:20 UTC (permalink / raw) To: Pip Cet; +Cc: emacs-devel [-- Attachment #1: Type: text/plain, Size: 603 bytes --] On Mon, May 18, 2020 at 07:41:19PM +0000, Pip Cet wrote: > On Mon, May 18, 2020 at 7:27 PM <tomas@tuxteam.de> wrote: [...] > > But how should a backend guess where the subparts of a cluster are > > without the font providing it? > > Well, of course it shouldn't. It should return the information that is > available [...] > > And in the latter case, HarfBuzz > > does give us the info. > > How so? I honestly don't think it does, because it would treat the > ligature as one glyph. Eli and Clément already looked it up for us: hb_ot_layout_get_ligature_carets() Cheers -- t [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 145+ messages in thread
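A sketch of how caret data, where a font provides it, might be combined with the equal-division guess discussed earlier in the thread. It assumes the caret offsets have already been obtained somehow (for example from the HarfBuzz function named above) and are passed in as a plain list; no Emacs Lisp binding for that function is assumed to exist, and `my-ligature-boundaries` is a hypothetical name.

```elisp
;; Hypothetical helper: choose the interior cursor positions of a ligature.
(defun my-ligature-boundaries (width n carets)
  "Return the N-1 interior cursor offsets for a ligature of N characters.
WIDTH is the glyph's advance width.  CARETS is a list of offsets taken
from the font, or nil if the font has none.  Font carets are used only
when there are exactly N-1 of them; otherwise WIDTH is divided into N
equal parts as a guess."
  (if (and carets (= (length carets) (1- n)))
      ;; Trust the font, but return the offsets in ascending order.
      (sort (copy-sequence carets) #'<)
    ;; Fallback: evenly spaced boundaries at WIDTH*1/N, WIDTH*2/N, ...
    (let (result)
      (dotimes (i (1- n))
        (push (/ (* width (1+ i)) n) result))
      (nreverse result))))
```

Whether to use the guess at all, or to treat the ligature as atomic when the font gives no carets, would remain a user-level decision as described in the messages above.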
* Re: Ligatures 2020-05-18 19:19 ` Ligatures Pip Cet 2020-05-18 19:25 ` Ligatures tomas @ 2020-05-18 19:33 ` Eli Zaretskii 2020-05-18 19:44 ` Ligatures Clément Pit-Claudel 2020-05-18 19:38 ` Ligatures Clément Pit-Claudel 2 siblings, 1 reply; 145+ messages in thread From: Eli Zaretskii @ 2020-05-18 19:33 UTC (permalink / raw) To: Pip Cet; +Cc: monnier, emacs-devel > From: Pip Cet <pipcet@gmail.com> > Date: Mon, 18 May 2020 19:19:19 +0000 > Cc: Stefan Monnier <monnier@iro.umontreal.ca>, emacs-devel@gnu.org > > > An alternative which might be nicer is to "split" the composition: > > display it as if a ZWNJ character was inserted at point. Thus, moving > > forward one buffer position into the ffi would show f followed by a thin bar > > cursor followed by the fi; moving forward one more buffer position > > would show ff followed by a thin bar cursor followed by i. Etc. > > I tried something like that (with a variable-pitch font), and the > effect is nauseating because the rest of the line shifts as the width > of the word at point changes. The idea is that this is used only rarely. Most use cases don't need to deconstruct a ligature that way; after all, that's what ligatures are for. > And this is where it gets back to "let's not hardcode the dependency > on Harfbuzz and FreeType, because other backends might actually give > us the information we need". You cannot avoid hardcoding something, because each shaper has its idiosyncrasies. But those are only limited to the implementation of the font driver interfaces described in font.h, they don't leak above that level. So if we will support such sub-glyph movements, we will probably introduce one more method into the font driver interface, and the display engine will use that. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-18 19:33 ` Ligatures Eli Zaretskii @ 2020-05-18 19:44 ` Clément Pit-Claudel 2020-05-19 2:25 ` Ligatures Eli Zaretskii 0 siblings, 1 reply; 145+ messages in thread From: Clément Pit-Claudel @ 2020-05-18 19:44 UTC (permalink / raw) To: emacs-devel On 18/05/2020 15.33, Eli Zaretskii wrote: >> From: Pip Cet <pipcet@gmail.com> >> Date: Mon, 18 May 2020 19:19:19 +0000 >> Cc: Stefan Monnier <monnier@iro.umontreal.ca>, emacs-devel@gnu.org >> >>> An alternative which might be nicer is to "split" the composition: >>> display it as if a ZWNJ character was inserted at point. Thus, moving >>> forward one buffer position into the ffi would show f followed by a thin bar >>> cursor followed by the fi; moving forward one more buffer position >>> would show ff followed by a thin bar cursor followed by i. Etc. >> I tried something like that (with a variable-pitch font), and the >> effect is nauseating because the rest of the line shifts as the width >> of the word at point changes. > The idea is that this is used only rarely. Most use cases don't need > to deconstruct a ligature that way; after all, that's what ligatures > are for. In an earlier thread, you mentioned programming font ligatures — wouldn't it be very common to deconstruct such ligatures, like → into ->? Maybe the effect wouldn't be jarring with monospaced fonts, but for these the simple approach of subdividing the glyph works nicely too. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-18 19:44 ` Ligatures Clément Pit-Claudel @ 2020-05-19 2:25 ` Eli Zaretskii 2020-05-19 2:44 ` Ligatures Clément Pit-Claudel 2020-05-19 3:47 ` Ligatures Stefan Monnier 0 siblings, 2 replies; 145+ messages in thread From: Eli Zaretskii @ 2020-05-19 2:25 UTC (permalink / raw) To: Clément Pit-Claudel; +Cc: emacs-devel > From: Clément Pit-Claudel <cpitclaudel@gmail.com> > Date: Mon, 18 May 2020 15:44:01 -0400 > > > The idea is that this is used only rarely. Most use cases don't need > > to deconstruct a ligature that way; after all, that's what ligatures > > are for. > > In an earlier thread, you mentioned programming font ligatures — wouldn't it be very common to deconstruct such ligatures, like → into ->? No, I don't think so. Why would this be common? > Maybe the effect wouldn't be jarring with monospaced fonts, but for these the simple approach of subdividing the glyph works nicely too. It might work in some simple cases, but I wonder what gains would that give the users. It sounds very unusual to me to do something like that, and I don't think we ever heard any such complaints until now, although prettify-symbols-mode exists for several years. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-19 2:25 ` Ligatures Eli Zaretskii @ 2020-05-19 2:44 ` Clément Pit-Claudel 2020-05-19 13:59 ` Ligatures Eli Zaretskii 2020-05-19 3:47 ` Ligatures Stefan Monnier 1 sibling, 1 reply; 145+ messages in thread From: Clément Pit-Claudel @ 2020-05-19 2:44 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel On 18/05/2020 22.25, Eli Zaretskii wrote: >> From: Clément Pit-Claudel <cpitclaudel@gmail.com> >> Date: Mon, 18 May 2020 15:44:01 -0400 >> >>> The idea is that this is used only rarely. Most use cases don't need >>> to deconstruct a ligature that way; after all, that's what ligatures >>> are for. >> >> In an earlier thread, you mentioned programming font ligatures — wouldn't it be very common to deconstruct such ligatures, like → into ->? > > No, I don't think so. Why would this be common? I thought it would be the default. Emacs shows →, and you can put the point either before (|→), in the middle (-|>), or after (→|). This is what prettify-symbols-unprettify-at-point exists for, I believe, though it doesn't work perfectly: often the composed glyph doesn't have the same width as the non-composed one. Here's a fairly common case: when writing html or XML, you may type <, then >, then press C-b and type the tag name; or you may use < and a paredit-like setup that inserts the > automatically. If the font has a ligature for <> and you can't put the point in the middle, this breaks. Same for || — the notation |x| { … } is used for lambdas in some languages; if you type || then try to move the point back inside the composed || glyph it won't work. >> Maybe the effect wouldn't be jarring with monospaced fonts, but for these the simple approach of subdividing the glyph works nicely too. > > It might work in some simple cases, but I wonder what gains would that > give the users. 
It sounds very unusual to me to do something like > that, and I don't think we ever heard any such complaints until now, > although prettify-symbols-mode exists for several years. I thought I did complain in the past, but I can't find the thread any more :/ prettify-symbols-unprettify-at-point helps, and it's the default in some popular Emacs configs. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-19 2:44 ` Ligatures Clément Pit-Claudel @ 2020-05-19 13:59 ` Eli Zaretskii 2020-05-19 14:35 ` Ligatures Clément Pit-Claudel 2020-05-19 15:36 ` Ligatures Tassilo Horn 0 siblings, 2 replies; 145+ messages in thread From: Eli Zaretskii @ 2020-05-19 13:59 UTC (permalink / raw) To: Clément Pit-Claudel; +Cc: emacs-devel > Cc: emacs-devel@gnu.org > From: Clément Pit-Claudel <cpitclaudel@gmail.com> > Date: Mon, 18 May 2020 22:44:27 -0400 > > >> In an earlier thread, you mentioned programming font ligatures — wouldn't it be very common to deconstruct such ligatures, like → into ->? > > > > No, I don't think so. Why would this be common? > > I thought it would be the default. Emacs shows →, and you can put the point either before (|→), in the middle (-|>), or after (→|). Doesn't sound as a useful default to me. It could be an optional feature, though. > Here's a fairly common case: when writing html or XML, you may type <, then >, then press C-b and type the tag name; or you may use < and a paredit-like setup that inserts the > automatically. If the font has a ligature for <> and you can't put the point in the middle, this breaks. Same for || — the notation |x| { … } is used for lambdas in some languages; if you type || then try to move the point back inside the composed || glyph it won't work. Sounds like a bug or misfeature that needs a solution, not necessarily the one that's been proposed here. For example, how about a special insert command that would disable ligation with the character it inserts? ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-19 13:59 ` Ligatures Eli Zaretskii @ 2020-05-19 14:35 ` Clément Pit-Claudel 2020-05-19 15:21 ` Ligatures Eli Zaretskii 2020-05-19 15:36 ` Ligatures Tassilo Horn 1 sibling, 1 reply; 145+ messages in thread From: Clément Pit-Claudel @ 2020-05-19 14:35 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel On 19/05/2020 09.59, Eli Zaretskii wrote: >> Cc: emacs-devel@gnu.org >> From: Clément Pit-Claudel <cpitclaudel@gmail.com> >> Date: Mon, 18 May 2020 22:44:27 -0400 >> >>>> In an earlier thread, you mentioned programming font ligatures — wouldn't it be very common to deconstruct such ligatures, like → into ->? >>> >>> No, I don't think so. Why would this be common? >> >> I thought it would be the default. Emacs shows →, and you can put the point either before (|→), in the middle (-|>), or after (→|). > > Doesn't sound as a useful default to me. It could be an optional > feature, though. Do we know of other editors that support ligatures but chose not to support moving through a composed character? If not, that would be a fairly strong signal that it's a reasonable default, I'd expect. >> Here's a fairly common case: when writing html or XML, you may type <, then >, then press C-b and type the tag name; or you may use < and a paredit-like setup that inserts the > automatically. If the font has a ligature for <> and you can't put the point in the middle, this breaks. Same for || — the notation |x| { … } is used for lambdas in some languages; if you type || then try to move the point back inside the composed || glyph it won't work. > > Sounds like a bug or misfeature that needs a solution, not necessarily > the one that's been proposed here. Possibly! But the feature discussed here seems to fit the bill pretty perfectly, so … > For example, how about a special > insert command that would disable ligation with the character it > inserts? Would that command be called automatically, or would it require a different input? 
I don't think Emacs can guess whether it should enable or disable ligation, so I imagine you mean different input, but that doesn't sound pleasant to use, so maybe I'm misunderstanding? ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-19 14:35 ` Ligatures Clément Pit-Claudel @ 2020-05-19 15:21 ` Eli Zaretskii 2020-05-19 15:44 ` Ligatures Clément Pit-Claudel 0 siblings, 1 reply; 145+ messages in thread From: Eli Zaretskii @ 2020-05-19 15:21 UTC (permalink / raw) To: Clément Pit-Claudel; +Cc: emacs-devel > Cc: emacs-devel@gnu.org > From: Clément Pit-Claudel <cpitclaudel@gmail.com> > Date: Tue, 19 May 2020 10:35:50 -0400 > > > Doesn't sound as a useful default to me. It could be an optional > > feature, though. > > Do we know of other editors that support ligatures but chose not to support moving through a composed character? If not, that would be a fairly strong signal that it's a reasonable default, I'd expect. OTOH, the current default exists since Emacs 21, so it sounds like a reasonable default as well. And I don't think arguing about defaults in Emacs is useful, because changing the default if you don't like it is easy. We do change the default behavior slowly, though. (And please note that we are talking about defaults for a feature that doesn't yet exist, which makes this dispute even less useful.) > > For example, how about a special > > insert command that would disable ligation with the character it > > inserts? > > Would that command be called automatically, or would it require a different input? You'd invoke it when you either know in advance you don't want the next character to ligate, or after you saw the ligature to disable the ligation for the sequence at or before point. > I don't think Emacs can guess whether it should enable or disable ligation, so I imagine you mean different input, but that doesn't sound pleasant to use, so maybe I'm misunderstanding? Emacs cannot, but the user can. Thus a separate command. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-19 15:21 ` Ligatures Eli Zaretskii @ 2020-05-19 15:44 ` Clément Pit-Claudel 2020-05-19 16:15 ` Ligatures Eli Zaretskii 0 siblings, 1 reply; 145+ messages in thread From: Clément Pit-Claudel @ 2020-05-19 15:44 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel On 19/05/2020 11.21, Eli Zaretskii wrote: > And I don't think arguing about defaults in Emacs is useful, because > changing the default if you don't like it is easy. We do change the > default behavior slowly, though. I see this argument often (changing settings is easy), but I don't find it very convincing: in my experience, even after years of using Emacs, figuring out which variable controls a given behavior, if there is even such a variable, is usually not easy: it requires reading manuals, guessing the right keywords, and often stepping through function implementations. It's quite a bit easier in Emacs than in other editors, but still not easy at all. >>> For example, how about a special >>> insert command that would disable ligation with the character it >>> inserts? >> >> Would that command be called automatically, or would it require a different input? > > You'd invoke it when you either know in advance you don't want the > next character to ligate, or after you saw the ligature to disable the > ligation for the sequence at or before point. That assumes that I know whether inserting a character will introduce a ligation, but I usually don't. I can't keep in my head a list of all the ligatures that my font supports, so I'm bound to be surprised from time to time (besides, this is very contextual. When I write a language where /\ and \/ are used to mean "and" and "or", I think of it when I type a / or a \. But when I'm in a context where /…/ is used to delimit regular expressions and \ is used to escape a character, I don't think of the \/ ligature.) 
>> I don't think Emacs can guess whether it should enable or disable ligation, so I imagine you mean different input, but that doesn't sound pleasant to use, so maybe I'm misunderstanding? > > Emacs cannot, but the user can. Thus a separate command. I don't think that will work, but maybe I'm missing something. How does this work if I open a file that already has a ligature and I want to modify it? Do I have to explicitly break the ligature before I can edit it? More importantly, though, I don't understand what problem it would solve, at least in the context of programming ligatures. What is the problem with allowing cursor movement through ligatures like → for ->? ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-19 15:44 ` Ligatures Clément Pit-Claudel @ 2020-05-19 16:15 ` Eli Zaretskii 0 siblings, 0 replies; 145+ messages in thread From: Eli Zaretskii @ 2020-05-19 16:15 UTC (permalink / raw) To: Clément Pit-Claudel; +Cc: emacs-devel > Cc: emacs-devel@gnu.org > From: Clément Pit-Claudel <cpitclaudel@gmail.com> > Date: Tue, 19 May 2020 11:44:31 -0400 > > > You'd invoke it when you either know in advance you don't want the > > next character to ligate, or after you saw the ligature to disable the > > ligation for the sequence at or before point. > > That assumes that I know whether inserting a character will > introduce a ligation, but I usually don't. [...] Did you miss the part after "or after"? > I don't think that will work, but maybe I'm missing something. How does this work if I open a file that already has a ligature and I want to modify it? Do I have to explicitly break the ligature before I can edit it? "M-x toggle-ligature-mode RET", perhaps? Or go to the ligature you want to edit and invoke that command I mentioned above (after "or after")? > More importantly, though, I don't understand what problem it would solve, at least in the context of programming ligatures. What is the problem with allowing cursor movement through ligatures like → for ->? It doesn't feel right to me, and it goes against what Emacs did for the past 20 years. But that's me. But again, this is a purely academic argument. Ligature support in Emacs is not yet ready for prime time, the sub-glyph cursor motion needs to be implemented in the display engine, and only after that it would make sense arguing about the defaults of this imaginary mode. Let's not finish arguing now, lest we will have nothing to argue about then, okay? ;-) ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-19 13:59 ` Ligatures Eli Zaretskii 2020-05-19 14:35 ` Ligatures Clément Pit-Claudel @ 2020-05-19 15:36 ` Tassilo Horn 2020-05-19 16:08 ` Ligatures Eli Zaretskii 2020-05-19 16:14 ` Ligatures Stefan Monnier 1 sibling, 2 replies; 145+ messages in thread From: Tassilo Horn @ 2020-05-19 15:36 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Clément Pit-Claudel, emacs-devel [-- Attachment #1: Type: text/plain, Size: 1835 bytes --] Eli Zaretskii <eliz@gnu.org> writes: >> > it be very common to deconstruct such ligatures, like → into ->? >> > >> > No, I don't think so. Why would this be common? >> >> I thought it would be the default. Emacs shows →, and you can put the >> point either before (|→), in the middle (-|>), or after (→|). > > Doesn't sound as a useful default to me. It could be an optional > feature, though. To me it sounds like a good default. >> Here's a fairly common case: when writing html or XML, you may type >> <, then >, then press C-b and type the tag name; or you may use < and >> a paredit-like setup that inserts the > automatically. If the font >> has a ligature for <> and you can't put the point in the middle, this >> breaks. Same for || — the notation |x| { … } is used for lambdas in >> some languages; if you type || then try to move the point back inside >> the composed || glyph it won't work. > > Sounds like a bug or misfeature that needs a solution, not necessarily > the one that's been proposed here. For example, how about a special > insert command that would disable ligation with the character it > inserts? I use the attached self-written ligature.el (Eli, you've helped me with that some months back). That's all nice but sometimes I too have the problem that I want to edit the name of a "private" function/variable foo--do-stuff and cannot move point inside the double-dash because it is composed as one char. 
As a little cure, I disable ligatures in the minibuffer where I absolutely need to do completion stuff like foo-<TAB>-bar. Another case is where when inserting < automatically inserts > immediately giving a <> diamond where I cannot move into. A special insert command will not help here because it is already inserted.

Bye, Tassilo

[-- Attachment #2: ligature.el --]
[-- Type: text/plain, Size: 3251 bytes --]

(require 'seq)

(defgroup ligature nil
  "Support for font ligatures"
  :version "28.1"
  :prefix "ligature-")

(defcustom ligature-arrows
  (list "-->" "<!--" "->>" "<<-" "->" "<-" "<-<" ">>-" ">-" "<~>"
        "-<" "-<<" "<=>" "=>" "<=<" "<<=" "<==" "<==>" "==>" "=>>"
        ">=>" ">>=" "<-|" "<=|" "|=>" "|->" "<~~" "<~" "~~>" "~>"
        "<->")
  "Arrow ligatures."
  :type '(repeat string))

(defcustom ligature-misc
  (list "..<" "~-" "-~" "~@" "-|" "_|_" "|-" "||-" "|=" "||="
        ".?" "?=" "<|>" "<:" ":<" ":>" ">:" ".=" ".-" "__"
        "<<<" ">>>" "<<" ">>" "~~" "<$>" "<$" "$>" "<+>" "<+"
        "+>" "<*>" "<*" "*>" "</" "</>" "/>" "|}" "{|" "[<"
        ">]" ":?>" ":?" "[||]" "?:" "?." "|>" "<|" "||>" "<||"
        "|||>" "<|||::=" "|]" "[|" "#{" "#[" "]#" "#(" "#?" "#_"
        "#_(" "#:" "#!" "#=")
  "Miscellaneous ligatures."
  :type '(repeat string))

(defcustom ligature-relations
  (list "==" "!=" "<=" ">=" "=:=" "!==" "===" "<>" "/==" "=!="
        "=/=" "~=" ":=" "/=" "^=")
  "Relation ligatures."
  :type '(repeat string))

(defcustom ligature-operators
  (list "&&" "&&&" "||" "++" "--" "!!" "::" "+++" "??" ":::"
        "***" "---" "/\\" "\\/")
  "Operator ligatures."
  :type '(repeat string))

(defcustom ligature-comments-c-like
  (list "//" "///" "/**" "/*" "*/")
  "Ligatures for comments in C-like languages."
  :type '(repeat string))

(defcustom ligature-comments-xml-like
  (list "<!--" "-->")
  "Ligatures for comments in XML-like languages."
  :type '(repeat string))

(defcustom ligature-hashes
  (list "##" "###" "####")
  "Ligatures for comments in languages with # being the comment character."
  :type '(repeat string))

(defcustom ligature-dots
  (list "..." "..")
  "Dot ligatures."
  :type '(repeat string))

(defcustom ligature-semicolons
  (list ";;" ";;;")
  "Ligatures for comments in lisp languages."
  :type '(repeat string))

(defun ligature--get-all ()
  (append ligature-arrows
          ligature-relations
          ligature-operators
          ligature-misc
          ligature-dots
          ligature-comments-c-like
          ligature-comments-xml-like
          ligature-hashes
          ligature-semicolons))

(defun ligature--apply (ligatures)
  (let ((groups (seq-group-by #'string-to-char ligatures)))
    (dolist (group groups)
      (let ((c (car group))
            (rx (regexp-opt (mapcar (lambda (s) (substring s 1))
                                    (cdr group)))))
        (set-char-table-range
         composition-function-table c
         `([,(concat "." rx) 0 compose-gstring-for-graphic]))))))

(define-minor-mode ligature-minor-mode
  "A mode for font ligatures."
  nil "" nil
  (if ligature-minor-mode
      (progn
        (when (minibufferp)
          (error "Cannot use ligature-minor-mode in minibuffer"))
        ;; FIXME: This doesn't work.  When enabled, there will be a local
        ;; variable but the global value is the same (and also includes the
        ;; ligature composition rules).
        (ligature--apply (ligature--get-all)))
    ;; FIXME: Even if the above worked, this could remove much more than this
    ;; mode added itself.
    (kill-local-variable 'composition-function-table)))

(defun ligature-minor-mode--apply-if-possible ()
  (unless (minibufferp)
    (ligature-minor-mode)))

(define-globalized-minor-mode global-ligature-minor-mode
  ligature-minor-mode
  ligature-minor-mode--apply-if-possible)

(provide 'ligature)

^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-19 15:36 ` Ligatures Tassilo Horn @ 2020-05-19 16:08 ` Eli Zaretskii 2020-05-19 16:14 ` Ligatures Stefan Monnier 1 sibling, 0 replies; 145+ messages in thread From: Eli Zaretskii @ 2020-05-19 16:08 UTC (permalink / raw) To: Tassilo Horn; +Cc: cpitclaudel, emacs-devel > From: Tassilo Horn <tsdh@gnu.org> > Cc: Clément Pit-Claudel <cpitclaudel@gmail.com>, > emacs-devel@gnu.org > Date: Tue, 19 May 2020 17:36:44 +0200 > > I use the attached self-written ligature.el (Eli, you've helped me with > that some months back). That's all nice but sometimes I too have the > problem that I want to edit the name of a "private" function/variable > foo--do-stuff and cannot move point inside the double-dash because it is > composed as one char. As a little cure, I disable ligatures in the > minibuffer where I absolutely need to do completion stuff like > foo-<TAB>-bar. > > Another case is where when inserting < automatically inserts > > immediately giving a <> diamond where I cannot move into. Yes, the user-level (and perhaps also some infrastructure level) of support for ligatures is not yet ready. There's a TODO item for that, patches are welcome. > A special insert command will not help here because it is already > inserted. Then maybe we need both a command to insert a character without ligation, and a command to disassemble a ligature at point. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-19 15:36 ` Ligatures Tassilo Horn 2020-05-19 16:08 ` Ligatures Eli Zaretskii @ 2020-05-19 16:14 ` Stefan Monnier 1 sibling, 0 replies; 145+ messages in thread From: Stefan Monnier @ 2020-05-19 16:14 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Clément Pit-Claudel, emacs-devel >> Doesn't sound as a useful default to me. It could be an optional >> feature, though. > To me it sounds like a good default. For `->` and `ffi` it sounds good, indeed. For prettify-symbol-mode's combining of `lambda` into `λ`, OTOH that would be rather undesirable. Stefan ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-19 2:25 ` Ligatures Eli Zaretskii 2020-05-19 2:44 ` Ligatures Clément Pit-Claudel @ 2020-05-19 3:47 ` Stefan Monnier 2020-05-19 4:51 ` Ligatures Clément Pit-Claudel 1 sibling, 1 reply; 145+ messages in thread From: Stefan Monnier @ 2020-05-19 3:47 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Clément Pit-Claudel, emacs-devel > It might work in some simple cases, but I wonder what gains would that > give the users. It sounds very unusual to me to do something like > that, and I don't think we ever heard any such complaints until now, > although prettify-symbols-mode exists for several years. For things like `→`, I think of `->` as an "encoding" used to stay within the confines of ASCII whereas `→` is what is really "meant". So when I see `→` I'm not likely to want to "look inside" and am instead happy if `C-p` skips over both characters at once (except when I want to change it to `=>`, of course). In contrast I don't think of "ffi" as the ASCII encoding of `ffi`. Instead I think of `ffi` as just a more refined way to draw "ffi" and I'd find it odd for `C-p` to skip over those three chars. So, the right behavior depends on the intention, AFAICT. Since 99.99% of my Emacs windows is made up of monospace text, I probably won't be too significantly affected either way. Stefan ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-19 3:47 ` Ligatures Stefan Monnier @ 2020-05-19 4:51 ` Clément Pit-Claudel 0 siblings, 0 replies; 145+ messages in thread From: Clément Pit-Claudel @ 2020-05-19 4:51 UTC (permalink / raw) To: Stefan Monnier, Eli Zaretskii; +Cc: emacs-devel On 18/05/2020 23.47, Stefan Monnier wrote: > (except when I want > to change it to `=>`, of course). Variants of this case are not too uncommon, and they're not always as simple as removing the beginning of the composition to replace it with something else. For example, I'm typing a regexp in javascript, enclosed in /…/; then I add a backslash at the end of the regexp to escape a character that I haven't typed yet, and \/ turns into a composition, and the point disappears. Or I write html, with a buffer that contains <a href>, I type an = sign after the href, and => gets composed into ⇒, and the point disappears. There are many such examples, and if I lose my position, I need to delete part of the composition. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-18 19:19 ` Ligatures Pip Cet 2020-05-18 19:25 ` Ligatures tomas 2020-05-18 19:33 ` Ligatures Eli Zaretskii @ 2020-05-18 19:38 ` Clément Pit-Claudel 2020-05-19 14:55 ` Ligatures Pip Cet 2 siblings, 1 reply; 145+ messages in thread From: Clément Pit-Claudel @ 2020-05-18 19:38 UTC (permalink / raw) To: emacs-devel On 18/05/2020 15.19, Pip Cet wrote: > So such clusters would be marked as atomic? I like Stefan's proposal, > and maybe it's what LibreOffice actually does: at large font sizes, > the horizontal division of "subglyphs" seems off. Yup, that's what Firefox and LibreOffice do. >>> If Harfbuzz could tell us more precisely how to divide the glyph into >>> subglyphs, we could do a better job, of course. >> >> I don't think it's possible because AFAIK fonts don't store this >> information. > > Well, they should! They can, but few do (the LigatureCaretList subtable within the GDEF table) >> It should be possible, of course, to have a private >> database of such offsets, but I don't really see how it could work in >> general. > > And this is where it gets back to "let's not hardcode the dependency > on Harfbuzz and FreeType, because other backends might actually give > us the information we need". Harfbuzz can give us this info: hb_ot_layout_get_ligature_carets ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-18 19:38 ` Ligatures Clément Pit-Claudel @ 2020-05-19 14:55 ` Pip Cet 2020-05-19 15:30 ` Ligatures Clément Pit-Claudel 0 siblings, 1 reply; 145+ messages in thread From: Pip Cet @ 2020-05-19 14:55 UTC (permalink / raw) To: Clément Pit-Claudel; +Cc: emacs-devel On Mon, May 18, 2020 at 7:40 PM Clément Pit-Claudel <cpitclaudel@gmail.com> wrote: > > And this is where it gets back to "let's not hardcode the dependency > > on Harfbuzz and FreeType, because other backends might actually give > > us the information we need". > > Harfbuzz can give us this info: hb_ot_layout_get_ligature_carets Thanks, I hadn't looked there! So Harfbuzz provides a non-core API which, after a separate call for each cluster, allows us to split up a glyph into non-overlapping bounding boxes of the same height (the information returned is one-dimensional, and intended for carets, not for Emacs-style box cursors). I don't see how that API design is so great we should hardcode dependencies on it, though I do agree it's sufficient to work with. Again, this isn't about some exotic use case: I open a buffer, type "ffi", and hit C-b twice. What should happen? AFAIU, people are still seriously considering the possibility that all of "ffi" would be covered by the cursor. I hope I'm misunderstanding that, because it's so obviously the wrong thing to do in this case. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-19 14:55 ` Ligatures Pip Cet @ 2020-05-19 15:30 ` Clément Pit-Claudel 2020-05-19 15:52 ` Ligatures Pip Cet 0 siblings, 1 reply; 145+ messages in thread From: Clément Pit-Claudel @ 2020-05-19 15:30 UTC (permalink / raw) To: Pip Cet; +Cc: emacs-devel On 19/05/2020 10.55, Pip Cet wrote: > On Mon, May 18, 2020 at 7:40 PM Clément Pit-Claudel > <cpitclaudel@gmail.com> wrote: >>> And this is where it gets back to "let's not hardcode the dependency >>> on Harfbuzz and FreeType, because other backends might actually give >>> us the information we need". >> >> Harfbuzz can give us this info: hb_ot_layout_get_ligature_carets > > Thanks, I hadn't looked there! > > So Harfbuzz provides a non-core API which, after a separate call for > each cluster, allows us to split up a glyph into non-overlapping > bounding boxes of the same height (the information returned is > one-dimensional, and intended for carets, not for Emacs-style box > cursors). Are you worried about the height of the box? For the width part, isn't it just the difference between two consecutive carets? > I don't see how that API design is so great we should hardcode > dependencies on it, though I do agree it's sufficient to work with. No opinions there ^^ ^ permalink raw reply [flat|nested] 145+ messages in thread
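The width arithmetic mentioned above (each box is the gap between two consecutive carets) can be sketched in C. This is only an illustration under stated assumptions: `struct subglyph_box` and `split_at_carets` are hypothetical names that exist in neither Emacs nor HarfBuzz, positions are plain `int` values in font units, and the caret array is assumed to be sorted left to right for a left-to-right run. In a real caller the caret array would presumably come from `hb_ot_layout_get_ligature_carets`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type: one horizontal slice of a ligature glyph.  */
struct subglyph_box
{
  int x;      /* left edge, relative to the glyph origin */
  int width;  /* horizontal extent of the slice */
};

/* Split a ligature of width ADVANCE at the N_CARETS ascending positions
   in CARETS.  Writes N_CARETS + 1 boxes to BOXES and returns that count.
   The boxes do not overlap and together cover the whole advance.  */
size_t
split_at_carets (int advance, const int *carets, size_t n_carets,
                 struct subglyph_box *boxes)
{
  int left = 0;
  for (size_t i = 0; i <= n_carets; i++)
    {
      /* The last box ends at the right edge of the glyph.  */
      int right = (i < n_carets) ? carets[i] : advance;
      assert (right >= left);   /* carets must be sorted */
      boxes[i].x = left;
      boxes[i].width = right - left;
      left = right;
    }
  return n_carets + 1;
}
```

As noted in the thread, this only yields horizontal extents; it says nothing about which pixels of an overhanging stroke belong to which component.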
* Re: Ligatures 2020-05-19 15:30 ` Ligatures Clément Pit-Claudel @ 2020-05-19 15:52 ` Pip Cet 0 siblings, 0 replies; 145+ messages in thread From: Pip Cet @ 2020-05-19 15:52 UTC (permalink / raw) To: Clément Pit-Claudel; +Cc: emacs-devel [-- Attachment #1: Type: text/plain, Size: 1422 bytes --] On Tue, May 19, 2020 at 3:30 PM Clément Pit-Claudel <cpitclaudel@gmail.com> wrote: > > So Harfbuzz provides a non-core API which, after a separate call for > > each cluster, allows us to split up a glyph into non-overlapping > > bounding boxes of the same height (the information returned is > > one-dimensional, and intended for carets, not for Emacs-style box > > cursors). > > Are you worried about the height of the box? For the width part, isn't it just the difference between two consecutive carets? That's what I'd work with, yeah. Perhaps I can make things a little clearer by attaching a screenshot of how things currently look with the "Linux Libertine Display O" font, which has especially prominent ligatures and overhangs (I guess it's somehow inspired by the operating system kernel it's named for?). I think there's plenty to be improved about that: use a ligature, sure, but also maybe get away from the "invert a box" style of drawing the cursor, or handle overhangs specially, or...something. But that would require an idea of which pixels belong to which (sub)glyphs (in the ligature). And caret positioning doesn't give us enough information to do that. Thank you again for pointing out that API! Whether it's a core feature of a shaper or a backend-dependent extra feature is a secondary concern, the important part is that it's there and we can do the right thing. [-- Attachment #2: ffi.jpg --] [-- Type: image/jpeg, Size: 1741 bytes --] ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-18 17:05 ` Ligatures Stefan Monnier 2020-05-18 17:18 ` Ligatures Eli Zaretskii @ 2020-05-18 17:24 ` tomas 2020-05-18 17:41 ` Ligatures Eli Zaretskii 2020-05-18 20:33 ` Ligatures Stefan Monnier 1 sibling, 2 replies; 145+ messages in thread From: tomas @ 2020-05-18 17:24 UTC (permalink / raw) To: emacs-devel [-- Attachment #1: Type: text/plain, Size: 909 bytes --] On Mon, May 18, 2020 at 01:05:53PM -0400, Stefan Monnier wrote: > [ I know nothing about the underlying APIs and such, so speaking here > only as a random user. ] [...] > So, maybe we don't need very much info: all we need is a boolean which > tells us whether the glyph should be treated atomically or not. > When not treating it atomically, we would (somewhat arbitrarily) divide > the glyph horizontally into N equal sized "subglyphs" and draw the > cursor on the corresponding subglyph. I'm somewhat out of my depth here, but I have the hunch that some "ligatures" aren't "just stacked horizontally". > If Harfbuzz could tell us more precisely how to divide the glyph into > subglyphs, we could do a better job, of course. On a very superficial glance it seems they can [1] Cheers [1] https://github.com/harfbuzz/harfbuzz/blob/master/docs/usermanual-clusters.xml -- tomás [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-18 17:24 ` Ligatures tomas @ 2020-05-18 17:41 ` Eli Zaretskii 2020-05-18 19:07 ` Ligatures tomas 2020-05-18 20:33 ` Ligatures Stefan Monnier 1 sibling, 1 reply; 145+ messages in thread From: Eli Zaretskii @ 2020-05-18 17:41 UTC (permalink / raw) To: tomas; +Cc: emacs-devel > Date: Mon, 18 May 2020 19:24:41 +0200 > From: <tomas@tuxteam.de> > > > If Harfbuzz could tell us more precisely how to divide the glyph into > > subglyphs, we could do a better job, of course. > > On a very superficial glance it seems they can [1] > > Cheers > [1] https://github.com/harfbuzz/harfbuzz/blob/master/docs/usermanual-clusters.xml AFAIK, each "cluster" corresponds to a single font glyph, and we already get this information from HarfBuzz, see hbfont.c. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-18 17:41 ` Ligatures Eli Zaretskii @ 2020-05-18 19:07 ` tomas 2020-05-18 19:17 ` Ligatures Eli Zaretskii 0 siblings, 1 reply; 145+ messages in thread From: tomas @ 2020-05-18 19:07 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel [-- Attachment #1: Type: text/plain, Size: 744 bytes --] On Mon, May 18, 2020 at 08:41:09PM +0300, Eli Zaretskii wrote: > > Date: Mon, 18 May 2020 19:24:41 +0200 > > From: <tomas@tuxteam.de> > > > > > If Harfbuzz could tell us more precisely how to divide the glyph into > > > subglyphs, we could do a better job, of course. > > > > On a very superficial glance it seems they can [1] > > > > Cheers > > [1] https://github.com/harfbuzz/harfbuzz/blob/master/docs/usermanual-clusters.xml > > AFAIK, each "cluster" corresponds to a single font glyph, and we > already get this information from HarfBuzz, see hbfont.c. I see, thanks. As I said, my reading was very cursory. I'm sure you read that doc much more thoroughly than me :-) Thanks for the insights Cheers -- tomás [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-18 19:07 ` Ligatures tomas @ 2020-05-18 19:17 ` Eli Zaretskii 0 siblings, 0 replies; 145+ messages in thread From: Eli Zaretskii @ 2020-05-18 19:17 UTC (permalink / raw) To: tomas; +Cc: emacs-devel > Date: Mon, 18 May 2020 21:07:35 +0200 > From: tomas@tuxteam.de > Cc: emacs-devel@gnu.org > > > > [1] https://github.com/harfbuzz/harfbuzz/blob/master/docs/usermanual-clusters.xml > > > > AFAIK, each "cluster" corresponds to a single font glyph, and we > > already get this information from HarfBuzz, see hbfont.c. > > I see, thanks. As I said, my reading was very cursory. I'm sure > you read that doc much more thoroughly than me :-) Some of the docs are impossible to understand without asking the HarfBuzz developers (who are always willing to help). The HarfBuzz docs are really minimal. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-18 17:24 ` Ligatures tomas 2020-05-18 17:41 ` Ligatures Eli Zaretskii @ 2020-05-18 20:33 ` Stefan Monnier 1 sibling, 0 replies; 145+ messages in thread From: Stefan Monnier @ 2020-05-18 20:33 UTC (permalink / raw) To: tomas; +Cc: emacs-devel >> So, maybe we don't need very much info: all we need is a boolean which >> tells us whether the glyph should be treated atomically or not. >> When not treating it atomically, we would (somewhat arbitrarily) divide >> the glyph horizontally into N equal sized "subglyphs" and draw the >> cursor on the corresponding subglyph. > > I'm somewhat out of my depth here, but I have the hunch that some > "ligatures" aren't "just stacked horizontally". That's why we need a boolean to tell us whether this ligature is "stacked horizontally" (which I called "not atomic"). This boolean could actually be a global constant (so it gives the wrong behavior half the time, but that would be good enough for those people who use the kind of latin-ligatures talked about here and almost no other ligatures, and would be no worse than what we have now for people who do use languages where many ligatures aren't "stacked horizontally"). But rather than a global constant, we could probably try and do better either by asking the font-backend (in case it can provide that kind of info) or by using a heuristic based on the script of the characters that are being combined. Obviously, I'm discussing a *heuristic*, not a 100% perfect solution. Stefan ^ permalink raw reply [flat|nested] 145+ messages in thread
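The heuristic proposed above (one boolean per ligature, plus N equal-sized "subglyphs" when the ligature is not atomic) might look roughly like the following C sketch. Everything in it is an assumption made for illustration: `struct cursor_span` and `ligature_cursor_span` are hypothetical names, not Emacs internals, and widths are plain pixel counts.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical type: the horizontal span the box cursor should cover.  */
struct cursor_span
{
  int x;      /* left edge, relative to the glyph origin */
  int width;  /* width of the cursor box */
};

/* Return the cursor span for the character at INDEX (0-based) inside a
   ligature that stands for N_CHARS characters and is GLYPH_WIDTH wide.
   If ATOMIC is true the cursor covers the whole glyph; otherwise the
   glyph is cut into N_CHARS equal slices.  */
struct cursor_span
ligature_cursor_span (int glyph_width, int n_chars, int index, bool atomic)
{
  assert (n_chars > 0 && index >= 0 && index < n_chars);
  struct cursor_span span = { 0, glyph_width };
  if (atomic || n_chars == 1)
    return span;
  /* Both edges come from the same formula, so adjacent slices share an
     edge and the slices add up to exactly GLYPH_WIDTH even when the
     width is not divisible by N_CHARS.  */
  int left = glyph_width * index / n_chars;
  int right = glyph_width * (index + 1) / n_chars;
  span.x = left;
  span.width = right - left;
  return span;
}
```

Where the `atomic` flag comes from (a global constant, the font backend, or a per-script table) is left open here, as it is in the message above.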
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-18 16:08 ` Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) Eli Zaretskii 2020-05-18 16:45 ` tomas 2020-05-18 17:05 ` Ligatures Stefan Monnier @ 2020-05-18 17:31 ` Clément Pit-Claudel 2020-05-18 17:39 ` Eli Zaretskii 2020-05-19 10:09 ` Trevor Spiteri 2020-05-19 5:43 ` Ligatures ASSI 3 siblings, 2 replies; 145+ messages in thread From: Clément Pit-Claudel @ 2020-05-18 17:31 UTC (permalink / raw) To: emacs-devel [-- Attachment #1: Type: text/plain, Size: 1024 bytes --] On 18/05/2020 12.08, Eli Zaretskii wrote: > On second thought, I think I misunderstood you. If the font that is > used shows "ffi" as a _single_ glyph ffi, and LibreOffice indeed > highlights parts of this glyph, then I'd like to know how it does > that, and how far does this capability extend. I mean, what does it > do with ligatures like ae, displayed as æ -- does it highlight the > common vertical stroke for both parts? And what about "st", displayed > as st -- this has a curved "hand" connecting s and t -- to which of the > 2 does it belong for the purposes of highlighting? There's also "hv" > displayed as ƕ, let alone "fs" displayed as ẞ and "fz" displayed as > ß. I've attached a screenshot with a few examples, though I couldn't find a font that displays ae as æ. Firefox does the same as LibreOffice (try it here, for example: https://developer.mozilla.org/en-US/docs/Web/CSS/font-variant-ligatures). Since Firefox uses HarfBuzz, I think there's a good chance we can support that feature too :) [-- Attachment #2: ligatures.png --] [-- Type: image/png, Size: 13362 bytes --] ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-18 17:31 ` Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) Clément Pit-Claudel @ 2020-05-18 17:39 ` Eli Zaretskii 2020-05-18 19:01 ` Clément Pit-Claudel 2020-05-19 10:09 ` Trevor Spiteri 1 sibling, 1 reply; 145+ messages in thread From: Eli Zaretskii @ 2020-05-18 17:39 UTC (permalink / raw) To: Clément Pit-Claudel; +Cc: emacs-devel > From: Clément Pit-Claudel <cpitclaudel@gmail.com> > Date: Mon, 18 May 2020 13:31:30 -0400 > > I've attached a screenshot with a few examples, though I couldn't find a font that displays ae as æ. Thanks. Once again, I wonder how they decide where each part starts and ends. The examples show very simple cases, so it's hard to know where this ends. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-18 17:39 ` Eli Zaretskii @ 2020-05-18 19:01 ` Clément Pit-Claudel 2020-05-18 19:15 ` Eli Zaretskii ` (3 more replies) 0 siblings, 4 replies; 145+ messages in thread From: Clément Pit-Claudel @ 2020-05-18 19:01 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel On 18/05/2020 13.39, Eli Zaretskii wrote: >> From: Clément Pit-Claudel <cpitclaudel@gmail.com> >> Date: Mon, 18 May 2020 13:31:30 -0400 >> >> I've attached a screenshot with a few examples, though I couldn't find a font that displays ae as æ. > > Thanks. Once again, I wonder how they decide where each parts starts > and ends. The examples show very simple cases, so it's hard to know > where this ends. Hi Eli, I asked on Firefox' Matrix server. Here is a lightly edited transcript: cpitclaudel> Hi all. I noticed that Firefox has this nifty feature that makes it possible to move the cursor within a ligature (for example, with the right font config, "ffi" can be rendered as "ffi" while allowing the cursor to move between the individual glyphs that make up that composition). Is the extraction of ligature information and the rendering done by Firefox itself, or by a lower-level library? Most font shaping libraries I've seen don't seem to return glyph-decomposition information for ligatures, so I'm curious to understand how Firefox does it ^^ jfkthame> Firefox uses harfbuzz to handle the font shaping (ligature rules, etc). I'd expect what you describe to work pretty much the same in other browsers too, fwiw. krosylight cpitclaudel> Thanks! But Harfbuzz doesn't give sub-glyph information for ligatures, does it? So how does Firefox know where to put the caret when it moves through a ligature? 
jfkthame> it doesn't, really - it just knows how many underlying characters are represented by the ligature glyph, and divides the advance width up into that many slices (usually that works pretty reasonably, but it's possible to come up with fonts where the inaccuracy becomes obvious) jfkthame> In principle, OpenType fonts can provide specific positions for the caret within a ligature (see the LigatureCaretList subtable within the GDEF table), but in practice that's rarely supported or used (harfbuzz can provide this information if it's present, see the hb_ot_layout_get_ligature_carets function, but currently firefox doesn't use it anyhow) cpitclaudel> Thanks, that's very useful! How does that work for glyphs like "fs" displayed as ẞ or "fz" displayed as ß? Does Firefox move in that single glyph? (I couldn't find a font that does that, otherwise I'd have tested it ^^) Thanks a lot for your help :) jfkthame> Yes, it'd be the same - doesn't matter what the specific characters are, if there's a ligature of two characters Firefox would put the caret half-way through the ligature glyph when it is between the component characters in the underlying text jfkthame> btw, if you're on a mac (or have access to one), you can see an extreme case if you try the word "Zapfino" in the font Zapfino .... the entire word is a single 7-character ligature, and the seven equal slices that Firefox treats it as for selection/editing purposes don't match up to the visual shapes of the sub-glyphs at all well HTH, Clément. ^ permalink raw reply [flat|nested] 145+ messages in thread
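The caret placement described in the transcript (divide the advance width into as many slices as there are underlying characters) can be written down in a few lines of C. The sketch below is an assumption-laden illustration, not Firefox's or Emacs's code: `ligature_caret_x` is a hypothetical function, positions are plain `int` values, and the branch that prefers font-supplied carets is an addition of this sketch (the transcript says Firefox currently does not use them). In a real program those positions would presumably be obtained from `hb_ot_layout_get_ligature_carets`.

```c
#include <assert.h>
#include <stddef.h>

/* Return the x offset of the caret placed before character I of a
   ligature that stands for N_CHARS characters and is ADVANCE wide.
   I may range from 0 (left edge) to N_CHARS (right edge).

   FONT_CARETS, if non-NULL, holds N_FONT_CARETS interior caret
   positions taken from the font (for example from its GDEF
   LigatureCaretList).  They are used only when there is exactly one
   per interior boundary; otherwise the advance is cut into equal
   slices.  */
int
ligature_caret_x (int advance, int n_chars, int i,
                  const int *font_carets, size_t n_font_carets)
{
  assert (n_chars > 0 && i >= 0 && i <= n_chars);
  if (i == 0)
    return 0;
  if (i == n_chars)
    return advance;
  if (font_carets != NULL && n_font_carets == (size_t) (n_chars - 1))
    return font_carets[i - 1];
  /* Fallback: equal slices of the advance width.  */
  return advance * i / n_chars;
}
```

With the "Zapfino" example from the transcript (one glyph standing for seven characters), the fallback puts the caret after the "a" at two sevenths of the advance, regardless of where the "a" is actually drawn.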
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-18 19:01 ` Clément Pit-Claudel @ 2020-05-18 19:15 ` Eli Zaretskii 2020-05-18 19:18 ` tomas ` (2 subsequent siblings) 3 siblings, 0 replies; 145+ messages in thread From: Eli Zaretskii @ 2020-05-18 19:15 UTC (permalink / raw) To: Clément Pit-Claudel; +Cc: emacs-devel > Cc: emacs-devel@gnu.org > From: Clément Pit-Claudel <cpitclaudel@gmail.com> > Date: Mon, 18 May 2020 15:01:49 -0400 > > I asked on Firefox' Matrix server. Here is a lightly edited transcript: Thanks. So it's pure heuristic, and works only in simple cases. We could ask on the HarfBuzz list how many fonts provide meaningful information for the hb_ot_layout_get_ligature_carets function to return useful data. If someone is interested in working on that, that is. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-18 19:01 ` Clément Pit-Claudel 2020-05-18 19:15 ` Eli Zaretskii @ 2020-05-18 19:18 ` tomas 2020-05-18 20:37 ` Ligatures Stefan Monnier 2020-05-18 21:59 ` Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) Alan Third 3 siblings, 0 replies; 145+ messages in thread From: tomas @ 2020-05-18 19:18 UTC (permalink / raw) To: emacs-devel [-- Attachment #1: Type: text/plain, Size: 968 bytes --] On Mon, May 18, 2020 at 03:01:49PM -0400, Clément Pit-Claudel wrote: > On 18/05/2020 13.39, Eli Zaretskii wrote: > >> From: Clément Pit-Claudel <cpitclaudel@gmail.com> > >> Date: Mon, 18 May 2020 13:31:30 -0400 > >> > >> I've attached a screenshot with a few examples, though I couldn't find a font that displays ae as æ. > > > > Thanks. Once again, I wonder how they decide where each parts starts > > and ends. The examples show very simple cases, so it's hard to know > > where this ends. > > Hi Eli, > > I asked on Firefox' Matrix server. Here is a lightly edited transcript: Thanks, that's interesting. So they just assume the subcharacters in a cluster stack side-by-side. Works most of the time, but is bound to give surprising results with things which stack the "wrong" way (i.e. on the top or bottom for LR or RL scripts, like accents and crazy scripts like Devanagari). Thanks for gathering the information. Cheers -- t [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-18 19:01 ` Clément Pit-Claudel 2020-05-18 19:15 ` Eli Zaretskii 2020-05-18 19:18 ` tomas @ 2020-05-18 20:37 ` Stefan Monnier 2020-05-18 21:59 ` Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) Alan Third 3 siblings, 0 replies; 145+ messages in thread From: Stefan Monnier @ 2020-05-18 20:37 UTC (permalink / raw) To: Clément Pit-Claudel; +Cc: Eli Zaretskii, emacs-devel > jfkthame> it doesn't, really - it just knows how many underlying characters > jfkthame> are represented by the ligature glyph, and divides the advance > jfkthame> width up into that many slices (usually that works pretty > jfkthame> reasonably, but it's possible to come up with fonts where the > jfkthame> inaccuracy becomes obvious) Apparently, great minds think alike ;-) Stefan ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-18 19:01 ` Clément Pit-Claudel ` (2 preceding siblings ...) 2020-05-18 20:37 ` Ligatures Stefan Monnier @ 2020-05-18 21:59 ` Alan Third 2020-05-19 13:56 ` Eli Zaretskii 3 siblings, 1 reply; 145+ messages in thread From: Alan Third @ 2020-05-18 21:59 UTC (permalink / raw) To: Clément Pit-Claudel; +Cc: Eli Zaretskii, emacs-devel [-- Attachment #1: Type: text/plain, Size: 721 bytes --] On Mon, May 18, 2020 at 03:01:49PM -0400, Clément Pit-Claudel wrote: > jfkthame> btw, if you're on a mac (or have access to one), you can > see an extreme case if you try the word "Zapfino" in the font > Zapfino .... the entire word is a single 7-character ligature, and > the seven equal slices that Firefox treats it as for > selection/editing purposes don't match up to the visual shapes of > the sub-glyphs at all well In case anyone's interested, I've attached a screenshot of Apple's Pages.app displaying the word Zapfino with the cursor after the "a". Clearly not ideal. OTOH, if LibreOffice, Firefox, and even Apple's products do this, perhaps it's just the way people will expect it to be done. -- Alan Third [-- Attachment #2: Screenshot 2020-05-18 at 22.52.23.png --] [-- Type: image/png, Size: 28933 bytes --] ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-18 21:59 ` Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) Alan Third @ 2020-05-19 13:56 ` Eli Zaretskii 2020-05-19 14:39 ` Clément Pit-Claudel 2020-05-19 20:26 ` Alan Third 0 siblings, 2 replies; 145+ messages in thread From: Eli Zaretskii @ 2020-05-19 13:56 UTC (permalink / raw) To: Alan Third; +Cc: cpitclaudel, emacs-devel > Date: Mon, 18 May 2020 23:59:11 +0200 (CEST) > From: Alan Third <alan@idiocy.org> > Cc: Eli Zaretskii <eliz@gnu.org>, emacs-devel@gnu.org > > In case anyone's interested, I've attached a screenshot of Apple's > Pages.app displaying the word Zapfino with the cursor after the "a". I don't see anything on or after "a", I see a thin vertical line on the "Z". Is that what is actually displayed? If so, how do people know the cursor is after "a"?? > Clearly not ideal. OTOH, if LibreOffice, Firefox, and even Apple's > products do this, perhaps it's just the way people will expect it to > be done. If someone wants to work on such a feature, I'm sure it will be welcomed by at least some of the users. Thanks. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-19 13:56 ` Eli Zaretskii @ 2020-05-19 14:39 ` Clément Pit-Claudel 2020-05-19 21:43 ` Pip Cet 2020-05-19 20:26 ` Alan Third 1 sibling, 1 reply; 145+ messages in thread From: Clément Pit-Claudel @ 2020-05-19 14:39 UTC (permalink / raw) To: Eli Zaretskii, Alan Third; +Cc: emacs-devel On 19/05/2020 09.56, Eli Zaretskii wrote: > I don't see anything on or after "a", I see a thin vertical line on > the "Z". is that what is actually displayed? If so, how do people > know the cursor is after "a"?? They don't: "the seven equal slices that Firefox treats it as for selection/editing purposes don't match up to the visual shapes of the sub-glyphs at all well" ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-19 14:39 ` Clément Pit-Claudel @ 2020-05-19 21:43 ` Pip Cet 2020-05-20 1:41 ` Clément Pit-Claudel ` (3 more replies) 0 siblings, 4 replies; 145+ messages in thread From: Pip Cet @ 2020-05-19 21:43 UTC (permalink / raw) To: Clément Pit-Claudel; +Cc: Eli Zaretskii, Alan Third, emacs-devel [-- Attachment #1: Type: text/plain, Size: 1433 bytes --] On Tue, May 19, 2020 at 2:39 PM Clément Pit-Claudel <cpitclaudel@gmail.com> wrote: > On 19/05/2020 09.56, Eli Zaretskii wrote: > > I don't see anything on or after "a", I see a thin vertical line on > > the "Z". is that what is actually displayed? If so, how do people > > know the cursor is after "a"?? > > They don't: "the seven equal slices that Firefox treats it as for selection/editing purposes don't match up to the visual shapes of the sub-glyphs at all well" And I'm afraid the difference is much more obvious with box cursors than it is with carets. I'm attaching a screenshot of a patched Emacs displaying "ffi", with point on the second f, in the "Linux Libertine Display O" font (using approximately equal slices). I think this is a bit of a worst-case scenario, a three-letter ligature in a font using ligatures and overhangs very enthusiastically. It might be okay for other fonts. My remaining idea is to stretch characters so we can break up a ligature without changing its total width. I'm not sure how to do that, though. (I'm also attaching the patch, for the morbidly curious; it isn't clean, readable, or finished in any way, and contains at least one obvious bug. It's just good enough to produce the screenshot, and maybe it can serve as a hint as to which files need changing for ligatures to work; but such changes would have to be done very differently from the patch.). 
[-- Attachment #2: ffi-box-cursor.png --] [-- Type: image/png, Size: 1067 bytes --] [-- Attachment #3: 0001-Ligatures.diff --] [-- Type: text/x-patch, Size: 21370 bytes --] diff --git a/src/alloc.c b/src/alloc.c index ebc55857ea..1395f647f4 100644 --- a/src/alloc.c +++ b/src/alloc.c @@ -322,7 +322,7 @@ #define PUREBEG (char *) pure /* If positive, garbage collection is inhibited. Otherwise, zero. */ -static intptr_t garbage_collection_inhibited; +static intptr_t garbage_collection_inhibited = 3; /* The GC threshold in bytes, the last time it was calculated from gc-cons-threshold and gc-cons-percentage. */ diff --git a/src/composite.c b/src/composite.c index 518502be49..e2bece40c8 100644 --- a/src/composite.c +++ b/src/composite.c @@ -836,7 +836,7 @@ fill_gstring_body (Lisp_Object gstring) LGLYPH_SET_CHAR (g, c); if (font != NULL) - code = font->driver->encode_char (font, LGLYPH_CHAR (g)); + code = font->driver->encode_char (font, LGLYPH_CHAR (g), NULL); else code = FONT_INVALID_CODE; if (code != FONT_INVALID_CODE) diff --git a/src/dispextern.h b/src/dispextern.h index 0b1f3d14ae..2f6b33e74c 100644 --- a/src/dispextern.h +++ b/src/dispextern.h @@ -397,6 +397,15 @@ #define SET_GLYPH_FROM_GLYPH_CODE(glyph, gc) \ }; +struct glyph_context +{ + union vectorlike_header header; + Lisp_Object string; + Lisp_Object position; + int i; + int n; +}; + /* Glyphs. Be extra careful when changing this structure! Esp. make sure that @@ -567,6 +576,8 @@ #define FACE_ID_BITS 20 /* Used to compare all bit-fields above in one step. 
*/ unsigned val; } u; + + struct glyph_context *context; }; diff --git a/src/font.c b/src/font.c index ab00402b40..8de3c969b9 100644 --- a/src/font.c +++ b/src/font.c @@ -3010,7 +3010,7 @@ font_has_char (struct frame *f, Lisp_Object font, int c) if (result >= 0) return result; } - return (fontp->driver->encode_char (fontp, c) != FONT_INVALID_CODE); + return (fontp->driver->encode_char (fontp, c, NULL) != FONT_INVALID_CODE); } @@ -3023,7 +3023,7 @@ font_encode_char (Lisp_Object font_object, int c) eassert (FONT_OBJECT_P (font_object)); font = XFONT_OBJECT (font_object); - return font->driver->encode_char (font, c); + return font->driver->encode_char (font, c, NULL); } @@ -4418,7 +4418,7 @@ font_fill_lglyph_metrics (Lisp_Object glyph, struct font *font, unsigned int cod struct font_metrics metrics; LGLYPH_SET_CODE (glyph, code); - font->driver->text_extents (font, &code, 1, &metrics); + font->driver->text_extents (font, &code, 1, &metrics, NULL); LGLYPH_SET_LBEARING (glyph, metrics.lbearing); LGLYPH_SET_RBEARING (glyph, metrics.rbearing); LGLYPH_SET_WIDTH (glyph, metrics.width); @@ -4638,7 +4638,7 @@ DEFUN ("internal-char-font", Finternal_char_font, Sinternal_char_font, 1, 2, 0, struct face *face = FACE_FROM_ID (f, face_id); if (! 
face->font) return Qnil; - unsigned code = face->font->driver->encode_char (face->font, c); + unsigned code = face->font->driver->encode_char (face->font, c, NULL); if (code == FONT_INVALID_CODE) return Qnil; Lisp_Object font_object; @@ -4965,7 +4965,7 @@ DEFUN ("font-get-glyphs", Ffont_get_glyphs, Sfont_get_glyphs, 3, 4, 0, unsigned code; struct font_metrics metrics; - code = font->driver->encode_char (font, c); + code = font->driver->encode_char (font, c, NULL); if (code == FONT_INVALID_CODE) { ASET (vec, i, Qnil); @@ -4976,7 +4976,7 @@ DEFUN ("font-get-glyphs", Ffont_get_glyphs, Sfont_get_glyphs, 3, 4, 0, LGLYPH_SET_TO (g, i); LGLYPH_SET_CHAR (g, c); LGLYPH_SET_CODE (g, code); - font->driver->text_extents (font, &code, 1, &metrics); + font->driver->text_extents (font, &code, 1, &metrics, NULL); LGLYPH_SET_WIDTH (g, metrics.width); LGLYPH_SET_LBEARING (g, metrics.lbearing); LGLYPH_SET_RBEARING (g, metrics.rbearing); diff --git a/src/font.h b/src/font.h index 8614e7fa10..952a9fa4c3 100644 --- a/src/font.h +++ b/src/font.h @@ -565,6 +565,8 @@ #define FONT_PIXEL_SIZE_QUANTUM 1 #define FONT_INVALID_CODE 0xFFFFFFFF +struct glyph_context; + /* Font driver. Members specified as "optional" can be NULL. */ struct font_driver @@ -645,14 +647,15 @@ #define FONT_INVALID_CODE 0xFFFFFFFF /* Return a glyph code of FONT for character C (Unicode code point). If FONT doesn't have such a glyph, return FONT_INVALID_CODE. */ - unsigned (*encode_char) (struct font *font, int c); + unsigned (*encode_char) (struct font *font, int c, struct glyph_context *context); /* Compute the total metrics of the NGLYPHS glyphs specified by the font FONT and the sequence of glyph codes CODE, and store the result in METRICS. 
*/ void (*text_extents) (struct font *font, const unsigned *code, int nglyphs, - struct font_metrics *metrics); + struct font_metrics *metrics, + struct glyph_context *context); #ifdef HAVE_WINDOW_SYSTEM diff --git a/src/ftcrfont.c b/src/ftcrfont.c index 7832d4f5ce..19c2644285 100644 --- a/src/ftcrfont.c +++ b/src/ftcrfont.c @@ -323,7 +323,7 @@ ftcrfont_has_char (Lisp_Object font, int c) } static unsigned -ftcrfont_encode_char (struct font *font, int c) +ftcrfont_encode_char (struct font *font, int c, struct glyph_context *context) { struct font_info *ftcrfont_info = (struct font_info *) font; unsigned code = FONT_INVALID_CODE; @@ -331,20 +331,53 @@ ftcrfont_encode_char (struct font *font, int c) int utf8len = CHAR_STRING (c, utf8); cairo_glyph_t stack_glyph; cairo_glyph_t *glyphs = &stack_glyph; - int num_glyphs = 1; - if (cairo_scaled_font_text_to_glyphs (ftcrfont_info->cr_scaled_font, 0, 0, - (char *) utf8, utf8len, - &glyphs, &num_glyphs, - NULL, NULL, NULL) - == CAIRO_STATUS_SUCCESS) + if (context == NULL) { - if (glyphs != &stack_glyph) - cairo_glyph_free (glyphs); - else if (stack_glyph.index) - code = stack_glyph.index; + context = xmalloc (sizeof *context); + context->string = CALLN (Fstring, make_fixnum (c)); + context->position = make_fixnum (0); } + unsigned int num_glyphs = 0; + unsigned int num_clusters = 0; + hb_buffer_t *hb_buf = hb_buffer_create (); + hb_buffer_set_cluster_level (hb_buf, HB_BUFFER_CLUSTER_LEVEL_MONOTONE_CHARACTERS); + hb_buffer_add_utf8 (hb_buf, SDATA (context->string), -1, 0, -1); + hb_buffer_set_direction (hb_buf, HB_DIRECTION_LTR); + hb_font_t *hb_font = hb_ft_font_create_referenced + (cairo_ft_scaled_font_lock_face (ftcrfont_info->cr_scaled_font)); + hb_shape (hb_font, hb_buf, NULL, 0); + hb_glyph_info_t *glyph_info = hb_buffer_get_glyph_infos + (hb_buf, &num_glyphs); + hb_glyph_position_t *glyph_pos = hb_buffer_get_glyph_positions + (hb_buf, &num_glyphs); + int i0, i1; + int c0, c1; + i0 = 0; + for (int i = num_glyphs - 1; i 
>= 0; i--) + { + if (glyph_info[i].cluster <= XFIXNUM (context->position)) + { + i0 = i; + c0 = glyph_info[i].cluster; + break; + } + } + i1 = num_glyphs; + for (int i = 0; i < num_glyphs; i++) + { + if (glyph_info[i].cluster > c0) + { + i1 = i; + c1 = glyph_info[i].cluster; + break; + } + } + context->i = XFIXNUM (context->position) - c0; + context->n = c1 - c0; + code = glyph_info[i0].codepoint; + return code; } @@ -352,30 +385,65 @@ ftcrfont_encode_char (struct font *font, int c) ftcrfont_text_extents (struct font *font, const unsigned *code, int nglyphs, - struct font_metrics *metrics) + struct font_metrics *metrics, + struct glyph_context *context) { + struct font_info *ftcrfont_info = (struct font_info *) font; int width, i; block_input (); - width = ftcrfont_glyph_extents (font, code[0], metrics); - for (i = 1; i < nglyphs; i++) + + if (context == NULL) { - struct font_metrics m; - int w = ftcrfont_glyph_extents (font, code[i], metrics ? &m : NULL); + context = xmalloc (sizeof *context); + context->string = CALLN (Fstring, make_fixnum (code[0])); + context->position = make_fixnum (0); + } - if (metrics) + unsigned int num_glyphs = 0; + unsigned int num_clusters = 0; + hb_buffer_t *hb_buf = hb_buffer_create (); + hb_buffer_set_cluster_level (hb_buf, HB_BUFFER_CLUSTER_LEVEL_MONOTONE_CHARACTERS); + hb_buffer_set_direction (hb_buf, HB_DIRECTION_LTR); + hb_buffer_set_content_type (hb_buf, HB_BUFFER_CONTENT_TYPE_UNICODE); + int n = 0; + for (const char *p = SDATA (context->string); p <= SDATA (context->string) + SBYTES (context->string);) + { + int c = string_char_advance (&p); + hb_buffer_add (hb_buf, c, n++); + } + hb_font_t *hb_font = hb_ft_font_create_referenced + (cairo_ft_scaled_font_lock_face (ftcrfont_info->cr_scaled_font)); + hb_shape (hb_font, hb_buf, NULL, 0); + hb_glyph_info_t *glyph_info = hb_buffer_get_glyph_infos + (hb_buf, &num_glyphs); + hb_glyph_position_t *glyph_pos = hb_buffer_get_glyph_positions + (hb_buf, &num_glyphs); + int i0, i1; + int c0, 
c1; + i0 = 0; + for (int i = num_glyphs - 1; i >= 0; i--) + { + if (glyph_info[i].cluster <= XFIXNUM (context->position)) + { + i0 = i; + c0 = glyph_info[i].cluster; + break; + } + } + i1 = num_glyphs; + for (int i = 0; i < num_glyphs; i++) + { + if (glyph_info[i].cluster > c0) { - if (width + m.lbearing < metrics->lbearing) - metrics->lbearing = width + m.lbearing; - if (width + m.rbearing > metrics->rbearing) - metrics->rbearing = width + m.rbearing; - if (m.ascent > metrics->ascent) - metrics->ascent = m.ascent; - if (m.descent > metrics->descent) - metrics->descent = m.descent; + i1 = i; + c1 = glyph_info[i].cluster; + break; } - width += w; } + context->i = XFIXNUM (context->position) - c0; + context->n = c1 - c0; + width = glyph_pos[i0].x_advance / (c1 - c0) / 64; unblock_input (); if (metrics) @@ -508,6 +576,8 @@ ftcrfont_draw (struct glyph_string *s, glyphs[i].index = s->char2b[from + i]; glyphs[i].x = x; glyphs[i].y = y; + struct glyph_context *context = s->first_glyph->context; + glyphs[i].x -= (context->i * s->width); x += (s->padding_p ? 1 : ftcrfont_glyph_extents (s->font, glyphs[i].index, NULL)); diff --git a/src/hbfont.c b/src/hbfont.c index 576c5fe7f6..5c3c690281 100644 --- a/src/hbfont.c +++ b/src/hbfont.c @@ -578,7 +578,7 @@ hbfont_shape (Lisp_Object lgstring, Lisp_Object direction) LGLYPH_SET_CODE (lglyph, info[i].codepoint); unsigned code = info[i].codepoint; - font->driver->text_extents (font, &code, 1, &metrics); + font->driver->text_extents (font, &code, 1, &metrics, NULL); LGLYPH_SET_WIDTH (lglyph, metrics.width); LGLYPH_SET_LBEARING (lglyph, metrics.lbearing); LGLYPH_SET_RBEARING (lglyph, metrics.rbearing); diff --git a/src/lisp.h b/src/lisp.h index ad7d67ae69..c4ae954999 100644 --- a/src/lisp.h +++ b/src/lisp.h @@ -1103,6 +1103,7 @@ DEFINE_GDB_SYMBOL_END (PSEUDOVECTOR_FLAG) PVEC_MUTEX, PVEC_CONDVAR, PVEC_MODULE_FUNCTION, + PVEC_GLYPH_CONTEXT, /* These should be last, for internal_equal and sxhash_obj. 
*/ PVEC_COMPILED, diff --git a/src/xdisp.c b/src/xdisp.c index cf15f579b5..41a7b4235a 100644 --- a/src/xdisp.c +++ b/src/xdisp.c @@ -27499,14 +27499,15 @@ append_glyph_string (struct glyph_string **head, struct glyph_string **tail, static struct face * get_char_face_and_encoding (struct frame *f, int c, int face_id, - unsigned *char2b, bool display_p) + unsigned *char2b, bool display_p, + struct glyph_context *context) { struct face *face = FACE_FROM_ID (f, face_id); unsigned code = 0; if (face->font) { - code = face->font->driver->encode_char (face->font, c); + code = face->font->driver->encode_char (face->font, c, context); if (code == FONT_INVALID_CODE) code = 0; @@ -27533,7 +27534,7 @@ get_char_face_and_encoding (struct frame *f, int c, int face_id, static struct face * get_glyph_face_and_encoding (struct frame *f, struct glyph *glyph, - unsigned *char2b) + unsigned *char2b, struct glyph_context *context) { struct face *face; unsigned code = 0; @@ -27549,7 +27550,8 @@ get_glyph_face_and_encoding (struct frame *f, struct glyph *glyph, if (CHAR_BYTE8_P (glyph->u.ch)) code = CHAR_TO_BYTE8 (glyph->u.ch); else - code = face->font->driver->encode_char (face->font, glyph->u.ch); + code = face->font->driver->encode_char (face->font, glyph->u.ch, + context); if (code == FONT_INVALID_CODE) code = 0; @@ -27565,14 +27567,15 @@ get_glyph_face_and_encoding (struct frame *f, struct glyph *glyph, Return true iff FONT has a glyph for C. 
*/ static bool -get_char_glyph_code (int c, struct font *font, unsigned *char2b) +get_char_glyph_code (int c, struct font *font, unsigned *char2b, + struct glyph_context *context) { unsigned code; if (CHAR_BYTE8_P (c)) code = CHAR_TO_BYTE8 (c); else - code = font->driver->encode_char (font, c); + code = font->driver->encode_char (font, c, context); if (code == FONT_INVALID_CODE) return false; @@ -27620,7 +27623,8 @@ fill_composite_glyph_string (struct glyph_string *s, struct face *base_face, -1, Qnil); face = get_char_face_and_encoding (s->f, c, face_id, - s->char2b + i, true); + s->char2b + i, true, + NULL); if (face) { if (! s->face) @@ -27777,12 +27781,13 @@ fill_glyph_string (struct glyph_string *s, int face_id, && glyph->glyph_not_available_p == glyph_not_available_p) { s->face = get_glyph_face_and_encoding (s->f, glyph, - s->char2b + s->nchars); + s->char2b + s->nchars, + glyph->context); ++s->nchars; eassert (s->nchars <= end - start); s->width += glyph->pixel_width; - if (glyph++->padding_p != s->padding_p) - break; + glyph++; + break; } s->font = s->face->font; @@ -27877,7 +27882,8 @@ fill_stretch_glyph_string (struct glyph_string *s, int start, int end) } static struct font_metrics * -get_per_char_metric (struct font *font, const unsigned *char2b) +get_per_char_metric (struct font *font, const unsigned *char2b, + struct glyph_context *context) { static struct font_metrics metrics; @@ -27886,7 +27892,7 @@ get_per_char_metric (struct font *font, const unsigned *char2b) if (*char2b == FONT_INVALID_CODE) return NULL; - font->driver->text_extents (font, char2b, 1, &metrics); + font->driver->text_extents (font, char2b, 1, &metrics, context); return &metrics; } @@ -27908,9 +27914,10 @@ normal_char_ascent_descent (struct font *font, int c, int *ascent, int *descent) /* Get metrics of C, defaulting to a reasonably sized ASCII character. */ - if (get_char_glyph_code (c >= 0 ? c : '{', font, &char2b)) + if (get_char_glyph_code (c >= 0 ? 
c : '{', font, &char2b, NULL)) { - struct font_metrics *pcm = get_per_char_metric (font, &char2b); + struct font_metrics *pcm = get_per_char_metric (font, &char2b, + NULL); if (!(pcm->width == 0 && pcm->rbearing == 0 && pcm->lbearing == 0)) { @@ -27952,10 +27959,12 @@ gui_get_glyph_overhangs (struct glyph *glyph, struct frame *f, int *left, int *r if (glyph->type == CHAR_GLYPH) { unsigned char2b; - struct face *face = get_glyph_face_and_encoding (f, glyph, &char2b); + struct face *face = get_glyph_face_and_encoding (f, glyph, &char2b, + NULL); if (face->font) { - struct font_metrics *pcm = get_per_char_metric (face->font, &char2b); + struct font_metrics *pcm = get_per_char_metric (face->font, &char2b, + NULL); if (pcm) { if (pcm->rbearing > pcm->width) @@ -29841,12 +29850,12 @@ produce_glyphless_glyph (struct it *it, bool for_no_font, Lisp_Object acronym) str = buf; } for (len = 0; str[len] && ASCII_CHAR_P (str[len]) && len < 6; len++) - code[len] = font->driver->encode_char (font, str[len]); + code[len] = font->driver->encode_char (font, str[len], NULL); upper_len = (len + 1) / 2; font->driver->text_extents (font, code, upper_len, - &metrics_upper); + &metrics_upper, NULL); font->driver->text_extents (font, code + upper_len, len - upper_len, - &metrics_lower); + &metrics_lower, NULL); @@ -29936,6 +29945,40 @@ #define IT_APPLY_FACE_BOX(it, face) \ } \ } while (false) +static struct glyph_context * +make_context (struct it *it) +{ + struct glyph_context *context = xmalloc (sizeof *context); // XXX GC + char *string = xmalloc (128); + char *p = string; + ptrdiff_t bytepos = it->current.pos.bytepos; + ptrdiff_t charpos = it->current.pos.charpos; + ptrdiff_t bp5 = bytepos; + ptrdiff_t bp0 = bp5; + ptrdiff_t bp1 = bp5; + while (bytepos > BEG_BYTE && bp5 - bytepos < 32) + dec_both (&charpos, &bytepos); + bp0 = bytepos; + int i = 0; + Lisp_Object pos = make_fixnum (0); + while (bytepos >= BEG_BYTE && bytepos < Z_BYTE && bytepos - bp0 < 32) + { + inc_both (&charpos, 
&bytepos); + memcpy (p, BUF_BYTE_ADDRESS (current_buffer, bytepos - prev_char_len (bytepos)), prev_char_len (bytepos)); + p += prev_char_len (bytepos); + ++i; + if (bytepos == bp5) + pos = make_fixnum (i); + } + bp1 = bytepos; + eassert (strlen (p) == bp1 - bp0); + *p++ = it->c; + *p++ = 0; + context->string = build_string (string); + context->position = pos; + return context; +} + /* RIF: Produce glyphs/get display metrics for the display element IT is loaded with. See the description of struct it in dispextern.h @@ -29973,6 +30016,7 @@ gui_produce_glyphs (struct it *it) if (font->vertical_centering) boff = VCENTER_BASELINE_OFFSET (font, it->f) - boff; + struct glyph_context *context = NULL; if (it->char_to_display != '\n' && it->char_to_display != '\t') { it->nglyphs = 1; @@ -29989,9 +30033,11 @@ gui_produce_glyphs (struct it *it) it->descent = FONT_DESCENT (font) - boff; } - if (get_char_glyph_code (it->char_to_display, font, &char2b)) + context = make_context (it); + if (get_char_glyph_code (it->char_to_display, font, &char2b, + context)) { - pcm = get_per_char_metric (font, &char2b); + pcm = get_per_char_metric (font, &char2b, context); if (pcm->width == 0 && pcm->rbearing == 0 && pcm->lbearing == 0) pcm = NULL; @@ -30079,9 +30125,13 @@ gui_produce_glyphs (struct it *it) / FONT_HEIGHT (font)); append_stretch_glyph (it, it->object, it->pixel_width, it->ascent + it->descent, ascent); + it->glyph_row->glyphs[it->area][it->glyph_row->used[it->area] - 1].context = NULL; } else - append_glyph (it); + { + append_glyph (it); + it->glyph_row->glyphs[it->area][it->glyph_row->used[it->area] - 1].context = context; + } /* If characters with lbearing or rbearing are displayed in this line, record that fact in a flag of the @@ -30233,9 +30283,9 @@ gui_produce_glyphs (struct it *it) it->nglyphs = 1; if (FONT_TOO_HIGH (font)) { - if (get_char_glyph_code (' ', font, &char2b)) + if (get_char_glyph_code (' ', font, &char2b, NULL)) { - pcm = get_per_char_metric (font, &char2b); + 
pcm = get_per_char_metric (font, &char2b, NULL); if (pcm->width == 0 && pcm->rbearing == 0 && pcm->lbearing == 0) pcm = NULL; @@ -30372,8 +30422,8 @@ gui_produce_glyphs (struct it *it) if (! font_not_found_p) { get_char_face_and_encoding (it->f, c, it->face_id, - &char2b, false); - pcm = get_per_char_metric (font, &char2b); + &char2b, false, NULL); + pcm = get_per_char_metric (font, &char2b, NULL); } /* Initialize the bounding box. */ @@ -30433,8 +30483,9 @@ gui_produce_glyphs (struct it *it) else { get_char_face_and_encoding (it->f, ch, face_id, - &char2b, false); - pcm = get_per_char_metric (font, &char2b); + &char2b, false, + make_context (it)); + pcm = get_per_char_metric (font, &char2b, make_context (it)); } if (! pcm) cmp->offsets[i * 2] = cmp->offsets[i * 2 + 1] = 0; diff --git a/src/xterm.c b/src/xterm.c index 7989cecec7..3b5f0d3524 100644 --- a/src/xterm.c +++ b/src/xterm.c @@ -1703,7 +1703,8 @@ x_compute_glyph_string_overhangs (struct glyph_string *s) if (s->first_glyph->type == CHAR_GLYPH) { struct font *font = s->font; - font->driver->text_extents (font, s->char2b, s->nchars, &metrics); + font->driver->text_extents (font, s->char2b, s->nchars, &metrics, + NULL); } else { @@ -2047,7 +2048,7 @@ x_draw_glyphless_glyph_string_foreground (struct glyph_string *s) /* It is assured that all LEN characters in STR is ASCII. */ for (j = 0; j < len; j++) - char2b[j] = s->font->driver->encode_char (s->font, str[j]) & 0xFFFF; + char2b[j] = s->font->driver->encode_char (s->font, str[j], NULL) & 0xFFFF; s->font->driver->draw (s, 0, upper_len, x + glyph->slice.glyphless.upper_xoff, s->ybase + glyph->slice.glyphless.upper_yoff, ^ permalink raw reply related [flat|nested] 145+ messages in thread
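Editorial sketch, not part of the patch: the "approximately equal slices" geometry that the patch computes inline can be shown in isolation. The helper name `ligature_slice` and the `struct slice` type are hypothetical and do not exist in Emacs. The sketch assumes the shaper reports the advance in 26.6 fixed-point units, which is what the division by 64 in the patched `ftcrfont_text_extents` suggests.

```c
#include <assert.h>

/* Hypothetical type, not part of Emacs: the part of a ligature glyph
   that is attributed to one character of the cluster.  */
struct slice
{
  int width;    /* pixel width given to this character */
  int x_offset; /* pixels to shift the glyph origin left when drawing */
};

/* Split a ligature whose advance is ADVANCE_26_6 (assumed to be 26.6
   fixed point) into NCHARS equal slices and describe slice INDEX.
   The arithmetic mirrors the patch: width = advance / nchars / 64,
   and the glyph is drawn shifted left by index * width.  */
static struct slice
ligature_slice (int advance_26_6, int nchars, int index)
{
  assert (nchars > 0);
  assert (index >= 0 && index < nchars);

  struct slice s;
  s.width = advance_26_6 / nchars / 64;
  s.x_offset = index * s.width;
  return s;
}
```

For a three-character ligature with a 30-pixel advance (1920 in 26.6 units), `ligature_slice (1920, 3, 1)` gives a 10-pixel slice drawn with the glyph shifted 10 pixels to the left. Because the split is purely arithmetic, it ignores the visual shapes of the sub-glyphs, which is the mismatch discussed in this thread.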
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-19 21:43 ` Pip Cet @ 2020-05-20 1:41 ` Clément Pit-Claudel 2020-05-20 2:07 ` Ligatures Stefan Monnier ` (2 subsequent siblings) 3 siblings, 0 replies; 145+ messages in thread From: Clément Pit-Claudel @ 2020-05-20 1:41 UTC (permalink / raw) To: Pip Cet; +Cc: Eli Zaretskii, Alan Third, emacs-devel On 19/05/2020 17.43, Pip Cet wrote: > And I'm afraid the difference is much more obvious with box cursors > than it is with carets. I'm attaching a screenshot of a patched Emacs > displaying "ffi", with point on the second f, in the "Linux Libertine > Display O" font (using approximately equal slices). Beauty is in the eye of the beholder :) This looks great to me, actually. Maybe I'm just used to it because it's consistent with what Firefox does when I select text, and I have a habit of randomly selecting text while I read? Thanks for working on this! ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-19 21:43 ` Pip Cet 2020-05-20 1:41 ` Clément Pit-Claudel @ 2020-05-20 2:07 ` Stefan Monnier 2020-05-20 7:14 ` Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) tomas 2020-05-20 15:18 ` Eli Zaretskii 3 siblings, 0 replies; 145+ messages in thread From: Stefan Monnier @ 2020-05-20 2:07 UTC (permalink / raw) To: Pip Cet; +Cc: Clément Pit-Claudel, emacs-devel, Eli Zaretskii, Alan Third > than it is with carets. I'm attaching a screenshot of a patched Emacs > displaying "ffi", with point on the second f, in the "Linux Libertine > Display O" font (using approximately equal slices). This looks pretty good to me. Not perfect, but to the extent that the border of the drawn cursor goes right through the "space" that separates the letters, it shows clearly where we are. > I think this is a bit of a worst-case scenario I hope you're right. Stefan ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-19 21:43 ` Pip Cet 2020-05-20 1:41 ` Clément Pit-Claudel 2020-05-20 2:07 ` Ligatures Stefan Monnier @ 2020-05-20 7:14 ` tomas 2020-05-20 15:18 ` Eli Zaretskii 3 siblings, 0 replies; 145+ messages in thread From: tomas @ 2020-05-20 7:14 UTC (permalink / raw) To: emacs-devel [-- Attachment #1: Type: text/plain, Size: 591 bytes --] On Tue, May 19, 2020 at 09:43:49PM +0000, Pip Cet wrote: [...] > And I'm afraid the difference is much more obvious with box cursors > than it is with carets. I'm attaching a screenshot of a patched Emacs > displaying "ffi", with point on the second f, in the "Linux Libertine > Display O" font (using approximately equal slices). Nice. I understand what miffs you (the overhang falls off the cursor box, "compensated" by the wrong overhang entering from the left), but given the information available you just can't do better. IMHO it looks fine. Thanks for showing us :-) Cheers -- t [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-19 21:43 ` Pip Cet ` (2 preceding siblings ...) 2020-05-20 7:14 ` Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) tomas @ 2020-05-20 15:18 ` Eli Zaretskii 2020-05-20 17:31 ` Clément Pit-Claudel 2020-05-21 10:01 ` Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) Pip Cet 3 siblings, 2 replies; 145+ messages in thread From: Eli Zaretskii @ 2020-05-20 15:18 UTC (permalink / raw) To: Pip Cet; +Cc: cpitclaudel, alan, emacs-devel > From: Pip Cet <pipcet@gmail.com> > Date: Tue, 19 May 2020 21:43:49 +0000 > Cc: Eli Zaretskii <eliz@gnu.org>, Alan Third <alan@idiocy.org>, emacs-devel@gnu.org > > And I'm afraid the difference is much more obvious with box cursors > than it is with carets. I'm attaching a screenshot of a patched Emacs > displaying "ffi", with point on the second f, in the "Linux Libertine > Display O" font (using approximately equal slices). > > I think this is a bit of a worst-case scenario, a three-letter > ligature in a font using ligatures and overhangs very > enthusiastically. It might be okay for other fonts. I'm not sure this is the worst case. It might be the worst case if we are talking about ligatures that involve only ASCII characters, and don't involve symbols like ==> that gets converted to ⇒. But in general, there are worse cases, like á (two codepoints). And for kicks see the Khmer hello in etc/HELLO, where you can find 4 codepoints that produce a grapheme cluster made of 3 glyphs. If we only want this feature for ASCII ligatures, then it sounds like a limitation to me (and frankly, somewhat unclean as features go), but if we really want this only for these limited cases, we will need to somehow indicate to the display engine which ligatures are to be handled like this and which aren't. 
> My remaining idea is to stretch characters so we can break up a > ligature without changing its total width. I'm not sure how to do > that, though. I don't think I understand what you'd like to do. Can you elaborate? > (I'm also attaching the patch, for the morbidly curious; it isn't > clean, readable, or finished in any way, and contains at least one > obvious bug. It's just good enough to produce the screenshot, and > maybe it can serve as a hint as to which files need changing for > ligatures to work; but such changes would have to be done very > differently from the patch.). Right, the actual implementation will have to be different. In particular, I think that if ligatures will use automatic compositions, the information you need is already stored in the composition table and reachable from the glyph string, so you don't need to invoke the shaper again. I see you implemented this for static compositions, which are semi-obsolete. Also, I don't see the code which moves point inside the ligature; Emacs will not allow doing that by default. In particular, how did you tell the display code to show the cursor on the middle 'f', not on the first one? Did I miss something? And finally, you said you intended to do this via row->clip, but this patch does something very different. What changed your mind? Thanks. ^ permalink raw reply [flat|nested] 145+ messages in thread
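Editorial sketch, not from any message in the thread: the cluster lookup that the patch runs on the shaper output can be written as a small standalone function. The names `find_cluster` and `struct cluster_span` are hypothetical. The sketch assumes left-to-right text, cluster values that are character indices in ascending order starting at 0 (as with the monotone cluster level the patch selects), and it falls back to the end of the text when no later cluster exists, which is a choice made for this sketch rather than something the patch does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type, not part of Emacs or HarfBuzz.  */
struct cluster_span
{
  size_t glyph;   /* index of the glyph chosen for the character */
  unsigned start; /* first character index of the cluster */
  unsigned len;   /* number of characters in the cluster */
};

/* CLUSTERS holds one cluster value per glyph, in ascending order.
   NCHARS is the length of the shaped text in characters and POS is
   the character being looked up.  */
static struct cluster_span
find_cluster (const unsigned *clusters, size_t nglyphs,
              unsigned nchars, unsigned pos)
{
  assert (nglyphs > 0 && pos < nchars && clusters[0] <= pos);

  struct cluster_span span = { 0, clusters[0], 0 };

  /* Scan backwards for the last glyph whose cluster starts at or
     before POS, as the patch does.  */
  for (size_t i = nglyphs; i-- > 0;)
    if (clusters[i] <= pos)
      {
        span.glyph = i;
        span.start = clusters[i];
        break;
      }

  /* The cluster ends where the next larger cluster value begins,
     or at the end of the text if there is none.  */
  unsigned end = nchars;
  for (size_t i = span.glyph + 1; i < nglyphs; i++)
    if (clusters[i] > span.start)
      {
        end = clusters[i];
        break;
      }

  span.len = end - span.start;
  return span;
}
```

With the text "affix" shaped into three glyphs whose cluster values are 0, 1 and 4 (the "ffi" ligature covering characters 1 to 3), looking up character 2 yields glyph 1 with a cluster of three characters, so the character sits at offset `2 - 1 = 1` inside the ligature. When ligatures go through automatic compositions, the message above points out that this information is already reachable from the glyph string, so a lookup like this would not need to call the shaper again.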
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-20 15:18 ` Eli Zaretskii @ 2020-05-20 17:31 ` Clément Pit-Claudel 2020-05-20 18:01 ` Eli Zaretskii 2020-05-20 23:19 ` Ligatures Stefan Monnier 2020-05-21 10:01 ` Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) Pip Cet 1 sibling, 2 replies; 145+ messages in thread From: Clément Pit-Claudel @ 2020-05-20 17:31 UTC (permalink / raw) To: Eli Zaretskii, Pip Cet; +Cc: alan, emacs-devel On 20/05/2020 11.18, Eli Zaretskii wrote: > It might be the worst case if we are talking about ligatures that > involve only ASCII characters, and don't involve symbols like ==> > that gets converted to ⇒. Wouldn't ==> be converted to ⟹ instead of ⇒? But regardless, what's the issue with ⇒? ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-20 17:31 ` Clément Pit-Claudel @ 2020-05-20 18:01 ` Eli Zaretskii 2020-05-20 18:33 ` Clément Pit-Claudel 2020-05-20 23:19 ` Ligatures Stefan Monnier 1 sibling, 1 reply; 145+ messages in thread From: Eli Zaretskii @ 2020-05-20 18:01 UTC (permalink / raw) To: Clément Pit-Claudel; +Cc: alan, pipcet, emacs-devel > Cc: alan@idiocy.org, emacs-devel@gnu.org > From: Clément Pit-Claudel <cpitclaudel@gmail.com> > Date: Wed, 20 May 2020 13:31:13 -0400 > > On 20/05/2020 11.18, Eli Zaretskii wrote: > > It might be the worst case if we are talking about ligatures that > > involve only ASCII characters, and don't involve symbols like ==> > > that gets converted to ⇒. > > Wouldn't ==> be converted to ⟹ instead of ⇒? Yes, to ⟹, sorry. > But regardless, what's the issue with ⇒? The issue with ⟹ is that the stem doesn't seem to be splittable into 2 parts, whereas "==" are two characters. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-20 18:01 ` Eli Zaretskii @ 2020-05-20 18:33 ` Clément Pit-Claudel 2020-05-20 18:49 ` Eli Zaretskii 0 siblings, 1 reply; 145+ messages in thread From: Clément Pit-Claudel @ 2020-05-20 18:33 UTC (permalink / raw) To: Eli Zaretskii; +Cc: alan, pipcet, emacs-devel On 20/05/2020 14.01, Eli Zaretskii wrote: >> But regardless, what's the issue with ⇒? > > The issue with ⟹ is that the stem doesn't seem to be splittable into 2 > parts, whereas "==" are two characters. Oh, I see the worry, but I don't think it's a problem — it's a feature to split the stem into two parts :) In a monospace font, it should look obvious what's happening, since ⟹ will occupy three columns. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-20 18:33 ` Clément Pit-Claudel @ 2020-05-20 18:49 ` Eli Zaretskii 2020-05-20 18:53 ` Clément Pit-Claudel 0 siblings, 1 reply; 145+ messages in thread From: Eli Zaretskii @ 2020-05-20 18:49 UTC (permalink / raw) To: Clément Pit-Claudel; +Cc: alan, pipcet, emacs-devel > Cc: pipcet@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > From: Clément Pit-Claudel <cpitclaudel@gmail.com> > Date: Wed, 20 May 2020 14:33:24 -0400 > > On 20/05/2020 14.01, Eli Zaretskii wrote: > >> But regardless, what's the issue with ⇒? > > > > The issue with ⟹ is that the stem doesn't seem to be splittable into 2 > > parts, whereas "==" are two characters. > > Oh, I see the worry, but I don't think it's a problem — it's a feature to split the stem into two parts :) Then I guess we have very different views of what is a "feature". To me, this looks like a terrible kludge. > In a monospace font, it should look obvious what's happening, since ⟹ will occupy three columns. Here it occupies only two. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-20 18:49 ` Eli Zaretskii @ 2020-05-20 18:53 ` Clément Pit-Claudel 2020-05-20 19:02 ` Eli Zaretskii 0 siblings, 1 reply; 145+ messages in thread From: Clément Pit-Claudel @ 2020-05-20 18:53 UTC (permalink / raw) To: Eli Zaretskii; +Cc: alan, pipcet, emacs-devel On 20/05/2020 14.49, Eli Zaretskii wrote: >> Cc: pipcet@gmail.com, alan@idiocy.org, emacs-devel@gnu.org >> From: Clément Pit-Claudel <cpitclaudel@gmail.com> >> Date: Wed, 20 May 2020 14:33:24 -0400 >> >> On 20/05/2020 14.01, Eli Zaretskii wrote: >>>> But regardless, what's the issue with ⇒? >>> >>> The issue with ⟹ is that the stem doesn't seem to be splittable into 2 >>> parts, whereas "==" are two characters. >> >> Oh, I see the worry, but I don't think it's a problem — it's a feature to split the stem into two parts :) > > Then I guess we have very different views of what is a "feature". To > me, this looks like a terrible kludge. Yet, that's what everyone else is doing, so at least it's a predictable (and convenient) kludge. >> In a monospace font, it should look obvious what's happening, since ⟹ will occupy three columns. > > Here it occupies only two. Do you have a font with ligatures that composes ==> into ⟹, taking only two characters? Most of the monospace fonts on my machine show ⇒ as one character and ⟹ as two — but the ones that have ligatures changing => into ⇒ and ==> into ⟹ all respect the widths of the characters they compose, so ⇒ is two characters wide and ⟹ is three characters wide. I don't think the width of ⟹ as a non-composed character is too relevant, since we won't break it up, right? ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-20 18:53 ` Clément Pit-Claudel @ 2020-05-20 19:02 ` Eli Zaretskii 0 siblings, 0 replies; 145+ messages in thread From: Eli Zaretskii @ 2020-05-20 19:02 UTC (permalink / raw) To: Clément Pit-Claudel; +Cc: alan, pipcet, emacs-devel > Cc: pipcet@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > From: Clément Pit-Claudel <cpitclaudel@gmail.com> > Date: Wed, 20 May 2020 14:53:59 -0400 > > On 20/05/2020 14.49, Eli Zaretskii wrote: > >> Cc: pipcet@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > >> From: Clément Pit-Claudel <cpitclaudel@gmail.com> > >> Date: Wed, 20 May 2020 14:33:24 -0400 > >> > >> Oh, I see the worry, but I don't think it's a problem — it's a feature to split the stem into two parts :) > > > > Then I guess we have very different views of what is a "feature". To > > me, this looks like a terrible kludge. > > Yet, that's what everyone else is doing, so at least it's a predictable (and convenient) kludge. Since when do we in Emacs do stuff "like everyone else" and feel good about that? Anyway, this argument about personal preferences is futile. Just understand that a feature that works for some vaguely-defined use cases, but doesn't work for the rest is a misfeature in my book. > I don't think the width of ⟹ as a non-composed character is too relevant, since we won't break it up, right? My point is that you cannot rely on the width being 3 columns. It may or may not be so. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-20 17:31 ` Clément Pit-Claudel 2020-05-20 18:01 ` Eli Zaretskii @ 2020-05-20 23:19 ` Stefan Monnier 1 sibling, 0 replies; 145+ messages in thread From: Stefan Monnier @ 2020-05-20 23:19 UTC (permalink / raw) To: Clément Pit-Claudel; +Cc: Eli Zaretskii, alan, Pip Cet, emacs-devel >> It might be the worst case if we are talking about ligatures that >> involve only ASCII characters, and don't involve symbols like ==> >> that gets converted to ⇒. > Wouldn't ==> be converted to ⟹ instead of ⇒? But regardless, what's the issue with ⇒? Using `misc-fixed` here, those two above are displayed identically (as single-column char) ;-) Stefan ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-20 15:18 ` Eli Zaretskii 2020-05-20 17:31 ` Clément Pit-Claudel @ 2020-05-21 10:01 ` Pip Cet 2020-05-21 14:11 ` Eli Zaretskii 1 sibling, 1 reply; 145+ messages in thread From: Pip Cet @ 2020-05-21 10:01 UTC (permalink / raw) To: Eli Zaretskii; +Cc: cpitclaudel, alan, emacs-devel Hi, Eli, On Wed, May 20, 2020 at 3:31 PM Eli Zaretskii <eliz@gnu.org> wrote: > > From: Pip Cet <pipcet@gmail.com> > > Date: Tue, 19 May 2020 21:43:49 +0000 > > Cc: Eli Zaretskii <eliz@gnu.org>, Alan Third <alan@idiocy.org>, emacs-devel@gnu.org > > > > And I'm afraid the difference is much more obvious with box cursors > > than it is with carets. I'm attaching a screenshot of a patched Emacs > > displaying "ffi", with point on the second f, in the "Linux Libertine > > Display O" font (using approximately equal slices). > > > > I think this is a bit of a worst-case scenario, a three-letter > > ligature in a font using ligatures and overhangs very > > enthusiastically. It might be okay for other fonts. > > I'm not sure this is the worst case. It might be the worst case if we > are talking about ligatures that involve only ASCII characters, and > don't involve symbols like ==> that gets converted to ⇒. But in > general, there are worse cases, like á (two codepoints). And for > kicks see the Khmer hello in etc/HELLO, where you can find 4 > codepoints that produce a grapheme cluster made of 3 glyphs. You're correct: I'm simply not dealing with Khmer or composed characters (which are different from ligatures, of course) in the patch, and I'm not certain how to deal with them in theory, either. > If we only want this feature for ASCII ligatures, then it sounds like > a limitation to me (and frankly, somewhat unclean as features go), Not "only for ASCII ligatures", but not "any conceivable combination of codepoints into glyphs" either. Just those supported by the font and Harfbuzz. 
> but > if we really want this only for these limited cases, we will need to > somehow indicate to the display engine which ligatures are to be > handled like this and which aren't. Well, we now know that fonts can provide information about how a ligature is to be split into one-dimensional slices; I filed a pull request against Harfbuzz (since merged) that would actually make the corresponding API work, at least for the "Libertinus" font family. Of course that means that Emacs behavior would depend on the font tables in ways it currently doesn't. That's a problem. > > My remaining idea is to stretch characters so we can break up a > > ligature without changing its total width. I'm not sure how to do > > that, though. > > I don't think I understand what you'd like to do. Can you elaborate? My idea was to display "ffi" with the point on the second f by condensing an "f" glyph to cover the middle third of the "ffi" glyph. However, I might have been too critical of how well the simple solution deals with this case. > > (I'm also attaching the patch, for the morbidly curious; it isn't > > clean, readable, or finished in any way, and contains at least one > > obvious bug. It's just good enough to produce the screenshot, and > > maybe it can serve as a hint as to which files need changing for > > ligatures to work; but such changes would have to be done very > > differently from the patch.). > > Right, the actual implementation will have to be different. In > particular, I think that if ligatures will use automatic compositions, > the information you need is already stored in the composition table > and reachable from the glyph string, so you don't need to invoke the > shaper again. Well, I'm sorry to bring up a different (though somewhat related) issue, but kerning is also an issue: we need a shaper to get that right, not just a composition table, right? > I see you implemented this for static compositions, which are > semi-obsolete. 
I'm sorry, I'm afraid I don't understand. This should handle any composition the shaper does, and only those, but slices up everything horizontally by default. > Also, I don't see the code which moves point inside > the ligature; Emacs will not allow doing that by default. In > particular, how did you tell the display code to show the cursor on > the middle 'f', not on the first one? Did I miss something? I produce three "struct glyph"s for "ffi": each has width one third of the actual font glyph, and stores, in convoluted form, information about which slice of the font glyph is to be actually drawn. > And finally, you said you intended to do this via row->clip, but this > patch does something very different. What changed your mind? I was surprised this no longer seemed to be strictly necessary: as far as the display code is concerned, we're dealing with three separate glyphs with overhang areas, and those are already handled by the cursor-drawing code. Clipping is still needed: to deal with double-drawing issues, and to deal with such crimes as making part of a ligature have a different foreground color. I'm sorry it's not particularly obvious from the patch, but the approach I took yesterday is this:

1. every struct glyph has a "context", which specifies the character for the struct glyph and some surrounding text.
2. every struct glyph is converted to a slice of (currently) a single font glyph, by sending the context through the shaper and cutting out the relevant bits
3. struct glyphs are displayed one by one

Problems:

1. ligatures can cross line boundaries
2. the context has to be updated, and trigger redisplay of the struct glyph
3. clipping is necessary
4. there are N clipped drawing operations for a single glyph covering N struct glyphs.
5. corner cases can have ambiguous context: for example, a string of many "f"s would be paired into "ff" glyphs, and simply cutting off the context after a certain number of characters might result in the wrong pairing

On the other hand, it deals with kerning as well as ligatures. And other problems (right now, we call the shaper on 64 characters for every character we actually display, which makes things noticeably slow) are fixable. Overall, I'd like to think more about alternative approaches to the "context string" one before implementing anything. How would that work for kerning, in particular? ^ permalink raw reply [flat|nested] 145+ messages in thread
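The "wrong pairing" corner case described in the message above can be illustrated with a toy model. This is only a sketch under stated assumptions: it is not taken from the patch, both function names are hypothetical, and it assumes a shaper that greedily pairs a run of "f"s into "ff" glyphs from the left.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model (hypothetical, not Emacs code): in a run of 'f's that
   starts at index RUN_START, a shaper that greedily forms "ff" pairs
   from the left makes the character at POS the first half of a pair
   iff its distance from the start of the run is even.  */
bool
first_half_of_pair (int run_start, int pos)
{
  assert (run_start <= pos);
  return (pos - run_start) % 2 == 0;
}

/* The same question, but the shaper only sees MAX_LOOKBACK characters
   before POS, as it would with a fixed-size context string.  The run
   then appears to start where the visible context starts.  */
bool
first_half_with_truncated_context (int run_start, int pos, int max_lookback)
{
  int visible_start = pos - max_lookback;
  if (visible_start < run_start)
    visible_start = run_start;
  return first_half_of_pair (visible_start, pos);
}
```

With the run starting at index 0, the character at index 5 is the second half of a pair. A look-back of 32 characters agrees, but a look-back of only 2 characters makes the visible run start at index 3 and reports the opposite, which is the ambiguity in question.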
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-21 10:01 ` Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) Pip Cet @ 2020-05-21 14:11 ` Eli Zaretskii 2020-05-21 16:26 ` Pip Cet 0 siblings, 1 reply; 145+ messages in thread From: Eli Zaretskii @ 2020-05-21 14:11 UTC (permalink / raw) To: Pip Cet; +Cc: cpitclaudel, alan, emacs-devel > From: Pip Cet <pipcet@gmail.com> > Date: Thu, 21 May 2020 10:01:03 +0000 > Cc: cpitclaudel@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > > > If we only want this feature for ASCII ligatures, then it sounds like > > a limitation to me (and frankly, somewhat unclean as features go), > > Not "only for ASCII ligatures", but not "any conceivable combination > of codepoints into glyphs" either. Just those supported by the font > and Harfbuzz. > > > but > > if we really want this only for these limited cases, we will need to > > somehow indicate to the display engine which ligatures are to be > > handled like this and which aren't. > > Well, we now know that fonts can provide information about how a > ligature is to be split into one-dimensional slices; The question is: do we want to show those carets for all the character compositions, even if the information is provided? If not, we will have to indicate somehow whether they should or shouldn't be shown for each particular grapheme cluster. > Of course that means that Emacs behavior would depend on the font > tables in ways it currently doesn't. That's a problem. It isn't a problem to depend on that if most fonts provide this information. Then we could simply say this is not supported when the information is not in the font. But if many fonts that support ligatures don't provide this information, we will need to have some fallback, like assume that every codepoint has the same share of the ligature's width. 
The fact that other applications use a simplistic heuristic and not the information in the fonts suggests that either the information is not readily available or there are some other problems with using it. > > Right, the actual implementation will have to be different. In > > particular, I think that if ligatures will use automatic compositions, > > the information you need is already stored in the composition table > > and reachable from the glyph string, so you don't need to invoke the > > shaper again. > > Well, I'm sorry to bring up a different (though somewhat related > issue), but kerning is also an issue: we need a shaper to get that > right, not just a composition table, right? Automatic compositions already use the shaper, see autocmp_chars. > > I see you implemented this for static compositions, which are > > semi-obsolete. > > I'm sorry, I'm afraid I don't understand. This should handle any > composition the shaper does, and only those, but slices up everything > horizontally by default. I'm talking about the changes in gui_produce_glyphs. Its high-level structure is basically

  if (it->what == IT_CHARACTER)
    {
      ... /* handles character glyphs */
    }
  else if (it->what == IT_COMPOSITION && it->cmp_it.ch < 0)
    {
      ... /* A static composition.  */
    }
  else if (it->what == IT_COMPOSITION)
    {
      /* A dynamic (automatic) composition.  */
    }
  [...]

You made changes only in the "static compositions" part. That code handles compositions created by compose-region. The "modern" way of composing text in Emacs uses automatic compositions, which are controlled by data in composition-function-table. This is where we call the shaping engine to produce the glyphs according to rules stored in the font. I don't see in your patch any changes that affect ligatures created by automatic compositions; did I miss something? If you use the automatic compositions route, then the information you need, i.e.
the number of clusters in the shaped text and the overall width of the ligature, is already produced by the shaper and stored in the "gstring" object in the composition table, see the description of that object in the doc string of composition-get-gstring. So there should be no need to invoke the shaper inside gui_produce_glyphs and elsewhere. (If we want to use the carets information from the font, we will probably need to extend the gstring object to store that as well, and extend the shape method to extract this information when available.) > > Also, I don't see the code which moves point inside > > the ligature; Emacs will not allow doing that by default. In > > particular, how did you tell the display code to show the cursor on > > the middle 'f', not on the first one? Did I miss something? > > I produce three "struct glyph"s for "ffi": each has width one third of > the actual font glyph, and stores, in convoluted form, information > about which slice of the font glyph is to be actually drawn. Ah, okay, I missed that. But producing 3 glyphs instead of just one is not necessarily the best idea, I think. As you point out, one problem will be with splitting the ligature across lines. Another problem is more expensive display. And we won't be able to display the ligature as a single glyph, for those who want that, at least not easily. > > And finally, you said you intended to do this via row->clip, but this > > patch does something very different. What changed your mind? > > I was surprised this no longer seemed to be strictly necessary: as far > as the display code is concerned, we're dealing with three separate > glyphs with overhang areas, and those are already handled by the > cursor-drawing code. Yes. But if we return to a single glyph, then we'd need to do some clipping. > On the other hand, it deals with kerning as well as ligatures. You mean, kerning of simple characters, for which we don't produce ligatures? Or kerning within ligatures? 
If the latter, then I don't see why we'd need that: font designers already design the ligatures to have the optimal kerning, no? ^ permalink raw reply [flat|nested] 145+ messages in thread
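The fallback suggested in the message above (use the font's caret information when it exists, otherwise give every codepoint the same share of the ligature's width) could look roughly like the following. This is a minimal sketch, not Emacs or HarfBuzz code: `ligature_carets` and all of its parameters are hypothetical, and how the font's caret offsets are obtained is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not an Emacs or HarfBuzz function.  Writes the
   NCHARS - 1 caret offsets (in pixels from the ligature's left edge)
   into OUT.  FONT_CARETS holds N_FONT_CARETS offsets read from the
   font, or is NULL when the font provides none.  */
void
ligature_carets (int width, int nchars,
                 const int *font_carets, int n_font_carets, int *out)
{
  assert (width >= 0 && nchars >= 1);

  if (font_carets != NULL && n_font_carets == nchars - 1)
    {
      /* The font says where the carets go: trust it.  */
      for (int i = 0; i < nchars - 1; i++)
        out[i] = font_carets[i];
    }
  else
    {
      /* Fallback: every character gets an equal share of the width.  */
      for (int i = 1; i < nchars; i++)
        out[i - 1] = width * i / nchars;
    }
}
```

For a 30-pixel "ffi" ligature without font data this yields carets at 10 and 20 pixels. Whether a caret count that does not match the character count should fall back to equal shares, as assumed here, or disable caret display altogether is a design choice the thread leaves open.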
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-21 14:11 ` Eli Zaretskii @ 2020-05-21 16:26 ` Pip Cet 2020-05-21 19:08 ` Eli Zaretskii 0 siblings, 1 reply; 145+ messages in thread From: Pip Cet @ 2020-05-21 16:26 UTC (permalink / raw) To: Eli Zaretskii; +Cc: cpitclaudel, alan, emacs-devel On Thu, May 21, 2020 at 2:11 PM Eli Zaretskii <eliz@gnu.org> wrote: > > From: Pip Cet <pipcet@gmail.com> > > Date: Thu, 21 May 2020 10:01:03 +0000 > > Cc: cpitclaudel@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > > > but > > > if we really want this only for these limited cases, we will need to > > > somehow indicate to the display engine which ligatures are to be > > > handled like this and which aren't. > > > > Well, we now know that fonts can provide information about how a > > ligature is to be split into one-dimensional slices; > > The question is: do we want to show those carets for all the character > compositions, even if the information is provided? If not, we will > have to indicate somehow whether they should or shouldn't be shown for > each particular grapheme cluster. Oh. I hadn't thought about fonts providing such caret information in cases where they shouldn't, but of course that's a valid concern. > > Of course that means that Emacs behavior would depend on the font > > tables in ways it currently doesn't. That's a problem. > > It isn't a problem to depend on that if most fonts provide this > information. > Then we could simply say this is not supported when the > information is not in the font. I'm not sure how simple that would be: we could treat ligatures without carets as atomic, or we could tell harfbuzz not to apply ligatures without carets, or maybe make that decision depend on whether the ligature is required or discretionary... 
> But if many fonts that support > ligatures don't provide this information, we will need to have some > fallback, like assume that every codepoint has the same share of the > ligature's width. the fact that other applications use a simplistic > heuristic and not the information in the fonts suggests that either > the information is not readily available or there are some other > problems with using it. Correct, it does. I'm not sure which one is the case. > > > Right, the actual implementation will have to be different. In > > > particular, I think that if ligatures will use automatic compositions, > > > the information you need is already stored in the composition table > > > and reachable from the glyph string, so you don't need to invoke the > > > shaper again. > > > > Well, I'm sorry to bring up a different (though somewhat related > > issue), but kerning is also an issue: we need a shaper to get that > > right, not just a composition table, right? > > Automatic compositions already use the shaper, see autocmp_chars. I'm not sure I understand how kerning would work using automatic compositions. > > > I see you implemented this for static compositions, which are > > > semi-obsolete. > > > > I'm sorry, I'm afraid I don't understand. This should handle any > > composition the shaper does, and only those, but slices up everything > > horizontally by default. > > I'm talking about the changes in gui_produce_glyphs. Its high-level > structure is basically > > if (it->what == IT_CHARACTER) > { > ... /* handles character glyphs */ > } > else if (it->what == IT_COMPOSITION && it->cmp_it.ch < 0) > { > ... /* A static compositions. */ > } > else if (it->what == IT_COMPOSITION) > { > /* A dynamic (automatic) composition. */ > } > [...] > > You made changes only in the "static compositions" part. No. I didn't touch the "static compositions" part at all, except for passing an extra NULL pointer to an API I'd extended. 
(At least, that's what I intended, for all the changes to be in the IT_CHARACTER part). > That code > handles compositions created by compose-region. The "modern" way of > composing text in Emacs uses automatic compositions, which are > controlled by data in composition-function-table. This is where we > call the shaping engine to produce the glyphs according to rules > stored in the font. I don't see in your patch any changes that affect > ligatures created by automatic compositions; did I miss something? I don't think so; I went for a third route, that of leaving all compositions handling to the shaper and doing none of it in Emacs itself. > If you use the automatic compositions route, then the information you > need, i.e. the number of clusters in the shaped text and the overall > width of the ligature, is already produced by the shaper and stored in > the "gstring" object in the composition table, see the description of > that object in the doc string of composition-get-gstring. So there > should be no need to invoke the shaper inside gui_produce_glyphs and > elsewhere. (If we want to use the carets information from the font, > we will probably need to extend the gstring object to store that as > well, and extend the shape method to extract this information when > available.) Yes, and that seemed too complicated for me for something that I thought wouldn't handle kerning anyway... > > > Also, I don't see the code which moves point inside > > > the ligature; Emacs will not allow doing that by default. In > > > particular, how did you tell the display code to show the cursor on > > > the middle 'f', not on the first one? Did I miss something? > > > > I produce three "struct glyph"s for "ffi": each has width one third of > > the actual font glyph, and stores, in convoluted form, information > > about which slice of the font glyph is to be actually drawn. > > Ah, okay, I missed that. 
But producing 3 glyphs instead of just one > is not necessarily the best idea, I think. I agree! I'd be happy to hear better ideas, and I think for now "use fixed-width fonts" is a better idea... > As you point out, one > problem will be with splitting the ligature across lines. Another > problem is more expensive display. You mean the actual "copy the glyph bitmap to the glass" display? Because I don't think that's relevant. Overall redisplay() time really goes up calling the shaper on 32 characters for every character displayed, though, so that's a concern I agree with. > And we won't be able to display > the ligature as a single glyph, for those who want that, at least not > easily. But that's what they can do now, with the IT_COMPOSITION case, right? Because I did not touch that code so I didn't expect that to break (famous last words). > > > And finally, you said you intended to do this via row->clip, but this > > > patch does something very different. What changed your mind? > > > > I was surprised this no longer seemed to be strictly necessary: as far > > as the display code is concerned, we're dealing with three separate > > glyphs with overhang areas, and those are already handled by the > > cursor-drawing code. > > Yes. But if we return to a single glyph, then we'd need to do some > clipping. As I said, we need to do the clipping to render antialiased pixels properly. It's just two lines of code in ftcrfont_draw: cairo_rectangle (cr, x, y - FONT_BASE (face->font), s->width, FONT_HEIGHT (face->font)); cairo_clip (cr); > > On the other hand, it deals with kerning as well as ligatures. > > You mean, kerning of simple characters, for which we don't produce > ligatures? Yes, that's what I mean. > Or kerning within ligatures? If the latter, then I don't > see why we'd need that: font designers already design the ligatures to > have the optimal kerning, no? It's certainly not our job to fix that if they don't! 
Perhaps I can digress a little and describe what I think the interaction with the shaper should be like:

Emacs: I'd like to display codepoint 'f'
Harfbuzz: you'll have to tell me the codepoint before that
Emacs: 'f'
Harfbuzz: and the one after those two
Emacs: 'i'
Harfbuzz: and the one before all of those
Emacs: That's too expensive for me to compute / it's the beginning of paragraph / a bidi boundary / an object without an assigned codepoint / ...
Harfbuzz: okay, display it as the middle slice of the "ffi" glyph

I.e., I'd like Harfbuzz to be asynchronous, and request more information, parsimoniously, about the context of the codepoint we're describing, rather than working in one go from "complete" information to an indefinitely-long line of glyphs. And deal well with us deciding it's too expensive to perform that much look-back/look-ahead. (Because in real life, ligatures depend on knowing some amount of the context, but not all of it, or people could never start writing.) Of course, all this doesn't change that the "struct it" design is somewhat difficult to extend to handling look-ahead: it's easy enough to create a copy of the iterator and advance that while leaving the actual iterator intact, but it's also really slow. In fact I suspect the best way would be to make struct it a heap-allocated pseudovector (not necessarily one ordinarily garbage-collected, though), and cache "future" iterator states once we compute them. You're correct when you say that some major redesign is needed in this area, but I don't think that's the subject of the current discussion. ^ permalink raw reply [flat|nested] 145+ messages in thread
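The per-slice clipping mentioned in the message above needs one clip rectangle per "struct glyph" slice. A sketch of how the horizontal extents might be computed so that the slices tile the font glyph with no gaps or overlaps (which matters for the antialiasing and double-drawing concerns raised in the thread) follows. It is an illustration only: `struct slice_clip` and `slice_clip_rect` are hypothetical names, and equal-width slicing is assumed.

```c
#include <assert.h>

/* Hypothetical type: horizontal extent of one slice's clip rectangle.  */
struct slice_clip
{
  int x;      /* left edge, in the same coordinates as GLYPH_X */
  int width;  /* width of the slice in pixels */
};

/* Return the clip extent of slice I (0-based) out of NSLICES
   equal-width slices of a font glyph drawn at GLYPH_X with width
   GLYPH_WIDTH.  Edges are computed from the shared boundaries, so the
   slice widths always add up to GLYPH_WIDTH even when it is not a
   multiple of NSLICES.  */
struct slice_clip
slice_clip_rect (int glyph_x, int glyph_width, int nslices, int i)
{
  assert (glyph_width >= 0 && nslices >= 1 && i >= 0 && i < nslices);
  int left = glyph_width * i / nslices;
  int right = glyph_width * (i + 1) / nslices;
  struct slice_clip clip = { glyph_x + left, right - left };
  return clip;
}
```

The resulting x and width are the kind of values that would be passed as the horizontal arguments of the cairo_rectangle call quoted above; for a 10-pixel glyph cut into three slices, the last slice absorbs the rounding remainder.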
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-21 16:26 ` Pip Cet @ 2020-05-21 19:08 ` Eli Zaretskii 2020-05-21 20:51 ` Clément Pit-Claudel 2020-05-21 21:06 ` Pip Cet 0 siblings, 2 replies; 145+ messages in thread From: Eli Zaretskii @ 2020-05-21 19:08 UTC (permalink / raw) To: Pip Cet; +Cc: cpitclaudel, alan, emacs-devel > From: Pip Cet <pipcet@gmail.com> > Date: Thu, 21 May 2020 16:26:13 +0000 > Cc: cpitclaudel@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > > On Thu, May 21, 2020 at 2:11 PM Eli Zaretskii <eliz@gnu.org> wrote: > > > From: Pip Cet <pipcet@gmail.com> > > > Date: Thu, 21 May 2020 10:01:03 +0000 > > > Cc: cpitclaudel@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > > > > but > > > > if we really want this only for these limited cases, we will need to > > > > somehow indicate to the display engine which ligatures are to be > > > > handled like this and which aren't. > > > > > > Well, we now know that fonts can provide information about how a > > > ligature is to be split into one-dimensional slices; > > > > The question is: do we want to show those carets for all the character > > compositions, even if the information is provided? If not, we will > > have to indicate somehow whether they should or shouldn't be shown for > > each particular grapheme cluster. > > Oh. I hadn't thought about fonts providing such caret information in > cases where they shouldn't, but of course that's a valid concern. > > > > Of course that means that Emacs behavior would depend on the font > > > tables in ways it currently doesn't. That's a problem. > > > > It isn't a problem to depend on that if most fonts provide this > > information. > > > Then we could simply say this is not supported when the > > information is not in the font. 
> > I'm not sure how simple that would be: we could treat ligatures > without carets as atomic, or we could tell harfbuzz not to apply > ligatures without carets, or maybe make that decision depend on > whether the ligature is required or discretionary... > > > But if many fonts that support > > ligatures don't provide this information, we will need to have some > > fallback, like assume that every codepoint has the same share of the > > ligature's width. the fact that other applications use a simplistic > > heuristic and not the information in the fonts suggests that either > > the information is not readily available or there are some other > > problems with using it. > > Correct, it does. I'm not sure which one is the case. > > > > > Right, the actual implementation will have to be different. In > > > > particular, I think that if ligatures will use automatic compositions, > > > > the information you need is already stored in the composition table > > > > and reachable from the glyph string, so you don't need to invoke the > > > > shaper again. > > > > > > Well, I'm sorry to bring up a different (though somewhat related > > > issue), but kerning is also an issue: we need a shaper to get that > > > right, not just a composition table, right? > > > > Automatic compositions already use the shaper, see autocmp_chars. > > I'm not sure I understand how kerning would work using automatic compositions. > > > > > I see you implemented this for static compositions, which are > > > > semi-obsolete. > > > > > > I'm sorry, I'm afraid I don't understand. This should handle any > > > composition the shaper does, and only those, but slices up everything > > > horizontally by default. > > > > I'm talking about the changes in gui_produce_glyphs. Its high-level > > structure is basically > > > > if (it->what == IT_CHARACTER) > > { > > ... /* handles character glyphs */ > > } > > else if (it->what == IT_COMPOSITION && it->cmp_it.ch < 0) > > { > > ... /* A static compositions. 
*/ > > } > > else if (it->what == IT_COMPOSITION) > > { > > /* A dynamic (automatic) composition. */ > > } > > [...] > > > > You made changes only in the "static compositions" part. > > No. I didn't touch the "static compositions" part at all, except for > passing an extra NULL pointer to an API I'd extended. (At least, > that's what I intended, for all the changes to be in the IT_CHARACTER > part). I mean this part: @@ -30433,8 +30483,9 @@ gui_produce_glyphs (struct it *it) else { get_char_face_and_encoding (it->f, ch, face_id, - &char2b, false); - pcm = get_per_char_metric (font, &char2b); + &char2b, false, + make_context (it)); + pcm = get_per_char_metric (font, &char2b, make_context (it)); } This calls make_context and passes it to these functions. This code handles static compositions only. > > The "modern" way of composing text in Emacs uses automatic > > compositions, which are controlled by data in > > composition-function-table. This is where we call the shaping > > engine to produce the glyphs according to rules stored in the > > font. I don't see in your patch any changes that affect ligatures > > created by automatic compositions; did I miss something? > > I don't think so; I went for a third route, that of leaving all > compositions handling to the shaper and doing none of it in Emacs > itself. But automatic compositions do work by calling the shaper. > Perhaps I can digress a little and describe what I think the > interaction with the shaper should be like: > > Emacs: I'd like to display codepoint 'f' > Harfbuzz: you'll have to tell me the codepoint before that > Emacs: 'f' > Harfbuzz: and the one after those two > Emacs: 'i' > Harfbuzz: and the one before all of those > Emacs: That's too expensive for me to compute / it's the beginning of > paragraph / a bidi boundary / an object without an assigned codepoint > / ... 
> Harfbuzz: okay, display it as the middle slice of the "ffi" glyph > > I.e., I'd like Harfbuzz to be asynchronous, and request more > information, parsimoniously, about the context of the codepoint we're > describing, rather than working in one go from "complete" information > to an indefinitely-long line of glyphs. And deal well with us deciding > it's too expensive to perform that much look-back/look-ahead. (Because > in real life, ligatures depend on knowing some amount of the context, > but not all of it, or people could never start writing.) That would prevent Emacs from controlling what is and what isn't composed, leaving the shaper in charge. We currently allow Lisp to control that via composition-function-table, which provides a regexp that text around a character must match in order for the matching substring to be passed to the shaper. We never call the shaper unless composition-function-table tells us to do so. I'm not sure I understand what problems you see with this design. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-21 19:08 ` Eli Zaretskii @ 2020-05-21 20:51 ` Clément Pit-Claudel 2020-05-21 21:16 ` Pip Cet 2020-05-22 11:44 ` Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) Eli Zaretskii 2020-05-21 21:06 ` Pip Cet 1 sibling, 2 replies; 145+ messages in thread From: Clément Pit-Claudel @ 2020-05-21 20:51 UTC (permalink / raw) To: Eli Zaretskii, Pip Cet; +Cc: alan, emacs-devel On 21/05/2020 15.08, Eli Zaretskii wrote: > That would prevent Emacs from controlling what is and what isn't > composed, leaving the shaper in charge. We currently allow Lisp to > control that via composition-function-table, which provides a regexp > that text around a character must match in order for the matching > substring to be passed to the shaper. We never call the shaper unless > composition-function-table tells us to do so. Does this mean that for each font we need to re-encode the font's logic for deciding whether to use a ligature? Some concrete examples: in Iosevka (*, (**, (***, (**** etc are all displayed with the * character vertically centered relative to the (, but a lone * is not centered. In Fira Code, punctuation is context-aware, so the "+" in "A + B" is not the same as the "+" in "a + b". In both of these faces, arrows can be of any length, and in Fira Code you can even mix and match them (see https://raw.githubusercontent.com/tonsky/FiraCode/master/extras/arrows.png). The documentation of Fira Code does recommend composition-function-table here: https://github.com/tonsky/FiraCode/wiki/Emacs-instructions, but it seems like a lot of extra work for each font, isn't it? ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-21 20:51 ` Clément Pit-Claudel @ 2020-05-21 21:16 ` Pip Cet 2020-05-22 6:12 ` Eli Zaretskii 2020-05-22 11:44 ` Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) Eli Zaretskii 1 sibling, 1 reply; 145+ messages in thread From: Pip Cet @ 2020-05-21 21:16 UTC (permalink / raw) To: Clément Pit-Claudel; +Cc: Eli Zaretskii, alan, emacs-devel On Thu, May 21, 2020 at 8:51 PM Clément Pit-Claudel <cpitclaudel@gmail.com> wrote: > On 21/05/2020 15.08, Eli Zaretskii wrote: > > That would prevent Emacs from controlling what is and what isn't > > composed, leaving the shaper in charge. We currently allow Lisp to > > control that via composition-function-table, which provides a regexp > > that text around a character must match in order for the matching > > substring to be passed to the shaper. We never call the shaper unless > > composition-function-table tells us to do so. > > Does this mean that for each font we need to re-encode the font's logic for deciding whether to use a ligature? I think

  (set-char-table-range composition-function-table t
                        '([".+" 0 font-shape-gstring]))

should work, but it has weird side effects that I'm pretty sure aren't intended (paren highlighting is broken, for example). Is that supposed to happen? ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-21 21:16 ` Pip Cet @ 2020-05-22 6:12 ` Eli Zaretskii 2020-05-22 9:25 ` Pip Cet 0 siblings, 1 reply; 145+ messages in thread From: Eli Zaretskii @ 2020-05-22 6:12 UTC (permalink / raw) To: Pip Cet; +Cc: cpitclaudel, alan, emacs-devel > From: Pip Cet <pipcet@gmail.com> > Date: Thu, 21 May 2020 21:16:44 +0000 > Cc: Eli Zaretskii <eliz@gnu.org>, alan@idiocy.org, emacs-devel@gnu.org > > (set-char-table-range composition-function-table t '([".+" 0 > font-shape-gstring])) > > should work, but it has weird side effects that I'm pretty sure aren't > intended (paren highlighting is broken, for example). This is not the right way. The right way is to do the likes of the following: (set-char-table-range composition-function-table '(?f . ?f) (list (vector "ffi" 0 'compose-gstring-for-graphic))) This shows how to do this only for the "ffi" ligature, but I think it makes the idea clear. Tassilo posted here some code he wrote that supports more (and different) ligatures which are supposed to be used like prettify-symbols-mode. The idea is to populate composition-function-table only for characters that should trigger ligation. Whether to use compose-gstring-for-graphic or font-shape-gstring depends on what you want to happen when the font doesn't have a glyph for a certain ligature: the latter will then cause the characters to be displayed as usual, as separate characters; the former will display them as a single display element, a kind of "fake ligature". ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-22 6:12 ` Eli Zaretskii @ 2020-05-22 9:25 ` Pip Cet 2020-05-22 11:23 ` Eli Zaretskii 0 siblings, 1 reply; 145+ messages in thread From: Pip Cet @ 2020-05-22 9:25 UTC (permalink / raw) To: Eli Zaretskii; +Cc: cpitclaudel, alan, emacs-devel On Fri, May 22, 2020 at 6:12 AM Eli Zaretskii <eliz@gnu.org> wrote: > > From: Pip Cet <pipcet@gmail.com> > > Date: Thu, 21 May 2020 21:16:44 +0000 > > Cc: Eli Zaretskii <eliz@gnu.org>, alan@idiocy.org, emacs-devel@gnu.org > > > > (set-char-table-range composition-function-table t '([".+" 0 > > font-shape-gstring])) > > > > should work, but it has weird side effects that I'm pretty sure aren't > > intended (paren highlighting is broken, for example). > > This is not the right way. What is the right way, then? I want all ligatures my font supports. Also, even if it is the wrong thing to do, why does it break seemingly unrelated things? > The right way is to do the likes of the > following: > > (set-char-table-range > composition-function-table '(?f . ?f) > (list (vector "ffi" 0 'compose-gstring-for-graphic))) > This shows how to do this only for the "ffi" ligature, but I think it > makes the idea clear. I'm afraid it doesn't, to me. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-22 9:25 ` Pip Cet @ 2020-05-22 11:23 ` Eli Zaretskii 2020-05-22 12:52 ` Pip Cet 0 siblings, 1 reply; 145+ messages in thread From: Eli Zaretskii @ 2020-05-22 11:23 UTC (permalink / raw) To: Pip Cet; +Cc: cpitclaudel, alan, emacs-devel > From: Pip Cet <pipcet@gmail.com> > Date: Fri, 22 May 2020 09:25:31 +0000 > Cc: cpitclaudel@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > > > > should work, but it has weird side effects that I'm pretty sure aren't > > > intended (paren highlighting is broken, for example). > > > > This is not the right way. > > What is the right way, then? I want all ligatures my font supports. You can request all the ligatures that _can_ be supported; those which aren't available in the font you use will not be ligated (if you use font-shape-gstring in the composition-function-table slot). Or you can request only those ligatures that make sense for the particular use case. For example, when displaying program source code you'd probably want the various symbols, like -> etc., to produce ligatures, but you most probably won't want "ffi" in a variable name to produce a ligature. Or you can provide your own function to use in the composition-function-table, and that function can do more complex stuff, like refuse to ligate under some complicated conditions. Therefore, I think letting Lisp programs (and thus users) control what gets composed into ligatures and what doesn't is an important feature to have. We should develop it more, because currently it lacks some features we'd need for better ligature support (see the TODO item about that), but I think the basic design is valid. At least I didn't yet see any evidence that it isn't valid; perhaps when we develop it more and/or start using it more, we will find some problems, but I don't see them yet. > Also, even if it is the wrong thing to do, why does it break seemingly > unrelated things? 
I don't know. Can you show how to reproduce that in the current codebase on master? Then I'll look into it. > > (set-char-table-range > > composition-function-table '(?f . ?f) > > (list (vector "ffi" 0 'compose-gstring-for-graphic))) > > > This shows how to do this only for the "ffi" ligature, but I think it > > makes the idea clear. > > I'm afraid it doesn't, to me. Doesn't make the idea clear or doesn't produce the ligature? If the latter, then I'm puzzled, because it did work for me with a font that has the ffi ligature. If the former, please ask more questions and I will try to explain as best I can. ^ permalink raw reply [flat|nested] 145+ messages in thread
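One way to read the "only those ligatures that make sense for the particular use case" option above is a per-buffer rule set. The following is a minimal, untested sketch and not code posted in this thread: the name my-prog-ligatures-setup is hypothetical, the choice of "->" and "=>" is arbitrary, and it assumes both that auto-composition-mode is enabled and that redisplay honors a buffer-local value of composition-function-table.

```elisp
;; Hypothetical helper: enable two symbol ligatures in programming buffers only.
(defun my-prog-ligatures-setup ()
  "Install a buffer-local composition table with a few symbol rules (a sketch)."
  (let ((table (make-char-table nil)))
    ;; Inherit every rule already present in the global table.
    (set-char-table-parent table composition-function-table)
    ;; Each rule is [PATTERN PREV-CHARS FUNC]; the table is keyed on the
    ;; first character of the sequence.
    (set-char-table-range table ?- (list (vector "->" 0 'font-shape-gstring)))
    (set-char-table-range table ?= (list (vector "=>" 0 'font-shape-gstring)))
    ;; Assumption: a buffer-local binding is what redisplay consults here.
    (setq-local composition-function-table table)))

(add-hook 'prog-mode-hook #'my-prog-ligatures-setup)
```

Because the rules use font-shape-gstring, a font without these ligatures should, per the message above, simply show the characters as usual. No rule for "ffi" is installed, so identifiers are left alone.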
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-22 11:23 ` Eli Zaretskii @ 2020-05-22 12:52 ` Pip Cet 2020-05-22 13:15 ` Eli Zaretskii 0 siblings, 1 reply; 145+ messages in thread From: Pip Cet @ 2020-05-22 12:52 UTC (permalink / raw) To: Eli Zaretskii; +Cc: cpitclaudel, alan, emacs-devel On Fri, May 22, 2020 at 11:23 AM Eli Zaretskii <eliz@gnu.org> wrote: > > From: Pip Cet <pipcet@gmail.com> > > Date: Fri, 22 May 2020 09:25:31 +0000 > > Cc: cpitclaudel@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > > > > > > should work, but it has weird side effects that I'm pretty sure aren't > > > > intended (paren highlighting is broken, for example). > > > > > > This is not the right way. > > > > What is the right way, then? I want all ligatures my font supports. > > You can request all the ligatures that _can_ be supported; How do I do that? Opentype fonts can support arbitrary ligatures, such as "Zapfino" being a seven-letter ligature. > those which > aren't available in the font you use will not be ligated (if you use > font-shape-gstring in the composition-function-table slot). > Or you can request only those ligatures that make sense for the > particular use case. My use case is English text, and all ligatures supported by the font make sense for that. > For example, when displaying program source code > you'd probably want the various symbols, like -> etc., to produce > ligatures, but you most probably won't want "ffi" in a variable name > to produce a ligature. Why not? > Or you can provide your own function to use in the > composition-function-table, and that function can do more complex > stuff, like refuse to ligate under some complicated conditions. If that kind of thing turns out to be necessary, we can find ways of doing it, such as setting a text property with harfbuzz feature strings to be applied when rendering. 
> Therefore, I think letting Lisp programs (and thus users) control what > gets composed into ligatures and what doesn't is an important feature > to have. Okay, I can accept that requirement. But it should be possible to get "all ligatures", rather than a finite set you know about in advance. > We should develop it more, because currently it lacks some > features we'd need for better ligature support (see the TODO item > about that), but I think the basic design is valid. The TODO item is confusing and, I believe, confused. "For the list of typographical ligatures, see https://en.wikipedia.org/wiki/Orthographic_ligature#Ligatures_in_Unicode_(Latin_alphabets)" That's very wrong: typographical ligatures generally aren't assigned Unicode codepoints; those that have them usually do so for historical reasons. There's no finite "the" list of typographical ligatures, it's up to each font to define glyphs covering codepoint clusters as it sees fit. I disagree with pretty much every statement in the rest of the TODO item. > At least I didn't yet see any evidence that it isn't valid; But how do I make it work? For English/Western text with ligatures that I don't know about in advance? Please treat this as a dumb end-user question. What lines of Lisp do I enter to get all the ligatures my font supports, most of which do not have individual Unicode codepoints? > > Also, even if it is the wrong thing to do, why does it break seemingly > > unrelated things? > > I don't know. Can you show how to reproduce that in the current > codebase on master? Then I'll look into it. bug#41454 > > > (set-char-table-range > > > composition-function-table '(?f . ?f) > > > (list (vector "ffi" 0 'compose-gstring-for-graphic))) > > > > > This shows how to do this only for the "ffi" ligature, but I think it > > > makes the idea clear. > > > > I'm afraid it doesn't, to me. > > Doesn't make the idea clear or doesn't produce the ligature? 
It doesn't make the idea clear, because I simply see no practical way we're going to know about the ligatures the font provides in advance. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-22 12:52 ` Pip Cet @ 2020-05-22 13:15 ` Eli Zaretskii 2020-05-22 13:29 ` Clément Pit-Claudel 2020-05-22 13:56 ` Pip Cet 0 siblings, 2 replies; 145+ messages in thread From: Eli Zaretskii @ 2020-05-22 13:15 UTC (permalink / raw) To: Pip Cet; +Cc: cpitclaudel, alan, emacs-devel > From: Pip Cet <pipcet@gmail.com> > Date: Fri, 22 May 2020 12:52:41 +0000 > Cc: cpitclaudel@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > > > You can request all the ligatures that _can_ be supported; > > How do I do that? Opentype fonts can support arbitrary ligatures, such > as "Zapfino" being a seven-letter ligature. I thought the set of all the ligatures is known, and guided by typography experts. Do font designers really support ligatures from any arbitrary combination of characters? If so, where can I read about this? > > Or you can request only those ligatures that make sense for the > > particular use case. > > My use case is English text, and all ligatures supported by the font > make sense for that. Which ones are those? Is there an exhaustive list of such ligatures somewhere? > > For example, when displaying program source code > > you'd probably want the various symbols, like -> etc., to produce > > ligatures, but you most probably won't want "ffi" in a variable name > > to produce a ligature. > > Why not? It makes no sense to me. Why ligate them in that use case? Program source code isn't supposed to behave like typeset human-readable text. > Okay, I can accept that requirement. But it should be possible to get > "all ligatures", rather than a finite set you know about in advance. Let's first reach an understanding of what "all ligatures" actually means. I thought the full list of all ligatures is known in advance and quite small, but maybe this is wrong, see above. 
> "For the list of typographical ligatures, see > > https://en.wikipedia.org/wiki/Orthographic_ligature#Ligatures_in_Unicode_(Latin_alphabets)" > > That's very wrong: typographical ligatures generally aren't assigned > Unicode codepoints; those that have them usually do so for historical > reasons. Indeed, ligatures don't have to have Unicode codepoints, only some of them are precomposed. Emacs doesn't need them to have codepoints when we use auto-composition-mode. The reference is there only to show the list of ligatures, and I believe the list is full regardless of the codepoint issue. Can you point me to a larger list of ligatures made out of ASCII letters? > There's no finite "the" list of typographical ligatures, it's up to > each font to define glyphs covering codepoint clusters as it sees > fit. Really? Any reference for this? > > At least I didn't yet see any evidence that it isn't valid; > > But how do I make it work? For English/Western text with ligatures > that I don't know about in advance? Please treat this as a dumb > end-user question. What lines of Lisp do I enter to get all the > ligatures my font supports, most of which do not have individual > Unicode codepoints? You tell Emacs that a given series of characters should be composed, via composition-function-table, and the shaper then does the job of providing the font glyphs for displaying that sequence. But I don't think we should continue with these details before we have a clear idea of whether the list of possible ligatures is really infinite, as you seem to imply. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-22 13:15 ` Eli Zaretskii @ 2020-05-22 13:29 ` Clément Pit-Claudel 2020-05-22 14:30 ` Eli Zaretskii 2020-05-22 13:56 ` Pip Cet 1 sibling, 1 reply; 145+ messages in thread From: Clément Pit-Claudel @ 2020-05-22 13:29 UTC (permalink / raw) To: Eli Zaretskii, Pip Cet; +Cc: alan, emacs-devel On 22/05/2020 09.15, Eli Zaretskii wrote: > I thought the set of all the ligatures is known, and guided by > typography experts. I don't think so, at least not for programming fonts? > Do font designers really support ligatures from > any arbitrary combination of characters? If so, where can I read > about this? Yes; that's what I was alluding to in my example with comment signs and arrows. I think the pictures on https://github.com/tonsky/FiraCode should be illuminating. I hope I'm not misunderstanding your question :/ ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-22 13:29 ` Clément Pit-Claudel @ 2020-05-22 14:30 ` Eli Zaretskii 2020-05-22 14:34 ` Clément Pit-Claudel 0 siblings, 1 reply; 145+ messages in thread From: Eli Zaretskii @ 2020-05-22 14:30 UTC (permalink / raw) To: Clément Pit-Claudel; +Cc: alan, pipcet, emacs-devel > Cc: alan@idiocy.org, emacs-devel@gnu.org > From: Clément Pit-Claudel <cpitclaudel@gmail.com> > Date: Fri, 22 May 2020 09:29:57 -0400 > > > Do font designers really support ligatures from > > any arbitrary combination of characters? If so, where can I read > > about this? > > Yes; that's what I was alluding to in my example with comment signs and arrows. I think the pictures on https://github.com/tonsky/FiraCode should be illuminating. > > I hope I'm not misunderstanding your question :/ I was talking about ligatures made from letters, not symbols. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-22 14:30 ` Eli Zaretskii @ 2020-05-22 14:34 ` Clément Pit-Claudel 2020-05-22 19:01 ` Eli Zaretskii 0 siblings, 1 reply; 145+ messages in thread From: Clément Pit-Claudel @ 2020-05-22 14:34 UTC (permalink / raw) To: Eli Zaretskii; +Cc: alan, pipcet, emacs-devel On 22/05/2020 10.30, Eli Zaretskii wrote: >> Cc: alan@idiocy.org, emacs-devel@gnu.org >> From: Clément Pit-Claudel <cpitclaudel@gmail.com> >> Date: Fri, 22 May 2020 09:29:57 -0400 >> >>> Do font designers really support ligatures from >>> any arbitrary combination of characters? If so, where can I read >>> about this? >> >> Yes; that's what I was alluding to in my example with comment signs and arrows. I think the pictures on https://github.com/tonsky/FiraCode should be illuminating. >> >> I hope I'm not misunderstanding your question :/ > > I was talking about ligatures made from letters, not symbols. But then how do you handle symbol ligatures? You showed the example below in response to Pip's suggestion of using .+ to support everything that I had mentioned; was that only for letters? What about symbols then? (set-char-table-range composition-function-table '(?f . ?f) (list (vector "ffi" 0 'compose-gstring-for-graphic))) ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-22 14:34 ` Clément Pit-Claudel @ 2020-05-22 19:01 ` Eli Zaretskii 2020-05-22 19:33 ` Clément Pit-Claudel 0 siblings, 1 reply; 145+ messages in thread From: Eli Zaretskii @ 2020-05-22 19:01 UTC (permalink / raw) To: Clément Pit-Claudel; +Cc: alan, pipcet, emacs-devel > Cc: pipcet@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > From: Clément Pit-Claudel <cpitclaudel@gmail.com> > Date: Fri, 22 May 2020 10:34:06 -0400 > > > I was talking about ligatures made from letters, not symbols. > > But then how do you handle symbol ligatures? By using suitable regular expressions. E.g., you could take the list of ligatures in that FiraCode site and convert them into a regexp or a set of regexps. ^ permalink raw reply [flat|nested] 145+ messages in thread
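The conversion suggested above (a list of ligature strings turned into a set of regexps) might look roughly like the following. This is an illustrative sketch only: my-ligature-strings, my-ligature-rules and my-ligature-install are hypothetical names, the short string list is a made-up sample rather than Fira Code's actual set, and installing a rule this way replaces any rule already registered for that character.

```elisp
;; Hypothetical sample; a real list would come from the font's documentation.
(defvar my-ligature-strings '("->" "-->" "=>" "==>" "<-" "<=" "!=" "==")
  "Character sequences that should be handed to the shaper.")

(defun my-ligature-rules (strings)
  "Return an alist (CHAR . REGEXP) grouping STRINGS by first character."
  (let (groups)
    (dolist (s strings)
      ;; Collect every string under the character it starts with.
      (push s (alist-get (aref s 0) groups)))
    (mapcar (lambda (group)
              ;; `regexp-opt' builds one regexp per starting character.
              (cons (car group) (regexp-opt (cdr group))))
            groups)))

(defun my-ligature-install (strings)
  "Register STRINGS in the global `composition-function-table' (a sketch)."
  (dolist (rule (my-ligature-rules strings))
    (set-char-table-range
     composition-function-table (car rule)
     (list (vector (cdr rule) 0 'font-shape-gstring)))))

(my-ligature-install my-ligature-strings)
```

The table is keyed on the first character of each sequence, which is why the strings are grouped before the regexps are built. Sequences of unbounded length are outside what a fixed list like this can express.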
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-22 19:01 ` Eli Zaretskii @ 2020-05-22 19:33 ` Clément Pit-Claudel 2020-05-22 19:44 ` Eli Zaretskii 0 siblings, 1 reply; 145+ messages in thread From: Clément Pit-Claudel @ 2020-05-22 19:33 UTC (permalink / raw) To: Eli Zaretskii; +Cc: alan, pipcet, emacs-devel On 22/05/2020 15.01, Eli Zaretskii wrote: >> Cc: pipcet@gmail.com, alan@idiocy.org, emacs-devel@gnu.org >> From: Clément Pit-Claudel <cpitclaudel@gmail.com> >> Date: Fri, 22 May 2020 10:34:06 -0400 >> >>> I was talking about ligatures made from letters, not symbols. >> >> But then how do you handle symbol ligatures? > > By using suitable regular expressions. E.g., you could take the list > of ligatures in that FiraCode site and convert them into a regexp or a > set of regexps. Thanks. I don't understand why we need to do this, but if we have technical limitations that force us to add those regular expressions then maybe it's not the end of the world (I understand that there is value in being able to selectively disable ligatures, using regexps or something else, but it seems surprising that we'll need extra Emacs-specific work for each and every font that includes ligatures). ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-22 19:33 ` Clément Pit-Claudel @ 2020-05-22 19:44 ` Eli Zaretskii 2020-05-22 20:02 ` Clément Pit-Claudel 0 siblings, 1 reply; 145+ messages in thread From: Eli Zaretskii @ 2020-05-22 19:44 UTC (permalink / raw) To: Clément Pit-Claudel; +Cc: alan, pipcet, emacs-devel > Cc: pipcet@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > From: Clément Pit-Claudel <cpitclaudel@gmail.com> > Date: Fri, 22 May 2020 15:33:59 -0400 > > >> But then how do you handle symbol ligatures? > > > > By using suitable regular expressions. E.g., you could take the list > > of ligatures in that FiraCode site and convert them into a regexp or a > > set of regexps. > > Thanks. I don't understand why we need to do this I'm not sure I follow. Do you understand why https://github.com/tonsky/FiraCode/wiki/Emacs-instructions includes a long list of strings to be replaced with ligatures? If so, why don't you understand the reason we need to specify similar things when we use automatic compositions? And who is "we" in this case? Users of these features indeed shouldn't need to mess with these long lists of character sequences, but why is it a problem if "we" the Emacs developers provide data bases of such sequences in advance, which user-facing features could use, hiding them behind much easier UI? > it seems surprising that we'll need extra Emacs-specific work for each and every font that includes ligatures). I don't understand how you got to this conclusion. This is true for prettify-symbols-mode, but that's exactly why I don't like that implementation, and why I think automatic compositions are a better way to go. And for automatic compositions we didn't yet decide that any user-level action is needed when you switch to another font, we are still discussing what is involved. Up front, I don't yet see why such font-specific adjustment would be required from users. 
^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-22 19:44 ` Eli Zaretskii @ 2020-05-22 20:02 ` Clément Pit-Claudel [not found] ` <83mu5z171j.fsf@gnu.org> 0 siblings, 1 reply; 145+ messages in thread From: Clément Pit-Claudel @ 2020-05-22 20:02 UTC (permalink / raw) To: Eli Zaretskii; +Cc: alan, pipcet, emacs-devel On 22/05/2020 15.44, Eli Zaretskii wrote: >> Cc: pipcet@gmail.com, alan@idiocy.org, emacs-devel@gnu.org >> From: Clément Pit-Claudel <cpitclaudel@gmail.com> >> Date: Fri, 22 May 2020 15:33:59 -0400 >> >>>> But then how do you handle symbol ligatures? >>> >>> By using suitable regular expressions. E.g., you could take the list >>> of ligatures in that FiraCode site and convert them into a regexp or a >>> set of regexps. >> >> Thanks. I don't understand why we need to do this > > I'm not sure I follow. Do you understand why > https://github.com/tonsky/FiraCode/wiki/Emacs-instructions includes a > long list of strings to be replaced with ligatures? Yes, I do understand: that's because Emacs' ligature support is currently weaker than other editors, and so you need to jump through hoops to use Fira Code. These hoops include telling Emacs what sequences to turn into ligatures. This problem is specific to Emacs: in other text editors, you just pick the font, and all supported ligatures are used. Importantly, the instructions on that page are a poor workaround that doesn't give you all the features of Fira Code (I don't mean that we couldn't support all of them, as I don't know if that's true currently. I just mean that the page shouldn't be understood as providing full support for Fira Code in Emacs). That's why Emacs is in the fairly short list of "Doesn't work" editors, I think. > If so, why don't > you understand the reason we need to specify similar things when we > use automatic compositions? 
What I don't understand is what it is about Emacs that means that we need special lists of regexps for each new font, while other editors don't need them. > And who is "we" in this case? Users of these features indeed > shouldn't need to mess with these long lists of character sequences, > but why is it a problem if "we" the Emacs developers provide data > bases of such sequences in advance, which user-facing features could > use, hiding them behind much easier UI? We can't provide these data bases in advance, I think. Each font supports a different set of symbol ligatures, and so the list for each font will be different. >> it seems surprising that we'll need extra Emacs-specific work for each and every font that includes ligatures). > > I don't understand how you got to this conclusion. This is true for > prettify-symbols-mode, but that's exactly why I don't like that > implementation, and why I think automatic compositions are a better > way to go. And for automatic compositions we didn't yet decide that > any user-level action is needed when you switch to another font, we > are still discussing what is involved. Up front, I don't yet see why > such font-specific adjustment would be required from users. Each font offers a different set of symbol ligatures: there is no common superset that covers all fonts, except the ".+" regexp that Pip posted earlier. From earlier messages, I understood that we need to specify which character sequences to ligate. So, I conclude that we'll need new work every time a new font comes out, or the ligatures in a font change (every time Fira Code is updated, for example). Since other editors don't need that work, I wonder why it's needed in Emacs. Sorry if I misunderstood something; I don't want to waste anyone's time. Clément. ^ permalink raw reply [flat|nested] 145+ messages in thread
[parent not found: <83mu5z171j.fsf@gnu.org>]
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) [not found] ` <83mu5z171j.fsf@gnu.org> @ 2020-05-23 14:34 ` Clément Pit-Claudel 2020-05-23 16:18 ` Eli Zaretskii 0 siblings, 1 reply; 145+ messages in thread From: Clément Pit-Claudel @ 2020-05-23 14:34 UTC (permalink / raw) To: Eli Zaretskii; +Cc: alan, pipcet, emacs-devel On 23/05/2020 02.47, Eli Zaretskii wrote: >> Cc: pipcet@gmail.com, alan@idiocy.org, emacs-devel@gnu.org >> From: Clément Pit-Claudel <cpitclaudel@gmail.com> >> Date: Fri, 22 May 2020 16:02:22 -0400 >> >> What I don't understand is what it is about Emacs that means that we need special lists of regexps for each new font, while other editors don't need them. > > Emacs doesn't need a special list for each font. I already said that > several times. Please look at some examples of composition rules we > already have, for example the Arabic rules at the very end of > misc-lang.el. Do you see any fonts mentioned there? These rules work > with any font that supports Arabic. The only thing I'm talking about is symbol compositions in programming fonts, and for these, we *will* need a custom list for each font, right? >> Each font offers a different set of symbol ligatures: there is no common superset that covers all fonts, except the ".+" regexp that Pip posted earlier. > > I'm not yet sure this is indeed so. I didn't see any reference which > implies that any combination of 26 ASCII letters could become a > ligature. I think that's where I'm confused. I'm talking of ligatures like -> and =>, which do not involve the 26 ASCII letters. > This is a discussion that didn't yet happen. It is quite possible > that in practice the list of ligatures we want to support is not very > long. E.g., the list in > https://github.com/tonsky/FiraCode/wiki/Emacs-instructions is not > long, and I doubt many additions to it will ever make sense for us. As I said, this list is incomplete and broken. 
> And finally, if a given font doesn't support some ligature, the > original characters will be displayed "normally", so nothing is lost, > and there's no need to tune the list of ligatures to each and every > font. I said that as well several times already. As long as you can produce a superset of all ligatures, yes. My claim is that this superset is ".+". Otherwise, how do you handle the fact that Fira Code handles arrows of arbitrary lengths? Or is that different from ligatures? ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-23 14:34 ` Clément Pit-Claudel @ 2020-05-23 16:18 ` Eli Zaretskii 2020-05-23 16:37 ` Clément Pit-Claudel 0 siblings, 1 reply; 145+ messages in thread From: Eli Zaretskii @ 2020-05-23 16:18 UTC (permalink / raw) To: Clément Pit-Claudel; +Cc: alan, pipcet, emacs-devel > Cc: pipcet@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > From: Clément Pit-Claudel <cpitclaudel@gmail.com> > Date: Sat, 23 May 2020 10:34:23 -0400 > > > Emacs doesn't need a special list for each font. I already said that > > several times. Please look at some examples of composition rules we > > already have, for example the Arabic rules at the very end of > > misc-lang.el. Do you see any fonts mentioned there? These rules work > > with any font that supports Arabic. > > The only thing I'm talking about is symbol compositions in programming fonts, and for these, we *will* need a custom list for each font, right? No, we won't need custom lists. Not if we will use the same character composition machinery as we use now for Arabic and other scripts that require it. > > And finally, if a given font doesn't support some ligature, the > > original characters will be displayed "normally", so nothing is lost, > > and there's no need to tune the list of ligatures to each and every > > font. I said that as well several times already. > > As long as you can produce a superset of all ligatures, yes. My claim is that this superset is ".+". It cannot be literally ".+", if you are talking about symbols, because (a) not every character starts a symbol, and (b) symbols cannot be of arbitrary length. > Otherwise, how do you handle the fact that Fira Code handles arrows > of arbitrary lengths? We won't handle arrows of arbitrary length, no. Not as long as we keep the current design of the display engine. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-23 16:18 ` Eli Zaretskii @ 2020-05-23 16:37 ` Clément Pit-Claudel 0 siblings, 0 replies; 145+ messages in thread From: Clément Pit-Claudel @ 2020-05-23 16:37 UTC (permalink / raw) To: Eli Zaretskii; +Cc: alan, pipcet, emacs-devel On 23/05/2020 12.18, Eli Zaretskii wrote: > We won't handle arrows of arbitrary length, no. Not as long as we > keep the current design of the display engine. Ah, OK, then I understand. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-22 13:15 ` Eli Zaretskii 2020-05-22 13:29 ` Clément Pit-Claudel @ 2020-05-22 13:56 ` Pip Cet [not found] ` <83lflj16jn.fsf@gnu.org> 1 sibling, 1 reply; 145+ messages in thread From: Pip Cet @ 2020-05-22 13:56 UTC (permalink / raw) To: Eli Zaretskii; +Cc: cpitclaudel, alan, emacs-devel On Fri, May 22, 2020 at 1:15 PM Eli Zaretskii <eliz@gnu.org> wrote: > > From: Pip Cet <pipcet@gmail.com> > > Date: Fri, 22 May 2020 12:52:41 +0000 > > Cc: cpitclaudel@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > > > > > You can request all the ligatures that _can_ be supported; > > > > How do I do that? Opentype fonts can support arbitrary ligatures, such > > as "Zapfino" being a seven-letter ligature. > > I thought the set of all the ligatures is known, and guided by > typography experts. No, that's not how Opentype handles things at all. I just added a "ta" ligature to a font by converting it to ttx format, editing the XML, and converting back to .otf. It works fine. So ad-hoc ligatures certainly are a feature of Opentype. > Do font designers really support ligatures from > any arbitrary combination of characters? If so, where can I read > about this? https://docs.microsoft.com/en-us/typography/opentype/spec/gsub#lookuptype-4-ligature-substitution-subtable The font I'm looking at right now has these: Th, ch, ck, ffh, ffi, ffj, ffk, ffl, ff, fh, fi, fj, fk, fl, ft, tt, tz But I've also come across an example where "fä" was displayed differently, though I'm not sure it used Opentype ligatures. > > > For example, when displaying program source code > > > you'd probably want the various symbols, like -> etc., to produce > > > ligatures, but you most probably won't want "ffi" in a variable name > > > to produce a ligature. > > > > Why not? > > It makes no sense to me. Why ligate them in that use case? Program > source code isn't supposed to behave like typeset human-readable text. 
Seems like an aesthetic decision. As far as I'm concerned, program source code is typeset human-readable text, it just has different (and possibly better) conventions for typesetting it. I wouldn't choose to use a variable-pitch font for program source code ordinarily, but if I did, I'd want ligatures. > > "For the list of typographical ligatures, see > > > > https://en.wikipedia.org/wiki/Orthographic_ligature#Ligatures_in_Unicode_(Latin_alphabets)" > > > > That's very wrong: typographical ligatures generally aren't assigned > > Unicode codepoints; those that have them usually do so for historical > > reasons. > > Indeed, ligatures don't have to have Unicode codepoints, only some of > them are precomposed. Emacs doesn't need them to have codepoints when > we use auto-composition-mode. The reference is there only to show the > list of ligatures, and I believe the list is full regardless of the > codepoint issue. Can you point me to a larger list of ligatures made > out of ASCII letters? "Th" is mentioned as an example in a few places, and it's not on the list. > But I don't think we should continue with these details before we have > a clear idea of whether the list of possible ligatures is really > infinite, as you seem to imply. I agree. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) [not found] ` <834ks7110w.fsf@gnu.org> @ 2020-05-23 11:24 ` Vasilij Schneidermann 2020-05-23 13:04 ` Eli Zaretskii 0 siblings, 1 reply; 145+ messages in thread From: Vasilij Schneidermann @ 2020-05-23 11:24 UTC (permalink / raw) To: Eli Zaretskii; +Cc: cpitclaudel, alan, pipcet, emacs-devel [-- Attachment #1: Type: text/plain, Size: 3359 bytes --] > The reason is how the current Emacs display engine is designed: it > cannot pass large substrings of buffer text to the shaping engine > without incurring performance penalties and/or disrupting the way the > layout decisions, as currently designed, work. the current design of > the display engine is that we examine the stuff to be displayed one > grapheme cluster at a time, and make the layout decisions after each > grapheme cluster's metrics is produced. Unless someone begins working > on a new design of the Emacs display, I see no good way for overcoming > these problems, based on what I know about the display code. Thanks for describing the problem in detail. Out of curiosity, is this the same reason why font fallback is handled on a per-script basis for most cases and with carefully chosen ranges for emoji? I see a similar problem there, with updates being necessary for every Unicode release. > Of course, it's possible that I'm missing something in the current display > code, which could luckily allow us to support any ligature made up from any > number of characters without any significant design changes. So please by > all means study the current code, see if something like that is possible, > describe such a possible solution, and I'll gladly admit my mistake. I don't > claim a 110% understanding of all the subtleties of the current code, so it > is perfectly possible that I'm missing something. I don't think it is good > for Emacs to have just one person who knows these details, especially if that > person is myself. 
We need to enlarge the circle of our experts on this, and > then perhaps a practical solution could present itself. Although I'm > skeptical, to tell the truth. Given your previous explanation, a regex-based approach heuristic is the best we can hope for then. From what I understand the display engine uses a rectangular grid, not unlike what terminal emulators do. Are there any tricks to steal from existing terminal emulators? For example there is an open pull request [1] for alacritty using Harfbuzz and FreeType for ligature support. > If I _am_ right, and the complete solution is impossible, we could, of course > decide that partial solutions based on heuristics are not good enough for us, > and wait for the redesign of the display code. I hope we will not do that, > because IMO partial solutions that satisfy 80% of the needs are much better > than no solutions. That is why I described how this stuff could work under > the current limitations, albeit without supporting every possible use case. > Eventually, this is something the community should decide. The greatest challenge I see with redesigning the display engine is supporting textual terminals. One alternative design would be using something akin to a typesetting engine, like TeX's boxes and glue model or something from the roff family (which is used successfully in terminal emulators for `man`). Another approach is to build upon a browser engine and use copious amounts of CSS and JavaScript to build an editor. Neither is known to be performant and power efficient enough for continuous redisplay. It's no wonder that custom designs are used, for example in GUI toolkits. Maybe that is the way forward? Vasilij [1]: https://patch-diff.githubusercontent.com/raw/alacritty/alacritty/pull/2677.patch [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-23 11:24 ` Vasilij Schneidermann @ 2020-05-23 13:04 ` Eli Zaretskii 0 siblings, 0 replies; 145+ messages in thread From: Eli Zaretskii @ 2020-05-23 13:04 UTC (permalink / raw) To: Vasilij Schneidermann; +Cc: cpitclaudel, alan, pipcet, emacs-devel > Date: Sat, 23 May 2020 13:24:12 +0200 > From: Vasilij Schneidermann <mail@vasilij.de> > Cc: emacs-devel@gnu.org, pipcet@gmail.com, cpitclaudel@gmail.com, > alan@idiocy.org > > Out of curiosity, is this the same reason why font fallback is > handled on a per-script basis for most cases and with carefully > chosen ranges for emoji? I see a similar problem there, with > updates being necessary for every Unicode release. No, our font selection machinery is completely separate from text shaping, and is also agnostic to character compositions. Basically, we have a char-table (the one set-fontset-font manipulates) which provides the various fonts to try for every given character, and some very convoluted code (see fontset.c) that implements the logic of how to try the fonts and which fonts to prefer for a character. IOW, the font selection is basically per-character and not per-script. The relation to emoji is that emoji _sequences_ need character composition, and Emacs currently cannot compose characters that aren't supported by the same font. This _is_ related to ligatures etc., as it indeed touches on one of the basic premises of the display engine's iteration through buffer text: we stop wherever the 'face' property of characters changes (and the font is one attribute of the face), then continue after loading and realizing the new face. This is why you see strange artifacts when you press and hold Shift, and then move with arrow keys across the Arabic line in etc/HELLO: the shaping of adjacent characters breaks because we pass only part of the text to the shaper. 
This is another bug that cannot be fixed cleanly while keeping the current design of the display engine and its low-level method of iteration through text and of producing glyphs. > Given your previous explanation, a regex-based approach heuristic is the best > we can hope for then. From what I understand the display engine uses a > rectangular grid, not unlike what terminal emulators do. It uses a rectangular array of glyphs, not a rectangular grid. The difference is that glyphs can have variable metrics, which breaks the grid concept. IOW, the glyph at coordinates (i, j) in the array and the glyph at (i, j+1) are not necessarily one above the other on display. > Are there any tricks > to steal from existing terminal emulators? For example there is an open pull > request [1] for alacritty using Harfbuzz and FreeType for ligature support. I cannot claim I understood well enough what this attempts to do, but I don't think this is our problem in Emacs. It is not a problem of layout per se -- Emacs is well equipped to deal with layout of glyphs and grapheme clusters that have wildly different metrics (recall that we are able to lay out images of more-or-less arbitrary dimensions on the same line as simple text). The problem is that we make the layout decisions as soon as we have the glyph metrics, on the fly, for each "thing" we need to display. HarfBuzz people would like us to send them the entire paragraph of text, then get it back as a series of glyphs, then make the layout decisions based on that. This would need entirely different algorithms, if not also different data structures; for starters, we'd need to know how to find the paragraph(s) that will end up on display without first trying to display them. And all our redisplay shortcuts and optimizations implicitly also assume the current basic iteration, one character at a time, which can be started at any arbitrary buffer position. 
> The greatest challenge I see with redesigning the display engine is supporting > textual terminals. Really? Why do you think this to be the greatest challenge? For any model of the display we come up with, TTY frames will always be a proper subset, no? ^ permalink raw reply [flat|nested] 145+ messages in thread
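The effect of stopping at face changes, described in the message above, can be sketched in a few lines. This is only an illustration under stated assumptions: `face_runs` and `shape_run` are hypothetical names, the "shaper" knows a single ligature, and nothing here reflects the actual code in xdisp.c. It shows why shaping breaks when a face boundary (for example the region face while selecting text) falls between two characters that would otherwise be shaped together.

```python
from itertools import groupby


def face_runs(chars_with_faces):
    """Group (char, face) pairs into maximal runs sharing one face."""
    runs = []
    for face, group in groupby(chars_with_faces, key=lambda cf: cf[1]):
        runs.append((face, "".join(ch for ch, _ in group)))
    return runs


def shape_run(text):
    """Stand-in shaper: it only knows the 'fi' ligature."""
    return text.replace("fi", "<fi>")


# All four characters share one face, so the shaper sees "file" whole.
plain = [(c, "default") for c in "file"]

# Selecting only the first character puts it in a different face,
# so the shaper is handed "f" and "ile" separately.
selected = [("f", "region")] + [(c, "default") for c in "ile"]

shaped_plain = [shape_run(text) for _, text in face_runs(plain)]
shaped_selected = [shape_run(text) for _, text in face_runs(selected)]
```

In this toy model the ligature appears in `shaped_plain` and disappears in `shaped_selected`, because each run is shaped without knowledge of its neighbours.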
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) [not found] ` <831rnb0zld.fsf@gnu.org> @ 2020-05-23 12:36 ` Pip Cet 2020-05-23 14:08 ` Eli Zaretskii 2020-05-23 12:47 ` Ligatures Stefan Monnier 1 sibling, 1 reply; 145+ messages in thread From: Pip Cet @ 2020-05-23 12:36 UTC (permalink / raw) To: Eli Zaretskii; +Cc: cpitclaudel, alan, emacs-devel On Sat, May 23, 2020 at 9:28 AM Eli Zaretskii <eliz@gnu.org> wrote: > > From: Pip Cet <pipcet@gmail.com> > > Date: Sat, 23 May 2020 08:44:22 +0000 > > Cc: cpitclaudel@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > > > > You write: "(b) is not really feasible without redesigning the entire > > Emacs display engine". I don't see how that's true at all. All we need > > is some limited look-ahead. > > We already have look-ahead: that's what the regexp part of the > composition rules are about. That is not the crucial problem. But it's the only problem I see! When you see an IT_CHARACTER, you get some context, hand it to HarfBuzz, slice up the relevant glyphs, and display them. This is not complicated or difficult, except for the "get some context" part. It doesn't involve composite.c at all, and that's good, because for those tricky special cases composite.c does a better job than standard shaping, and we need to keep that feature. It just shouldn't be the regular route. > The crucial problem is that we currently perform layout decisions one > grapheme cluster at a time, whereas what HarfBuzz people say is that > we should basically do that one screen line at a time. I think we're going to have to compromise: that's why my patch used a 32-character context rather than an entire line or just a single character. Ideally, of course, in most real cases we'd use whitespace-delimited words as chunks. That's mere optimization, though. > A secondary (but important) problem is that character composition > involves calls to Lisp, which is relatively slow. 
This precludes > calling the shaper for too many characters at once, too many times for > each redisplay cycle of a window. I agree we shouldn't go through Lisp. My patch didn't. Calling the shaper less often is an important optimization, too. For whitespace-delimited words, we only need to call it once. > > I think at the heart of it, it's about whether we treat fonts like > > pieces of software, to be given a specific task and fixed if they fail > > to perform it, or as bitmaps for simulating a TTY. Fonts are software: > > they're written in a weird limited language, but essentially they're > > programs to measure and display characters as glyphs. > > I don't think there's any disagreements on this high and abstract > level. I think there are: if we treat fonts as programs, we need to let them do their job, which involves kerning, substitutions, ligatures, and even crazy stuff like randomizing the glyph used for each character to get a more hand-written appearance. We don't need to know about ligatures, we just let the font do it. No Lisp callbacks, just a call to harfbuzz. > The problem is how to support that within the limits of the > current design of the display engine. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-23 12:36 ` Pip Cet @ 2020-05-23 14:08 ` Eli Zaretskii 2020-05-23 15:13 ` Pip Cet 0 siblings, 1 reply; 145+ messages in thread From: Eli Zaretskii @ 2020-05-23 14:08 UTC (permalink / raw) To: Pip Cet; +Cc: cpitclaudel, alan, emacs-devel > From: Pip Cet <pipcet@gmail.com> > Date: Sat, 23 May 2020 12:36:56 +0000 > Cc: cpitclaudel@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > > > > You write: "(b) is not really feasible without redesigning the entire > > > Emacs display engine". I don't see how that's true at all. All we need > > > is some limited look-ahead. > > > > We already have look-ahead: that's what the regexp part of the > > composition rules are about. That is not the crucial problem. > > But it's the only problem I see! Then maybe I don't understand what you mean by look-ahead. Is that the decision how to choose those 32 characters of "context"? Then why not use the current regexp-based approach, which is already much smarter than just blindly taking a fixed amount of surrounding text? > When you see an IT_CHARACTER, you get some context, hand it to > HarfBuzz, slice up the relevant glyphs, and display them. The problem is, of course, in the "some context" part. Your patch used an arbitrary 32-character chunk of text around the character to shape, which is of course not what the shaping engines want: they want _all_ of the surrounding text, the entire paragraph. Your patch also invokes the shaper twice, on the same 32 characters, once in encode_char method and again in the text_extents method, which is another waste. The code in composite.c caches the composed characters to avoid that, but you bypass it. This is okay for showing the concept, but we cannot use this in production. There are too many arbitrary decisions and inefficient expensive operations. 
> It doesn't involve composite.c at all, and that's good, because for > those tricky special cases composite.c does a better job than standard > shaping, and we need to keep that feature. It just shouldn't be the > regular route. Of course, you never tell how to distinguish between the "tricky special cases" for which we still need to use composite.c and friends, and the other kind. Moreover, the HarfBuzz guys clearly say that what we do now is wrong for those "tricky" cases as well, so if we are going to fix that, why fix it only for ligatures made out of ASCII characters? > > The crucial problem is that we currently perform layout decisions one > > grapheme cluster at a time, whereas what HarfBuzz people say is that > > we should basically do that one screen line at a time. > > I think we're going to have to compromise: that's why my patch used a > 32-character context rather than an entire line or just a single > character. If we are going to compromise, then why not compromise on what we already have, which is much less than 32 characters? Why should we enormously complicate and slow down our code without actually solving the problem? Did you ever see ligatures that are 32-character long? > Ideally, of course, in most real cases we'd use whitespace-delimited > words as chunks. That's mere optimization, though. That'd be the wrong optimization, AFAIK. E.g., some scripts don't have whitespace separated words at all, and still need shaping. And what exactly is whitespace for this purpose? e.g., does it include Unicode control characters such as ZWJ? > > A secondary (but important) problem is that character composition > > involves calls to Lisp, which is relatively slow. This precludes > > calling the shaper for too many characters at once, too many times for > > each redisplay cycle of a window. > > I agree we shouldn't go through Lisp. My patch didn't. Your patch hard-codes arbitrary numbers without any way to control that from Lisp. 
Such code will never fly in Emacs. > Calling the shaper less often is an important optimization, too. For > whitespace-delimited words, we only need to call it once. This doesn't work when the produced sequence of glyphs doesn't fit on the screen line. What the current layout code does in this case won't work well when you need to break a long sequence of glyphs in the middle and then continue on the next line from where you left off on this one. The longer the sequence of glyphs you get from the shaper in one go, the higher the probability of hitting this issue. The bottom line of this is that I think you will find very quickly that the basic assumptions of the current design -- that we produce single glyphs or very short sequences of them for each call to the shaper -- that these assumptions bite you on every step, because the code which deals with layout implicitly assumes this. In short, I really don't see how this could ever work, except in a very limited set of simple use cases. E.g., what do you do with bidirectional text? ignore it? > > I don't think there's any disagreements on this high and abstract > > level. > > I think there are: if we treat fonts as programs, we need to let them > do their job, which involves kerning, substitutions, ligatures, and > even crazy stuff like randomizing the glyph used for each character to > get a more hand-written appearance. We don't need to know about > ligatures, we just let the font do it. No Lisp callbacks, just a call > to harfbuzz. I think this is a simplistic view of how the display engine works, and I don't see how it could work in production while supporting all the use cases we already do. I could be wrong, though, so I'm looking forward to see you present a series of patches that do support the existing use cases and the ligatures as well, and don't cause any slowdown in redisplay. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-23 14:08 ` Eli Zaretskii @ 2020-05-23 15:13 ` Pip Cet 2020-05-23 16:34 ` Eli Zaretskii 2020-05-23 17:32 ` Eli Zaretskii 0 siblings, 2 replies; 145+ messages in thread From: Pip Cet @ 2020-05-23 15:13 UTC (permalink / raw) To: Eli Zaretskii; +Cc: cpitclaudel, alan, emacs-devel On Sat, May 23, 2020 at 2:08 PM Eli Zaretskii <eliz@gnu.org> wrote: > > From: Pip Cet <pipcet@gmail.com> > > Date: Sat, 23 May 2020 12:36:56 +0000 > > Cc: cpitclaudel@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > > > > > > You write: "(b) is not really feasible without redesigning the entire > > > > Emacs display engine". I don't see how that's true at all. All we need > > > > is some limited look-ahead. > > > > > > We already have look-ahead: that's what the regexp part of the > > > composition rules are about. That is not the crucial problem. > > > > But it's the only problem I see! > > Then maybe I don't understand what you mean by look-ahead. Is that > the decision how to choose those 32 characters of "context"? Yes. > Then why > not use the current regexp-based approach, which is already much > smarter than just blindly taking a fixed amount of surrounding text? Because I do not know the regexp to use? > > When you see an IT_CHARACTER, you get some context, hand it to > > HarfBuzz, slice up the relevant glyphs, and display them. > > The problem is, of course, in the "some context" part. Your patch > used an arbitrary 32-character chunk of text around the character to > shape, which is of course not what the shaping engines want: they want > _all_ of the surrounding text, the entire paragraph. Which is clearly too expensive to actually give them, which is something I didn't think it was necessary to even spell out. > Your patch also invokes the shaper twice, on the same 32 characters, > once in encode_char method and again in the text_extents method, which > is another waste. 
The code in composite.c caches the composed > characters to avoid that, but you bypass it. Absolutely. > This is okay for showing the concept, but we cannot use this in > production. There are too many arbitrary decisions and inefficient > expensive operations. I agree, of course! In fact, the 32-character limit was chosen as a reminder to myself that things would inherently be inefficient. > > It doesn't involve composite.c at all, and that's good, because for > > those tricky special cases composite.c does a better job than standard > > shaping, and we need to keep that feature. It just shouldn't be the > > regular route. > > Of course, you never tell how to distinguish between the "tricky > special cases" for which we still need to use composite.c and friends, > and the other kind. The tricky special cases get handled as before, and come in with the iterator .what set to IT_COMPOSITE. The standard cases come in with .what set to IT_CHARACTER. > Moreover, the HarfBuzz guys clearly say that what we do now is wrong > for those "tricky" cases as well, so if we are going to fix that, why > fix it only for ligatures made out of ASCII characters? There's no such limitation, but, yes, ideally people would find they don't need automatic compositions anymore... > > > The crucial problem is that we currently perform layout decisions one > > > grapheme cluster at a time, whereas what HarfBuzz people say is that > > > we should basically do that one screen line at a time. > > > > I think we're going to have to compromise: that's why my patch used a > > 32-character context rather than an entire line or just a single > > character. > > If we are going to compromise, then why not compromise on what we > already have, which is much less than 32 characters? 0 characters? > Why should we > enormously complicate and slow down our code without actually solving > the problem? We shouldn't. > Did you ever see ligatures that are 32-character long? "Zapfino" is the longest I've seen. 
> > Ideally, of course, in most real cases we'd use whitespace-delimited > > words as chunks. That's mere optimization, though. > > That'd be the wrong optimization, AFAIK. Sure, but since it is exclusively an optimization, it's performance considerations alone that will decide whether it is. > E.g., some scripts don't > have whitespace separated words at all, and still need shaping. Thus "most". > And > what exactly is whitespace for this purpose? e.g., does it include > Unicode control characters such as ZWJ? Thankfully, that doesn't matter much: it's just a question of what we optimize for, not one of what the results will look like. So I'd say " ", "\t", and "\n" are enough, which is what the display engine already handles specially. > > > A secondary (but important) problem is that character composition > > > involves calls to Lisp, which is relatively slow. This precludes > > > calling the shaper for too many characters at once, too many times for > > > each redisplay cycle of a window. > > > > I agree we shouldn't go through Lisp. My patch didn't. > > Your patch hard-codes arbitrary numbers without any way to control > that from Lisp. Yes. > Such code will never fly in Emacs. Of course not. > > Calling the shaper less often is an important optimization, too. For > > whitespace-delimited words, we only need to call it once. > > This doesn't work when the produced sequence of glyphs doesn't fit on > the screen line. > What the current layout code does in this case won't > work well when you need to break a long sequence of glyphs in the > middle and then continue on the next line from where you left off on > this one. You mean in visual-mode? Because what the current layout code does by default is to break along any glyph boundary, and I don't see how that's broken in any way. > The longer the sequence of glyphs you get from the shaper > in one go, the higher the probability of hitting this issue. You break between the glyphs. 
It doesn't depend on whether you have two or 20 or 100. > The bottom line of this is that I think you will find very quickly > that the basic assumptions of the current design -- that we produce > single glyphs or very short sequences of them for each call to the > shaper -- that these assumptions bite you on every step, because the > code which deals with layout implicitly assumes this. The shaper interface I described would actually return a single glyph for each top-level call, with a number of callbacks to provide context. So that assumption would hold up very well indeed... > In short, I really don't see how this could ever work, except in a > very limited set of simple use cases. E.g., what do you do with > bidirectional text? ignore it? A bidi boundary is a hard boundary for HarfBuzz, and no shaping happens across it. Is that what you mean by "ignore it"? > > > I don't think there's any disagreements on this high and abstract > > > level. > > > > I think there are: if we treat fonts as programs, we need to let them > > do their job, which involves kerning, substitutions, ligatures, and > > even crazy stuff like randomizing the glyph used for each character to > > get a more hand-written appearance. We don't need to know about > > ligatures, we just let the font do it. No Lisp callbacks, just a call > > to harfbuzz. > > I think this is a simplistic view of how the display engine works, Quite possibly :-) > and > I don't see how it could work in production while supporting all the > use cases we already do. It only comes in for use cases not handled otherwise, i.e. those where the iterator is at an IT_CHARACTER. All other use cases are unaffected, because they mean we're overriding the font decision anyway. As I said, the problem I have is to get look-ahead working, which you think isn't a problem. 
I've got an idea for it, but it doesn't work (yet); my theory is the bidi.c code fails to keep its state in the iterator and can't deal with multiple parallel iterators. > I could be wrong, though, so I'm looking > forward to see you present a series of patches that do support the > existing use cases and the ligatures as well, and don't cause any > slowdown in redisplay. As I said, what's stopping me is the look-ahead problem, and in particular some code in bidi.c that doesn't play along well with look-ahead. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-23 15:13 ` Pip Cet @ 2020-05-23 16:34 ` Eli Zaretskii 2020-05-23 22:38 ` Pip Cet 2020-05-23 17:32 ` Eli Zaretskii 1 sibling, 1 reply; 145+ messages in thread From: Eli Zaretskii @ 2020-05-23 16:34 UTC (permalink / raw) To: Pip Cet; +Cc: cpitclaudel, alan, emacs-devel > From: Pip Cet <pipcet@gmail.com> > Date: Sat, 23 May 2020 15:13:38 +0000 > Cc: cpitclaudel@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > > > > Calling the shaper less often is an important optimization, too. For > > > whitespace-delimited words, we only need to call it once. > > > > This doesn't work when the produced sequence of glyphs doesn't fit on > > the screen line. > > > What the current layout code does in this case won't > > work well when you need to break a long sequence of glyphs in the > > middle and then continue on the next line from where you left off on > > this one. > > You mean in visual-mode? Not just in visual-line-mode, but also for the default line continuation. > Because what the current layout code does by default is to break > along any glyph boundary, and I don't see how that's broken in any > way. The code assumes that breaking on some glyph leaves the buffer iterator ('struct it') in a state that we can simply continue to the next buffer position. But if you already picked up several characters via look-ahead, that is not true, and you will have to return back several character positions, in order to continue on the next screen line. The whole convoluted logic of display_line (and a similar one in move_it_in_display_line_to) is based on the assumption that this line-wrap decisions are made as soon as a single glyph is produced; that code will need to be rewritten if this assumption breaks. And since the code is already hairy, to say the least, I cannot even imagine what it will look like after such rewriting. 
This is just a small example of how deep are the current design assumptions entrenched in the code. I don't see how this can be resolved to yield code that is readable and maintainable without changing the design. Again, maybe I'm missing something. > > In short, I really don't see how this could ever work, except in a > > very limited set of simple use cases. E.g., what do you do with > > bidirectional text? ignore it? > > A bidi boundary is a hard boundary for HarfBuzz, and no shaping > happens across it. Is that what you mean by "ignore it"? I don't mean the boundary, I meant the fact that clusters need to be reordered. > > I don't see how it could work in production while supporting all the > > use cases we already do. > > It only comes in for use cases not handled otherwise, i.e. those where > the iterator is at an IT_CHARACTER. All other use cases are > unaffected, because they mean we're overriding the font decision > anyway. I see no reason to add such patches just to handle some simple enough use cases. If we want the shaper to handle all the text we display, we should go all the way and do it for any text, ASCII, non-ASCII, symbols, emoji, everything. The current codebase is already very difficult to understand and modify; you seem to suggest to make it even more so, and on top of that solve only a small part of the underlying problem. That makes very little sense to me. ^ permalink raw reply [flat|nested] 145+ messages in thread
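The line-wrap concern above can be made concrete with a toy model in which every glyph remembers the buffer position where its cluster starts. Everything here is an assumption for illustration: `wrap_glyphs` is a hypothetical helper, the advances are invented, and the real logic in display_line is far more involved. The sketch only shows that when glyphs were produced several characters at a time, the position to resume from on the next screen line has to be recovered from the glyph, not from how far the shaper has read.

```python
def wrap_glyphs(glyphs, line_width):
    """Break a glyph sequence into screen lines.

    `glyphs` is a list of (buffer_pos, advance) pairs, where buffer_pos
    is the position of the first character the glyph covers.  Returns a
    list of lines, each a list of buffer positions.  Breaks happen only
    between glyphs, and a glyph wider than the line still gets a line
    of its own.
    """
    lines = [[]]
    used = 0
    for pos, advance in glyphs:
        if lines[-1] and used + advance > line_width:
            lines.append([])
            used = 0
        lines[-1].append(pos)
        used += advance
    return lines


# The glyph at buffer position 1 is a three-character ligature, so the
# next glyph starts at position 4.
glyphs = [(0, 10), (1, 25), (4, 10), (5, 10)]
lines = wrap_glyphs(glyphs, line_width=30)

# Buffer position at which each screen line starts.
resume_positions = [line[0] for line in lines]
```

In this model the shaper had already consumed characters up to position 6 when the first break was decided, yet the second line must start at position 1, which is the "return back several character positions" situation described above.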
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-23 16:34 ` Eli Zaretskii @ 2020-05-23 22:38 ` Pip Cet 2020-05-24 15:33 ` Eli Zaretskii 0 siblings, 1 reply; 145+ messages in thread From: Pip Cet @ 2020-05-23 22:38 UTC (permalink / raw) To: Eli Zaretskii; +Cc: cpitclaudel, alan, emacs-devel On Sat, May 23, 2020 at 4:34 PM Eli Zaretskii <eliz@gnu.org> wrote: > > From: Pip Cet <pipcet@gmail.com> > > Date: Sat, 23 May 2020 15:13:38 +0000 > > Cc: cpitclaudel@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > > Because what the current layout code does by default is to break > > along any glyph boundary, and I don't see how that's broken in any > > way. > > The code assumes that breaking on some glyph leaves the buffer > iterator ('struct it') in a state that we can simply continue to the > next buffer position. Yes. I see no reason to change that. > But if you already picked up several characters > via look-ahead, that is not true, and you will have to return back > several character positions, in order to continue on the next screen > line. You're describing why look-ahead is difficult: a while ago, you appeared to be saying it wasn't. This confuses me. Obviously, when I say "look-ahead", I mean receiving the next display elements an iterator would produce if it were actually advanced, without advancing it. An easy, but potentially slow, way of doing that is to copy the iterator to a new one, advance that, retrieve the display elements, then throw away the copied iterator and return. > The whole convoluted logic of display_line (and a similar one > in move_it_in_display_line_to) is based on the assumption that this > line-wrap decisions are made as soon as a single glyph is produced; > that code will need to be rewritten if this assumption breaks. I see no reason to break that assumption. > And > since the code is already hairy, to say the least, I cannot even > imagine what it will look like after such rewriting. 
Good thing I'm not planning to do that, then. > This is just a small example of how deep are the current design > assumptions entrenched in the code. One I don't understand, because those fundamental design assumptions aren't something I'm willing to break at this point. > > > In short, I really don't see how this could ever work, except in a > > > very limited set of simple use cases. E.g., what do you do with > > > bidirectional text? ignore it? > > > > A bidi boundary is a hard boundary for HarfBuzz, and no shaping > > happens across it. Is that what you mean by "ignore it"? > > I don't mean the boundary, I meant the fact that clusters need to be > reordered. I see no fundamental problem there, certainly not of the "I don't see how this could ever work" variety. > > > I don't see how it could work in production while supporting all the > > > use cases we already do. > > > > It only comes in for use cases not handled otherwise, i.e. those where > > the iterator is at an IT_CHARACTER. All other use cases are > > unaffected, because they mean we're overriding the font decision > > anyway. > > I see no reason to add such patches just to handle some simple enough > use cases. If it's so simple to get ligatures and kerning right, please tell me how to do it. > If we want the shaper to handle all the text we display, Do we? A while back you said Lisp control over compositions was an important feature, and I'm inclined to think we shouldn't break the existing composition code. > we should go all the way and do it for any text, ASCII, non-ASCII, > symbols, emoji, everything. Are you suggesting I'm somehow limiting myself to ASCII? Let me assure you that's not the case. > The current codebase is already very > difficult to understand and modify; I agree with that. > you seem to suggest to make it > even more so, Well, yes, it's not going to be a free feature. The changes are comparatively tiny compared to what else has been done to xdisp.c. 
> and on top of that solve only a small part of the
> underlying problem.

Ligatures and kerning (right now, for LTR text). Is that a small
problem because of the lack of RTL support?

^ permalink raw reply [flat|nested] 145+ messages in thread
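[Editorial sketch] The "copy the iterator, advance the copy, throw it away" form of look-ahead described in the message above can be illustrated roughly as follows. This is a toy model only: `struct toy_it`, `toy_next` and `toy_peek` are hypothetical names, and Emacs's real `struct it` carries far more state than a string and a position, so copying it is correspondingly more expensive.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a display iterator; NOT Emacs's 'struct it'. */
struct toy_it {
  const char *text;  /* the text being iterated over */
  size_t pos;        /* index of the next element to produce */
  size_t end;        /* one past the last valid index */
};

/* Produce the element at the iterator and advance it.
   Returns 1 on success, 0 when the iterator is exhausted. */
static int toy_next(struct toy_it *it, char *out)
{
  if (it->pos >= it->end)
    return 0;
  *out = it->text[it->pos++];
  return 1;
}

/* Look ahead at up to N elements without disturbing *IT.
   All advancing happens on a local copy, which is simply dropped
   on return, so the caller's iterator state is left untouched. */
static size_t toy_peek(const struct toy_it *it, char *out, size_t n)
{
  assert(it != NULL && out != NULL);
  struct toy_it copy = *it;  /* struct assignment copies all state */
  size_t got = 0;
  while (got < n && toy_next(&copy, &out[got]))
    got++;
  return got;
}
```

The caller keeps breaking lines on single elements produced by `toy_next`, and `toy_peek` only supplies extra context. The cost is one full copy of the iterator per peek, which is the "easy, but potentially slow" trade-off mentioned in the message.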
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-23 22:38 ` Pip Cet @ 2020-05-24 15:33 ` Eli Zaretskii 2020-05-26 18:13 ` Pip Cet 0 siblings, 1 reply; 145+ messages in thread From: Eli Zaretskii @ 2020-05-24 15:33 UTC (permalink / raw) To: Pip Cet; +Cc: cpitclaudel, alan, emacs-devel > From: Pip Cet <pipcet@gmail.com> > Date: Sat, 23 May 2020 22:38:18 +0000 > Cc: cpitclaudel@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > > On Sat, May 23, 2020 at 4:34 PM Eli Zaretskii <eliz@gnu.org> wrote: > > > From: Pip Cet <pipcet@gmail.com> > > > Date: Sat, 23 May 2020 15:13:38 +0000 > > > Cc: cpitclaudel@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > > > Because what the current layout code does by default is to break > > > along any glyph boundary, and I don't see how that's broken in any > > > way. > > > > The code assumes that breaking on some glyph leaves the buffer > > iterator ('struct it') in a state that we can simply continue to the > > next buffer position. > > Yes. I see no reason to change that. > > > But if you already picked up several characters > > via look-ahead, that is not true, and you will have to return back > > several character positions, in order to continue on the next screen > > line. > > You're describing why look-ahead is difficult: a while ago, you > appeared to be saying it wasn't. This confuses me. > > Obviously, when I say "look-ahead", I mean receiving the next display > elements an iterator would produce if it were actually advanced, > without advancing it. That's not what you said earlier: > > > > > You write: "(b) is not really feasible without redesigning the entire > > > > > Emacs display engine". I don't see how that's true at all. All we need > > > > > is some limited look-ahead. > > > > > > > > We already have look-ahead: that's what the regexp part of the > > > > composition rules are about. That is not the crucial problem. > > > > > > But it's the only problem I see! 
> > > > Then maybe I don't understand what you mean by look-ahead. Is that > > the decision how to choose those 32 characters of "context"? > > Yes. Here you said that look-ahead means how to _choose_ the context. Now you are saying something very different: that look-ahead is how to advance the iterator without advancing it. It's a small wonder we are going in circles when the same term is used for two very different things. > > If we want the shaper to handle all the text we display, > > Do we? A while back you said Lisp control over compositions was an > important feature, and I'm inclined to think we shouldn't break the > existing composition code. > > > we should go all the way and do it for any text, ASCII, non-ASCII, > > symbols, emoji, everything. > > Are you suggesting I'm somehow limiting myself to ASCII? Let me assure > you that's not the case. Then I really don't understand what problem are you trying to solve. Let's try again from the beginning: which parts of the code that implements automatic compositions are you trying to avoid, and why? Is that the part that identifies the "context" via regular expressions? If so, then this problem needs to be solved by some alternative; using an arbitrary chosen fixed number of characters is not suitable for production. You haven't yet shown any viable alternative. Assuming that the alternative for selecting the "context" is found, and composite.c is augmented to apply it instead of the regexps, why not use the rest of the automatic composition code to produce the glyphs and display them? The code which does that exists and works, and is tested by years of use. It already solves the problems of look-ahead, of wrapping long lines, and others, including (but not limited to) the dreaded bidi thing. Why reinvent that wheel when we already have it, and it works well? > > and on top of that solve only a small part of the > > underlying problem. > > Ligatures and kerning (right now, for LTR text). 
> Is that a small
> problem because of the lack of RTL support?

Yes, of course. An acceptable solution should support any text Emacs
supports. What's more, we already have the code which implements all
that, so I don't understand why you want to bypass it. Please
explain.

^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-24 15:33 ` Eli Zaretskii @ 2020-05-26 18:13 ` Pip Cet 2020-05-26 19:46 ` Eli Zaretskii 0 siblings, 1 reply; 145+ messages in thread From: Pip Cet @ 2020-05-26 18:13 UTC (permalink / raw) To: Eli Zaretskii; +Cc: cpitclaudel, alan, emacs-devel On Sun, May 24, 2020 at 3:33 PM Eli Zaretskii <eliz@gnu.org> wrote: > > From: Pip Cet <pipcet@gmail.com> > > Date: Sat, 23 May 2020 22:38:18 +0000 > > Cc: cpitclaudel@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > > > > On Sat, May 23, 2020 at 4:34 PM Eli Zaretskii <eliz@gnu.org> wrote: > > > > From: Pip Cet <pipcet@gmail.com> > > > > Date: Sat, 23 May 2020 15:13:38 +0000 > > > > Cc: cpitclaudel@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > > > > Because what the current layout code does by default is to break > > > > along any glyph boundary, and I don't see how that's broken in any > > > > way. > > > > > > The code assumes that breaking on some glyph leaves the buffer > > > iterator ('struct it') in a state that we can simply continue to the > > > next buffer position. > > > > Yes. I see no reason to change that. > > > > > But if you already picked up several characters > > > via look-ahead, that is not true, and you will have to return back > > > several character positions, in order to continue on the next screen > > > line. > > > > You're describing why look-ahead is difficult: a while ago, you > > appeared to be saying it wasn't. This confuses me. > > > > Obviously, when I say "look-ahead", I mean receiving the next display > > elements an iterator would produce if it were actually advanced, > > without advancing it. > That's not what you said earlier: I think it is what I said. > > > > > > You write: "(b) is not really feasible without redesigning the entire > > > > > > Emacs display engine". I don't see how that's true at all. All we need > > > > > > is some limited look-ahead. 
> > > > > > > > > > We already have look-ahead: that's what the regexp part of the > > > > > composition rules are about. That is not the crucial problem. > > > > > > > > But it's the only problem I see! > > > > > > Then maybe I don't understand what you mean by look-ahead. Is that > > > the decision how to choose those 32 characters of "context"? > > > > Yes. > > Here you said that look-ahead means how to _choose_ the context. The distinction escapes me: look-ahead is how to get the context for a character, obviously without ruining any persistent state. I'm puzzled as to what else it could have meant. > > > If we want the shaper to handle all the text we display, > > > > Do we? A while back you said Lisp control over compositions was an > > important feature, and I'm inclined to think we shouldn't break the > > existing composition code. > > > > > we should go all the way and do it for any text, ASCII, non-ASCII, > > > symbols, emoji, everything. > > > > Are you suggesting I'm somehow limiting myself to ASCII? Let me assure > > you that's not the case. > > Then I really don't understand what problem are you trying to solve. Ligatures and kerning. > Let's try again from the beginning: which parts of the code that > implements automatic compositions are you trying to avoid, > and why? I'm not trying to avoid any of it! I just see no reason to use any of it, so far, because the part we have in common is about a dozen lines of code around the call to hb_shape. > Is that the part that identifies the "context" via regular > expressions? If so, then this problem needs to be solved by some > alternative; using an arbitrary chosen fixed number of characters is > not suitable for production. I'm puzzled as to how these regular expressions, which only work when they match fixed-length strings, as far as I can tell, are worse than a fixed-length context. 
You're right that the number shouldn't be hardcoded in Emacs, and shouldn't be arbitrary, but obviously there has to be a limit shorter than a word or paragraph. (The composite.c code currently hardcodes a limit of 500 characters). (And as I've said repeatedly, this is a deficiency specifically in HarfBuzz: the OpenType format makes it very easy to tell what the longest pattern is and how much context is needed. HarfBuzz should pass on that information, ideally by providing an incremental asynchronous API that requests only as much context as is needed until the glyphs in question can be returned.) > You haven't yet shown any viable alternative. To what? We still haven't seen any actual regular expressions that work. You just keep saying "regular expressions" like that's a solution, rather than simply constituting a restriction on the set of possible solutions. And keep in mind that this context is used only for deciding what the "current" glyph looks like: the next glyph will have its own context, which might or might not be different. What I'm currently playing with is something that I'm not sure is even expressible as a regexp: starting with the character at point, keep adding surrounding characters unless doing so would create a delimiter-nondelimiter boundary after the first char, or a nondelimiter-delimiter boundary before the last char, but limit the whole thing to 16 characters each way. As I've explained, it would be much better to let HarfBuzz tell us whether to provide more context, but even then we'd need a cut-off: imagine a file containing a gigabyte of 'f's. > Assuming that the alternative for selecting the "context" is found, > and composite.c is augmented to apply it instead of the regexps, why > not use the rest of the automatic composition code to produce the > glyphs and display them? 
I chose not to do that for a patch which I have stated repeatedly was not in any way a finalized design, and I don't see any good reason to do it for a real patch, either, so far. (I'll be honest: I strongly suspect that the code is too slow, we know it to be buggy, and it's simply too different from what I actually want to benefit from sharing the code). > The code which does that exists and works, (I suspect: slowly) > and is tested by years of use. It's unusable for me in Emacs 26.3. > It already solves the problems of look-ahead, If it does so efficiently, I'll certainly try reusing that code. But I strongly suspect it doesn't. > of wrapping long lines, Very poorly, for my purposes. > and others, including (but not limited to) the dreaded bidi thing. Looking for "bidi" in composite.c, the only relevant thing I see is a FIXME. > Why reinvent that wheel when we already have it, and it works well? First, because it doesn't work that well for my purposes; second, precisely because it works well for the purposes of others, and I'd like to have as little impact as possible on existing use cases. They should just continue working, and so far they do. > > > and on top of that solve only a small part of the > > > underlying problem. > > > > Ligatures and kerning (right now, for LTR text). Is that a small > > problem because of the lack of RTL support? > > Yes, of course. Why? I honestly don't see what's bad about a patch that improves things for most languages and doesn't affect RTL languages (which, as you point out, have existing support). The code shouldn't break horribly for RTL text (it doesn't). If it works, that's great; if it doesn't work and leaves things unshaped, that's the existing behavior, and auto-composition-mode will still work if enabled. > An acceptable solution should support any text Emacs > supports. By that standard, bidi.c and composite.c are unacceptable. 
> What's more, we already have the code which implements all
> that, so I don't understand why you want to bypass it.

We have something that superficially results in a similar screen
layout to what I want, but that actually represents display elements
in a way that makes them unusable for my purposes.

^ permalink raw reply [flat|nested] 145+ messages in thread
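[Editorial sketch] The context-selection rule described in the message above (grow outwards from the character at point, refuse to create a delimiter/non-delimiter boundary after the first character or a non-delimiter/delimiter boundary before the last one, and cap the growth at 16 characters each way) admits more than one reading. The following is one possible reading, not the code from the patch under discussion. The names `select_context`, `is_delim` and `MAX_CONTEXT` are hypothetical, and treating ASCII whitespace as the only delimiter class is an assumption made purely for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

enum { MAX_CONTEXT = 16 };  /* cap in each direction, as in the message */

/* Assumption: ASCII whitespace is the only delimiter class. */
static int is_delim(char c)
{
  return isspace((unsigned char) c) != 0;
}

/* Compute the half-open context range [*start, *end) around POS
   in TEXT[0..LEN).  POS itself is always included. */
static void select_context(const char *text, size_t len, size_t pos,
                           size_t *start, size_t *end)
{
  assert(pos < len);
  size_t s = pos, e = pos + 1;

  /* Grow leftwards.  Stop if the new first character would be a
     delimiter directly followed by a non-delimiter. */
  while (s > 0 && pos - s < MAX_CONTEXT
         && !(is_delim(text[s - 1]) && !is_delim(text[s])))
    s--;

  /* Grow rightwards.  Stop if the new last character would be a
     delimiter directly preceded by a non-delimiter. */
  while (e < len && e - (pos + 1) < MAX_CONTEXT
         && !(!is_delim(text[e - 1]) && is_delim(text[e])))
    e++;

  *start = s;
  *end = e;
}
```

Under this reading, a character inside a word gets exactly that word as context, while a delimiter gets the words on both sides of it. A long run of identical characters (the "gigabyte of 'f's" case) is bounded by the cap: at most 16 characters before and 16 after the character at point.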
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-26 18:13 ` Pip Cet @ 2020-05-26 19:46 ` Eli Zaretskii 2020-05-27 9:36 ` Pip Cet 0 siblings, 1 reply; 145+ messages in thread From: Eli Zaretskii @ 2020-05-26 19:46 UTC (permalink / raw) To: Pip Cet; +Cc: emacs-devel > From: Pip Cet <pipcet@gmail.com> > Date: Tue, 26 May 2020 18:13:55 +0000 > Cc: cpitclaudel@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > > > Assuming that the alternative for selecting the "context" is found, > > and composite.c is augmented to apply it instead of the regexps, why > > not use the rest of the automatic composition code to produce the > > glyphs and display them? > > I chose not to do that for a patch which I have stated repeatedly was > not in any way a finalized design, and I don't see any good reason to > do it for a real patch, either, so far. Why not? How about trying to do that before giving up? > (I'll be honest: I strongly suspect that the code is too slow, we know > it to be buggy, and it's simply too different from what I actually > want to benefit from sharing the code). > > > The code which does that exists and works, > > (I suspect: slowly) Any measurements to back that up? E.g., is scrolling through etc/HELLO especially slow, once all the fonts were loaded (i.e. each character in the file was displayed at least once)? > > and is tested by years of use. > > It's unusable for me in Emacs 26.3. How so? what doesn't work? (And why are you using Emacs 26 and not Emacs 27, where we support HarfBuzz and made several improvements and bugfixes in the character composition area?) > > It already solves the problems of look-ahead, > > If it does so efficiently, I'll certainly try reusing that code. But I > strongly suspect it doesn't. Why suspect? why not try and see what does and doesn't work, what is and isn't efficient? > > of wrapping long lines, > > Very poorly, for my purposes. How so? what doesn't wrap correctly, and why? 
> > and others, including (but not limited to) the dreaded bidi thing. > > Looking for "bidi" in composite.c, the only relevant thing I see is a FIXME. That's because you look in the wrong place. Once again, try looking at etc/HELLO, there are portions of it that need both bidi and compositions. I can explain how it works (the code is spread over several files), but please believe me that it does, it passed the HarfBuzz developers' eyes most of whom are native Arabic and Farsi speakers, and wouldn't allow us to display Arabic script incorrectly. The whole point of using the existing code is that you don't _need_ to understand how exactly we handle the bidi reordering when character compositions are required. It just works, for all you care. It did take several iterations to get right at the time; why would you want to repeat all that, when the code is there to use and extend? > > Why reinvent that wheel when we already have it, and it works well? > > First, because it doesn't work that well for my purposes; What doesn't work? please be specific. > second, precisely because it works well for the purposes of others, > and I'd like to have as little impact as possible on existing use > cases. They should just continue working, and so far they do. You are thinking of breaking those other cases by your changes? But we haven't yet established that changes are needed, let alone which changes. How do you know you will break anything at all? > > > Ligatures and kerning (right now, for LTR text). Is that a small > > > problem because of the lack of RTL support? > > > > Yes, of course. > > Why? Because the features you are talking about should "just work" in Emacs. Not only for some use cases and some scripts -- that is not how we develop features. Features that work only for some cases are broken and will draw bug reports. They make Emacs look unclean and unprofessional. 
And there's no need to add such half-broken features because code that supports much broader class of use cases already exists, you just need to use it and maybe extend and augment it a bit. > The code shouldn't break horribly for RTL text (it doesn't). It _will_ break for RTL text, you just didn't yet see it because you only tested it in simple use cases. UAX#9 defines a lot of optional features, including multi-level directional overrides and embeddings, it isn't just right-to-left vs left-to-right. Again, there's no need for you to reinvent this wheel, we already have it figured out. > > What's more, we already have the code which implements all > > that, so I don't understand why you want to bypass it. > > We have something that superficially results in a similar screen > layout to what I want, but that actually represents display elements > in a way that makes them unusable for my purposes. Then please describe what doesn't fit your purpose, and let's focus on extending the existing code to do what's missing. Throwing everything away and starting anew is not the right way, it's a huge waste of energy and time to implement something that we already have. It is also a maintenance burden in the long run. Please note: I'm not talking about the regexp part -- that part you anyway will need to decide how to extend or augment. I'm telling you right here and now that blindly taking a fixed amount of surrounding text will not be acceptable. You can either come up with some smarter regexp (and you are wrong: the regexps in composition-function-table do NOT have to match only fixed strings, you can see that they don't in the part of the table we set up for the Arabic script); or you can decide on something more complex, like a function. Either way, the amount of text that this will pick up and pass to the shaper should be reasonable and should be determined by some understandable rules. And those rules must be controllable from Lisp. 
But that is a separate part of the problem that you will need to solve, and you will need to solve it whether or not you use character compositions. What I _am_ saying is that the rest of the machinery that implements automatic compositions does exactly what you need: it calls the shaper, handling LTR and RTL text as needed, then lays out the glyphs the shaper returns in a way that handles all the usual stuff our users expect, such as line wrapping and truncation. It is silly to disregard that code, so please don't. ^ permalink raw reply [flat|nested] 145+ messages in thread
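[Editorial sketch] The point made above, that a regexp need not match only fixed strings, can be illustrated outside Emacs with POSIX regular expressions. This is not how `composition-function-table` is consulted, and Emacs regexp syntax differs from POSIX extended syntax. The pattern and the function name `candidate_len` are hypothetical, chosen only to show one pattern matching strings of several lengths up to a bound.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Return the length of the ligature candidate at the start of S,
   or 0 if S does not start with one.  The (made-up) pattern is an
   'f' followed by one to three characters out of f, i, l, so a
   single regexp matches strings of length 2, 3 or 4. */
static size_t candidate_len(const char *s)
{
  regex_t re;
  regmatch_t m;
  size_t len = 0;

  int rc = regcomp(&re, "^f[fil]{1,3}", REG_EXTENDED);
  assert(rc == 0);
  (void) rc;  /* silence unused-variable warnings under NDEBUG */

  if (regexec(&re, s, 1, &m, 0) == 0)
    len = (size_t) (m.rm_eo - m.rm_so);

  regfree(&re);
  return len;
}
```

For example, `candidate_len("fi")` is 2 and `candidate_len("ffi!")` is 3, while a long run of 'f' characters is cut off at 4 by the upper bound in the pattern. A real implementation would compile the pattern once rather than on every call.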
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-26 19:46 ` Eli Zaretskii @ 2020-05-27 9:36 ` Pip Cet 2020-05-27 17:13 ` Eli Zaretskii 0 siblings, 1 reply; 145+ messages in thread From: Pip Cet @ 2020-05-27 9:36 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel On Tue, May 26, 2020 at 7:46 PM Eli Zaretskii <eliz@gnu.org> wrote: > > From: Pip Cet <pipcet@gmail.com> > > Date: Tue, 26 May 2020 18:13:55 +0000 > > Cc: cpitclaudel@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > > > > > Assuming that the alternative for selecting the "context" is found, > > > and composite.c is augmented to apply it instead of the regexps, why > > > not use the rest of the automatic composition code to produce the > > > glyphs and display them? > > > > I chose not to do that for a patch which I have stated repeatedly was > > not in any way a finalized design, and I don't see any good reason to > > do it for a real patch, either, so far. > > Why not? Which part are you asking about? I don't see any good reason because I've read the composite.c code (I'm not ignoring it), with an eye to reusing what's reusable, and come up empty. But you've convinced me I need to do a careful rereading. > > > The code which does that exists and works, > > > > (I suspect: slowly) > > Any measurements to back that up? Yes. With a regexp of "....", the composite.c code takes 175 billion cycles to display every line of composite.c. My code takes 144 billion cycles, with a lookahead/lookbehind each set to 128 but limiting it as described. > E.g., is scrolling through > etc/HELLO especially slow, once all the fonts were loaded (i.e. each > character in the file was displayed at least once)? > (And why are you using Emacs 26 and not > Emacs 27, where we support HarfBuzz and made several improvements and > bugfixes in the character composition area?) Because I was trying to test your implication that all this was usable years ago. It wasn't. 
I'm not using Emacs 26 :-) > > > It already solves the problems of look-ahead, > > > > If it does so efficiently, I'll certainly try reusing that code. But I > > strongly suspect it doesn't. > > Why suspect? why not try and see what does and doesn't work, what is > and isn't efficient? I have, now, coming up with the above measurement which confirms my suspicion. > > > and others, including (but not limited to) the dreaded bidi thing. > > > > Looking for "bidi" in composite.c, the only relevant thing I see is a FIXME. > > That's because you look in the wrong place. What's the right place? I'm using all the code in bidi.c, of course, so as far as I can tell what I'm not doing is using composite.c... > Once again, try looking > at etc/HELLO, there are portions of it that need both bidi and > compositions. I can explain how it works (the code is spread over > several files), but please believe me that it does, it passed the > HarfBuzz developers' eyes most of whom are native Arabic and Farsi > speakers, and wouldn't allow us to display Arabic script incorrectly. > > The whole point of using the existing code is that you don't _need_ to > understand how exactly we handle the bidi reordering when character > compositions are required. But that's true without using the existing code! > It just works, for all you care. It did > take several iterations to get right at the time; why would you want > to repeat all that, when the code is there to use and extend? > > second, precisely because it works well for the purposes of others, > > and I'd like to have as little impact as possible on existing use > > cases. They should just continue working, and so far they do. > > You are thinking of breaking those other cases by your changes? No! If I break them, that's a severe bug in my code! 
> But > we haven't yet established that changes are needed, "Enter"ing ligature glyphs is definitely something we need to do before any user can reasonably use variable-pitch fonts with ligatures for displaying English text. Kerning is another such thing. Both don't work with the current code. > Because the features you are talking about should "just work" in > Emacs. > Not only for some use cases and some scripts -- that is not > how we develop features. Features that work only for some cases are > broken and will draw bug reports. They make Emacs look unclean and > unprofessional. Not as much as the current lack of support does. > And there's no need to add such half-broken features because code that > supports much broader class of use cases already exists, you just need > to use it and maybe extend and augment it a bit. I don't think I agree with the "a bit". > > The code shouldn't break horribly for RTL text (it doesn't). > > It _will_ break for RTL text, you just didn't yet see it because you > only tested it in simple use cases. UAX#9 defines a lot of optional > features, including multi-level directional overrides and embeddings, > it isn't just right-to-left vs left-to-right. I assume bidi.c handles that, as it does for composite.c? > > > What's more, we already have the code which implements all > > > that, so I don't understand why you want to bypass it. > > > > We have something that superficially results in a similar screen > > layout to what I want, but that actually represents display elements > > in a way that makes them unusable for my purposes. > > Then please describe what doesn't fit your purpose, and let's focus on > extending the existing code to do what's missing. 
The three main things are:
- "entering" glyphs, instead of treating them as atomic
- providing context automatically rather than by providing specific
  regexps for it in advance
- kerning, which requires context for every character

Secondary concerns:
- ligatures that come partly from a display property and partly from
  the buffer (composite.c doesn't allow for those, as far as I can tell)

> Please note: I'm not talking about the regexp part -- that part you
> anyway will need to decide how to extend or augment. I'm telling you
> right here and now that blindly taking a fixed amount of surrounding
> text will not be acceptable. You can either come up with some smarter
> regexp (and you are wrong: the regexps in composition-function-table
> do NOT have to match only fixed strings, you can see that they don't
> in the part of the table we set up for the Arabic script);

Again, I think the limits are fixed: 4 characters of history and 500
characters of look-ahead. What am I missing?

> or you can
> decide on something more complex, like a function. Either way, the
> amount of text that this will pick up and pass to the shaper should be
> reasonable and should be determined by some understandable rules. And
> those rules must be controllable from Lisp.

That last part isn't true for the composite.c code, which imposes a
limit of 4 characters of history and 500 characters of look-ahead, as
far as I can tell. But, sure, if that's a requirement, I'll keep it in
mind.

> But that is a separate part of the problem that you will need to
> solve, and you will need to solve it whether or not you use character
> compositions. What I _am_ saying is that the rest of the machinery
> that implements automatic compositions does exactly what you need: it
> calls the shaper, handling LTR and RTL text as needed, then lays out
> the glyphs the shaper returns in a way that handles all the usual
> stuff our users expect, such as line wrapping and truncation.
> It is silly to disregard that code, so please don't.

You've convinced me that it's worth reading it again, more carefully,
but I'm not optimistic I'll come to a different conclusion this time
around.

^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-27 9:36 ` Pip Cet @ 2020-05-27 17:13 ` Eli Zaretskii 2020-05-27 18:42 ` Pip Cet 0 siblings, 1 reply; 145+ messages in thread From: Eli Zaretskii @ 2020-05-27 17:13 UTC (permalink / raw) To: Pip Cet; +Cc: emacs-devel > From: Pip Cet <pipcet@gmail.com> > Date: Wed, 27 May 2020 09:36:52 +0000 > Cc: emacs-devel@gnu.org > > > Any measurements to back that up? > > Yes. With a regexp of "....", the composite.c code takes 175 billion > cycles to display every line of composite.c. My code takes 144 billion > cycles, with a lookahead/lookbehind each set to 128 but limiting it as > described. What did you compare, exactly? On the one hand, the code you posted here, which took 128 characters around each character to be displayed? any other changes in the code you posted here? And what does "limiting it as described" mean here? And on the other hand, the existing automatic composition machinery? With what setup of composition-function-table, exactly? And finally, which code was included in the count of cycles? > > > > and others, including (but not limited to) the dreaded bidi thing. > > > > > > Looking for "bidi" in composite.c, the only relevant thing I see is a FIXME. > > > > That's because you look in the wrong place. > > What's the right place? I'm using all the code in bidi.c, of course, No, actually you don't. Your make_context copies characters in strict logical order, bypassing bidi.c, and by that also potentially crossing boundaries of different directionality (and even line and paragraph boundaries), which is a no-no in text shaping. Then, after you call the shaper, you don't reorder the glyphs it delivers, so they will look on display in the wrong order. And there may be other subtle issues as well -- this stuff was finalized so long ago that I'm not even sure I remember all the details of what needed to be done to get it right. 
> > > The code shouldn't break horribly for RTL text (it doesn't). > > > > It _will_ break for RTL text, you just didn't yet see it because you > > only tested it in simple use cases. UAX#9 defines a lot of optional > > features, including multi-level directional overrides and embeddings, > > it isn't just right-to-left vs left-to-right. > > I assume bidi.c handles that, as it does for composite.c? Yes, but only _if_you_use_them_correctly_! If you bypass them, then all bets are off. > > > We have something that superficially results in a similar screen > > > layout to what I want, but that actually represents display elements > > > in a way that makes them unusable for my purposes. > > > > Then please describe what doesn't fit your purpose, and let's focus on > > extending the existing code to do what's missing. > > The three main things are: > - "entering" glyphs, instead of treating them as atomic Why is that needed? A ligature is a single display entity, that's why fonts ligate. Why would we want to break ligatures when we wrap lines? > - providing context automatically rather than by providing specific > regexps for it in advance That's a separate part of the problem; I wasn't talking about it. It needs a separate solution (which was not yet presented), but the solution doesn't have to be based on regexps if a better or smarter or faster way is available. Extending composition-function-table to support context definition by means other than regexp is easy and doesn't disrupt the way the code works. > - kerning, which requires context for every character That's again about that separate part of the problem, because once the context was determined correctly, the shaper will perform the kerning for you. > - ligatures that come partly from a display property and partly from > the buffer (composite.c doesn't allow for those, as far as I can tell) It doesn't and it shouldn't! 
Text of display strings and overlay strings is completely isolated from buffer text, and is even bidi-reordered independently. This is by design. These strings are more akin to images than to a part of buffer text, so mixing them with buffer text on display would be a grave mistake. > > Please note: I'm not talking about the regexp part -- that part you > > anyway will need to decide how to extend or augment. I'm telling you > > right here and now that blindly taking a fixed amount of surrounding > > text will not be acceptable. You can either come up with some smarter > > regexp (and you are wrong: the regexps in composition-function-table > > do NOT have to match only fixed strings, you can see that they don't > > in the part of the table we set up for the Arabic script); > > Again, I think the limits are fixed: 4 characters of history and 500 > characters of look-ahead. What am I missing? Fixed limits and fixed strings are two different things. You can match strings of many different lengths up to a limit. The 3 previous characters are rarely needed, certainly not for English ligatures, because you can detect the sequence by the first character. So this is rarely a limitation; but again, it can be expanded if needed with little if any effect on the code. (And where did you see the 500-character limitation of look-ahead?) Anyway, you again focus on the (separate) issue of determining the context. Whereas I was mainly talking about what happens _after_ you determine the context: how do you collect the characters to pass to the shaper, how you present to the layout code the glyphs returned by the shaper, and how you lay out those glyphs by inserting them into the glyph rows of the glyph matrix. It is this code that I see no reason to modify, definitely not significantly. > > or you can > > decide on something more complex, like a function. 
Either way, the > > amount of text that this will pick up and pass to the shaper should be > > reasonable and should be determined by some understandable rules. And > > those rules must be controllable from Lisp. > > That last part isn't true for the composite.c code, which imposes a > limit of 4 characters of history and 500 characters of look-ahead How do those limits violate the above requirement? The 3-char prev-chars limit is "reasonable" because we currently don't need more, and the other limit doesn't exist AFAICT -- you could make a regexp that matched very long strings, if needed. And the rules to use to set up the regexp are definitely "understandable" and can be controlled from Lisp. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-27 17:13 ` Eli Zaretskii @ 2020-05-27 18:42 ` Pip Cet 2020-05-27 19:19 ` Eli Zaretskii 0 siblings, 1 reply; 145+ messages in thread From: Pip Cet @ 2020-05-27 18:42 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel On Wed, May 27, 2020 at 5:13 PM Eli Zaretskii <eliz@gnu.org> wrote: > > From: Pip Cet <pipcet@gmail.com> > > Date: Wed, 27 May 2020 09:36:52 +0000 > > Cc: emacs-devel@gnu.org > > > > > Any measurements to back that up? > > > > Yes. With a regexp of "....", the composite.c code takes 175 billion > > cycles to display every line of composite.c. My code takes 144 billion > > cycles, with a lookahead/lookbehind each set to 128 but limiting it as > > described. > > What did you compare, exactly? On the one hand, the code you posted > here, which took 128 characters around each character to be displayed? No. Not anything like that code. > any other changes in the code you posted here? And what does > "limiting it as described" mean here? I described the algorithm for selecting context. > And on the other hand, the existing automatic composition machinery? > With what setup of composition-function-table, exactly? As I said, a regexp of "....". > And finally, which code was included in the count of cycles? All of it. There's no reason to believe the composite.c regexp design will perform adequately. It doesn't. > > > > > and others, including (but not limited to) the dreaded bidi thing. > > > > > > > > Looking for "bidi" in composite.c, the only relevant thing I see is a FIXME. > > > > > > That's because you look in the wrong place. > > > > What's the right place? I'm using all the code in bidi.c, of course, > > No, actually you don't. > Your make_context copies characters in strict > logical order, bypassing bidi.c My current code doesn't. 
> , and by that also potentially crossing > boundaries of different directionality (and even line and paragraph > boundaries), which is a no-no in text shaping. Then, after you call > the shaper, you don't reorder the glyphs it delivers, so they will > look on display in the wrong order. I do now. > And there may be other subtle > issues as well -- this stuff was finalized so long ago that I'm not > even sure I remember all the details of what needed to be done to get > it right. (It's not enough. Open emacs -Q etc/HELLO, place point on the lam in "aleikum", and hit control-space. The shape changes to something incorrect.) > > > > The code shouldn't break horribly for RTL text (it doesn't). > > > > > > It _will_ break for RTL text, you just didn't yet see it because you > > > only tested it in simple use cases. UAX#9 defines a lot of optional > > > features, including multi-level directional overrides and embeddings, > > > it isn't just right-to-left vs left-to-right. > > > > I assume bidi.c handles that, as it does for composite.c? > > Yes, but only _if_you_use_them_correctly_! If you bypass them, then > all bets are off. Obviously. > > > > We have something that superficially results in a similar screen > > > > layout to what I want, but that actually represents display elements > > > > in a way that makes them unusable for my purposes. > > > > > > Then please describe what doesn't fit your purpose, and let's focus on > > > extending the existing code to do what's missing. > > > > The three main things are: > > - "entering" glyphs, instead of treating them as atomic > > Why is that needed? A ligature is a single display entity, that's why > fonts ligate. "ffi" is not. When I enter "official" C-a C-f C-f, point MUST be on the second f. > Why would we want to break ligatures when we wrap > lines? Who said we do? I personally like it, but it's obviously not something we should do by default? 
> > - providing context automatically rather than by providing specific > > regexps for it in advance > > That's a separate part of the problem; I wasn't talking about it. It > needs a separate solution (which was not yet presented), but the > solution doesn't have to be based on regexps if a better or smarter or > faster way is available. Extending composition-function-table to > support context definition by means other than regexp is easy and > doesn't disrupt the way the code works. > > > - kerning, which requires context for every character > > That's again about that separate part of the problem, because once the > context was determined correctly, the shaper will perform the kerning > for you. > > - ligatures that come partly from a display property and partly from > > the buffer (composite.c doesn't allow for those, as far as I can tell) > > It doesn't and it shouldn't! Text of display strings and overlay > strings is completely isolated from buffer text, and is even > bidi-reordered independently. This is by design. Unacceptable design for my use case, then. I don't see how revealing buffer text that has a replacing display property, rather than the replacement, is good design. The results of putting display properties on autocompositions are...entertaining, in any case. I've now got an "x" character that C-x = tells me is an "i"... > These strings are > more akin to images than to a part of buffer text, so mixing them with > buffer text on display would be a grave mistake. No, it wouldn't be. If two letters appear with no intervening space, they need to be kerned and ligated if appropriate, no matter where they come from. If people want a ZWNJ, that's perfectly available to them. > > > Please note: I'm not talking about the regexp part -- that part you > > > anyway will need to decide how to extend or augment. I'm telling you > > > right here and now that blindly taking a fixed amount of surrounding > > > text will not be acceptable. 
You can either come up with some smarter > > > regexp (and you are wrong: the regexps in composition-function-table > > > do NOT have to match only fixed strings, you can see that they don't > > > in the part of the table we set up for the Arabic script); > > > > Again, I think the limits are fixed: 4 characters of history and 500 > > characters of look-ahead. What am I missing? > > Fixed limits and fixed strings are two different things. You can > match strings of many different lengths up to a limit. Which effectively means you can match strings of that limited length. > The 3 previous characters are rarely needed, certainly not for English > ligatures, because you can detect the sequence by the first character. Precisely the same argument applies to my 16-character limit. A script in which a glyph depends on something happening 16 codepoints onwards, or back, is extremely unlikely. > Anyway, you again focus on the (separate) issue of determining the > context. Whereas I was mainly talking about what happens _after_ you > determine the context: how do you collect the characters to pass to > the shaper, how you present to the layout code the glyphs returned by > the shaper, and how you lay out those glyphs by inserting them into > the glyph rows of the glyph matrix. It is this code that I see no > reason to modify, definitely not significantly. It needs to be modified, significantly, to support entering glyphs, to support kerning, and to support things like ligating across a buffer text / display string boundary. > > > or you can > > > decide on something more complex, like a function. Either way, the > > > amount of text that this will pick up and pass to the shaper should be > > > reasonable and should be determined by some understandable rules. And > > > those rules must be controllable from Lisp. 
> > > > That last part isn't true for the composite.c code, which imposes a > > limit of 4 characters of history and 500 characters of look-ahead > > How do those limits violate the above requirement? The 3-char > prev-chars limit is "reasonable" because we currently don't need more, It's hardcoded in C, though. A 16-character limit, as explained above, is perfectly "reasonable" for determining the shape of a single glyph. > and the other limit doesn't exist AFAICT -- you could make a regexp > that matched very long strings, if needed. Hmm. I thought I saw weirdness around the 500th character, but it's probably one of the other bugs. But, seriously, you're still willing to argue that point shouldn't be able to enter the "ffi" glyph? Not even if the user wants it? Because if so, I suggest we interrupt the discussion here. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-27 18:42 ` Pip Cet @ 2020-05-27 19:19 ` Eli Zaretskii 0 siblings, 0 replies; 145+ messages in thread From: Eli Zaretskii @ 2020-05-27 19:19 UTC (permalink / raw) To: Pip Cet; +Cc: emacs-devel > From: Pip Cet <pipcet@gmail.com> > Date: Wed, 27 May 2020 18:42:07 +0000 > Cc: emacs-devel@gnu.org > > > What did you compare, exactly? On the one hand, the code you posted > > here, which took 128 characters around each character to be displayed? > > No. Not anything like that code. Then your numbers cannot be meaningfully reasoned about, because no one knows what you did. > There's no reason to believe the composite.c regexp design will > perform adequately. It doesn't. I guess in your eyes only your code performs adequately. Sorry, this means any further discussion with you on these matters is futile. I regret to have wasted so much time trying to explain how this stuff works. I will try to be smarter next time when you ask some question. > (It's not enough. Open emacs -Q etc/HELLO, place point on the lam in > "aleikum", and hit control-space. The shape changes to something > incorrect.) A known limitation of our handling of faces in conjunction with character composition. Finding the reason is left as an exercise. > > > - "entering" glyphs, instead of treating them as atomic > > > > Why is that needed? A ligature is a single display entity, that's why > > fonts ligate. > > "ffi" is not. When I enter "official" C-a C-f C-f, point MUST be on > the second f. That doesn't require producing separate glyphs. > > It doesn't and it shouldn't! Text of display strings and overlay > > strings is completely isolated from buffer text, and is even > > bidi-reordered independently. This is by design. > > Unacceptable design for my use case, then. This is the design of the Emacs display engine. 
If it doesn't fit your case, your case cannot be had in Emacs without rewriting the display code. > No, it wouldn't be. If two letters appear with no intervening space, > they need to be kerned and ligated if appropriate, no matter where > they come from. If people want a ZWNJ, that's perfectly available to > them. That's not what display and overlay strings are for in Emacs. > > Fixed limits and fixed strings are two different things. You can > > match strings of many different lengths up to a limit. > > Which effectively means you can match strings of that limited length. Except that there's no limit, of course. > > The 3 previous characters are rarely needed, certainly not for English > > ligatures, because you can detect the sequence by the first character. > > Precisely the same argument applies to my 16-character limit. A script > in which a glyph depends on something happening 16 codepoints onwards, > or back, is extremely unlikely. You are wrong. Please read this: https://lists.freedesktop.org/archives/harfbuzz/2020-May/007517.html https://lists.freedesktop.org/archives/harfbuzz/2020-May/007521.html This is what is needed for doing ligatures The Right Way. Collecting an arbitrary number of codepoints doesn't cut it. And in any case, I was talking about the need to look _backward_, i.e. when the character that triggers the composition is not the first one in the sequence of the characters to be composed. This is usually needed as an optimization: if you have 2-character sequences where the second character is one of a much smaller set than the first, then using the second character as an anchor will use up less memory when you set up composition-function-table. A case in point is a base character and a diacritic. How many characters you need _forward_ is an entirely different issue. > It needs to be modified, significantly, to support entering glyphs, to > support kerning, and to support things like ligating across a buffer > text / display string boundary. 
Two of these are not needed or are outright wrong, and the third doesn't need anything, the shaper already does that with any text you pass through it. > But, seriously, you're still willing to argue that point shouldn't be > able to enter the "ffi" glyph? Not even if the user wants it? Because > if so, I suggest we interrupt the discussion here. See above. I indeed see no reason to continue this discussion, as evidently any progress here is impossible with your attitude in place. ^ permalink raw reply [flat|nested] 145+ messages in thread
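[Editorial sketch] The point above about anchoring a rule on the second character can be illustrated with a small toy model. This is not Emacs code: the table layout is only loosely modeled on the PATTERN / PREV-CHARS idea behind composition-function-table, and the names TABLE and find_composition are hypothetical. It shows how one entry keyed on a diacritic, with a look-back of one character, can cover every base letter, whereas anchoring on the base letter would need one entry per letter.

```python
import re

COMBINING_ACUTE = "\u0301"

# Hypothetical toy table: trigger character -> list of (pattern, prev_chars).
# prev_chars is how many characters before the trigger the match must start.
TABLE = {
    # Anchored on the first character: "f" opens the ligature candidates.
    "f": [(re.compile(r"ff[il]|f[fil]"), 0)],
    # Anchored on the second character: a single entry for the diacritic
    # covers every ASCII base letter, instead of one entry per base letter.
    COMBINING_ACUTE: [(re.compile("[a-zA-Z]" + COMBINING_ACUTE), 1)],
}


def find_composition(text, pos):
    """Return (start, end) of the run to hand to a shaper, or None."""
    for pattern, prev_chars in TABLE.get(text[pos], ()):
        start = pos - prev_chars
        if start < 0:
            continue  # not enough preceding text for the look-back
        match = pattern.match(text, start)
        if match and match.end() > pos:
            return (start, match.end())
    return None
```

In this model the look-back only decides where the match starts; how far the match extends forward is governed entirely by the pattern, which mirrors the distinction drawn in the message above.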
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-23 15:13 ` Pip Cet 2020-05-23 16:34 ` Eli Zaretskii @ 2020-05-23 17:32 ` Eli Zaretskii 2020-05-23 21:29 ` Pip Cet 1 sibling, 1 reply; 145+ messages in thread From: Eli Zaretskii @ 2020-05-23 17:32 UTC (permalink / raw) To: Pip Cet; +Cc: cpitclaudel, alan, emacs-devel > From: Pip Cet <pipcet@gmail.com> > Date: Sat, 23 May 2020 15:13:38 +0000 > Cc: cpitclaudel@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > > As I said, the problem I have is to get look-ahead working, which you > think isn't a problem. I've got an idea for it, but it doesn't work > (yet); my theory is the bidi.c code fails to keep its state in the > iterator and can't deal with multiple parallel iterators. > > > I could be wrong, though, so I'm looking > > forward to see you present a series of patches that do support the > > existing use cases and the ligatures as well, and don't cause any > > slowdown in redisplay. > > As I said, what's stopping me is the look-ahead problem, and in > particular some code in bidi.c that doesn't play along well with > look-ahead. I don't think you understand the depth of the issue. If we are going to send large chunks of text to the shaping engine, then none of the insane complexity of bidi.c makes sense; we should simply throw all of it away and use a very different, batch-style reordering code, of the kind you can find in the FriBidi library. The sole reason for bidi.c's existence is to produce character positions in the _visual_ order, one position at a time, something that no other bidi-aware editor does. Moreover, not even the basic iteration, one level above bidi.c, where we call get_next_display_element, then PRODUCE_GLYPHS, then set_iterator_to_next -- not even that makes sense. This basic loop exists only because we examine characters one by one, switching from buffer text to overlay or display strings, then back, as needed, and applying faces as we go. 
Doing this in large chunks calls for a very different structure of the code, and very different separation into layers. This needs to be carefully designed in advance in a clean and well-defined way, not lumped one patch upon another until it kinda works... ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-23 17:32 ` Eli Zaretskii @ 2020-05-23 21:29 ` Pip Cet 2020-05-24 15:19 ` Eli Zaretskii 0 siblings, 1 reply; 145+ messages in thread From: Pip Cet @ 2020-05-23 21:29 UTC (permalink / raw) To: Eli Zaretskii; +Cc: cpitclaudel, alan, emacs-devel On Sat, May 23, 2020 at 5:32 PM Eli Zaretskii <eliz@gnu.org> wrote: > > From: Pip Cet <pipcet@gmail.com> > > Date: Sat, 23 May 2020 15:13:38 +0000 > > Cc: cpitclaudel@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > > > > As I said, the problem I have is to get look-ahead working, which you > > think isn't a problem. I've got an idea for it, but it doesn't work > > (yet); my theory is the bidi.c code fails to keep its state in the > > iterator and can't deal with multiple parallel iterators. > > > > > I could be wrong, though, so I'm looking > > > forward to see you present a series of patches that do support the > > > existing use cases and the ligatures as well, and don't cause any > > > slowdown in redisplay. > > > > As I said, what's stopping me is the look-ahead problem, and in > > particular some code in bidi.c that doesn't play along well with > > look-ahead. > > I don't think you understand the depth of the issue. I think I do, actually. It's just that you'd prefer the display engine to be torn out by the roots and rewritten in one fell swoop, but that option isn't currently on the table. > If we are going > to send large chunks of text to the shaping engine, then none of the > insane complexity of bidi.c makes sense; we should simply throw all of > it away and use a very different, batch-style reordering code, of the > kind you can find in the FriBidi library. The sole reason for > bidi.c's existence is to produce character positions in the _visual_ > order, one position at a time, something that no other bidi-aware > editor does. 
Yes, except we're not talking about "large chunks of text" here: somehow you went from "we need only a bunch of regexps to catch ligatures" to "we need to send entire paragraphs to the shaping engine, nothing less will do". My opinion is that we need a reasonable amount of context, often just a single character, and I see no reason to throw out the entire display engine because we want some look-ahead. > Moreover, not even the basic iteration, one level above bidi.c, where > we call get_next_display_element, then PRODUCE_GLYPHS, then > set_iterator_to_next -- not even that makes sense. Again, a single character of lookahead in the typical case, four characters for most ligatures; that doesn't push us over the line to "only a complete rewrite makes sense". > This basic loop > exists only because we examine characters one by one, switching from > buffer text to overlay or display strings, then back, as needed, and > applying faces as we go. Doing this in large chunks calls for a very > different structure of the code, and very different separation into > layers. Indeed. Which is why I'm not talking about doing it in large chunks, at this point. Let's keep doing it character by character but add what little we need to in order to look ahead a little. > This needs to be carefully designed in advance in a clean and > well-defined way, not lumped one patch upon another until it kinda > works... I agree "just start hacking on it with no understanding of the code until things appear to start working" is a bad strategy. So is "first, redesign the universe". To me, it seems like what I want is a reasonable compromise: not large chunks of text, because we can't do that, but some context, enough to do kerning and deal with ligatures. Remember that this discussion started when I mentioned that I was unhappy with HarfBuzz, and I still am, precisely because of its "first, send me your entire document" approach. 
I don't think it's the right approach to take this design flaw of HarfBuzz for granted and conclude that we need to rewrite the Emacs display engine before we can get English ligatures to display properly. If, that is, we can get look-ahead to work. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-23 21:29 ` Pip Cet @ 2020-05-24 15:19 ` Eli Zaretskii 0 siblings, 0 replies; 145+ messages in thread From: Eli Zaretskii @ 2020-05-24 15:19 UTC (permalink / raw) To: Pip Cet; +Cc: cpitclaudel, alan, emacs-devel > From: Pip Cet <pipcet@gmail.com> > Date: Sat, 23 May 2020 21:29:32 +0000 > Cc: cpitclaudel@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > > > If we are going > > to send large chunks of text to the shaping engine, then none of the > > insane complexity of bidi.c makes sense; we should simply throw all of > > it away and use a very different, batch-style reordering code, of the > > kind you can find in the FriBidi library. The sole reason for > > bidi.c's existence is to produce character positions in the _visual_ > > order, one position at a time, something that no other bidi-aware > > editor does. > > Yes, except we're not talking about "large chunks of text" here: > somehow you went from "we need only a bunch of regexps to catch > ligatures" to "we need to send entire paragraphs to the shaping > engine, nothing less will do". The former is what we do now. If you want to treat fonts as software, then the HarfBuzz guys tell us we need to pass all the text through the shaper. > My opinion is that we need a reasonable amount of context, often > just a single character, and I see no reason to throw out the entire > display engine because we want some look-ahead. The problem is to determine how much of surrounding text is needed. The answer I was given was "all of it". So if we want to do it right, that is what we should do. What you propose stops short of that goal, so it's yet another partial solution. Doing that to avoid the need of specifying a fixed set of ligatures in advance sounds like a lot of pain for minimal gain to me. 
> Remember that this discussion started when I mentioned that I was > unhappy with HarfBuzz, and I still am, precisely because of its > "first, send me your entire document" approach. I'm familiar with 3 shaping engines, and they all behave like that. So this is not an idiosyncrasy of HarfBuzz, it's how text-shaping works in general. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures [not found] ` <831rnb0zld.fsf@gnu.org> 2020-05-23 12:36 ` Pip Cet @ 2020-05-23 12:47 ` Stefan Monnier 2020-05-23 13:10 ` Ligatures Eli Zaretskii ` (2 more replies) 1 sibling, 3 replies; 145+ messages in thread From: Stefan Monnier @ 2020-05-23 12:47 UTC (permalink / raw) To: Eli Zaretskii; +Cc: cpitclaudel, alan, Pip Cet, emacs-devel > The crucial problem is that we currently perform layout decisions one > grapheme cluster at a time, whereas what HarfBuzz people say is that > we should basically do that one screen line at a time. I wonder how it is supposed to work and how it works in other applications: Disregarding the theoretical question of whether a font can use ligatures that involve the LF character (and hence affect the definition of what is a line), I still see a chicken-and-egg problem: How do you know where the current "screen line" ends if you don't know how narrow/wide the font and its ligatures will render the text? Do current applications use a heuristic like "ligatures won't reduce the size by more than a factor 2, so estimate the lower bound on the final size to be at most half of what the font metrics say", so they will send up to twice as much text to be shaped as needed, and then they throw away the left overs? Stefan ^ permalink raw reply [flat|nested] 145+ messages in thread
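[Editorial sketch] The "factor 2" heuristic floated in the message above can be made concrete with a toy model. Everything here is hypothetical: toy_shape stands in for a real shaper, the only ligature is "ffi" (20 units wide instead of 30), and every other character has a nominal advance of 10 units. The function over-sends text by the assumed shrink factor, shapes it once, keeps the clusters that fit, and discards the rest.

```python
def toy_shape(text):
    """Hypothetical shaper: return a list of (cluster_text, advance)."""
    clusters, i = [], 0
    while i < len(text):
        if text.startswith("ffi", i):
            clusters.append(("ffi", 20))  # ligature narrower than 3 x 10
            i += 3
        else:
            clusters.append((text[i], 10))
            i += 1
    return clusters


def fill_line(text, start, line_width, nominal_advance=10, shrink_factor=2):
    """Return how many characters from `start` fit on one screen line."""
    # Characters that would fit if nothing ever ligated.
    nominal_fit = line_width // nominal_advance
    # Assume ligatures shrink text by at most `shrink_factor`, so send that
    # much more text to the shaper than the plain metrics suggest.
    chunk = text[start:start + nominal_fit * shrink_factor]
    used = consumed = 0
    for cluster_text, advance in toy_shape(chunk):
        if used + advance > line_width:
            break  # everything after this point is thrown away
        used += advance
        consumed += len(cluster_text)
    return consumed
```

Note that the cut at the end of the chunk can fall in the middle of a would-be ligature; in this model that only affects the discarded tail, which is reshaped when the next line starts.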
* Re: Ligatures 2020-05-23 12:47 ` Ligatures Stefan Monnier @ 2020-05-23 13:10 ` Eli Zaretskii 2020-05-23 13:45 ` Ligatures Stefan Monnier 2020-05-23 13:36 ` Ligatures 조성빈 2020-05-23 14:37 ` Ligatures Pip Cet 2 siblings, 1 reply; 145+ messages in thread From: Eli Zaretskii @ 2020-05-23 13:10 UTC (permalink / raw) To: Stefan Monnier; +Cc: cpitclaudel, alan, pipcet, emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: Pip Cet <pipcet@gmail.com>, cpitclaudel@gmail.com, alan@idiocy.org, > emacs-devel@gnu.org > Date: Sat, 23 May 2020 08:47:57 -0400 > > I wonder how it is supposed to work and how it works in other applications: I have no idea. If someone does, it would be good to hear the details. > Do current applications use a heuristic like "ligatures won't reduce the > size by more than a factor 2, so estimate the lower bound on the final > size to be at most half of what the font metrics say", so they will send > up to twice as much text to be shaped as needed, and then they throw > away the left overs? As I wrote elsewhere, HarfBuzz developers actually prefer to see the entire paragraph, not just screen line, because some shaping decisions depend on that. Not sure what the other applications do about that. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-23 13:10 ` Ligatures Eli Zaretskii @ 2020-05-23 13:45 ` Stefan Monnier 2020-05-23 14:12 ` Ligatures Eli Zaretskii 0 siblings, 1 reply; 145+ messages in thread From: Stefan Monnier @ 2020-05-23 13:45 UTC (permalink / raw) To: Eli Zaretskii; +Cc: cpitclaudel, alan, pipcet, emacs-devel >> Do current applications use a heuristic like "ligatures won't reduce the >> size by more than a factor 2, so estimate the lower bound on the final >> size to be at most half of what the font metrics say", so they will send >> up to twice as much text to be shaped as needed, and then they throw >> away the left overs? > As I wrote elsewhere, HarfBuzz developers actually prefer to see the > entire paragraph, not just screen line, because some shaping decisions > depend on that. Not sure what the other applications do about that. But the entire "paragraph" could be 10MB of text?! Sounds like making the "long lines problem" even worse than it already is. Stefan ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-23 13:45 ` Ligatures Stefan Monnier @ 2020-05-23 14:12 ` Eli Zaretskii 0 siblings, 0 replies; 145+ messages in thread From: Eli Zaretskii @ 2020-05-23 14:12 UTC (permalink / raw) To: Stefan Monnier; +Cc: cpitclaudel, alan, pipcet, emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: pipcet@gmail.com, cpitclaudel@gmail.com, alan@idiocy.org, > emacs-devel@gnu.org > Date: Sat, 23 May 2020 09:45:12 -0400 > > > As I wrote elsewhere, HarfBuzz developers actually prefer to see the > > entire paragraph, not just screen line, because some shaping decisions > > depend on that. Not sure what the other applications do about that. > > But the entire "paragraph" could be 10MB of text?! Yes. And? > Sounds like making the "long lines problem" even worse than it already is. Presumably, you use other algorithms and data structures to replace the slow parts we have now. But yes, this is one of the problems that would need to be solved by the new display engine. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-23 12:47 ` Ligatures Stefan Monnier 2020-05-23 13:10 ` Ligatures Eli Zaretskii @ 2020-05-23 13:36 ` 조성빈 2020-05-23 14:15 ` Ligatures Stefan Monnier 2020-05-23 14:37 ` Ligatures Pip Cet 2 siblings, 1 reply; 145+ messages in thread From: 조성빈 @ 2020-05-23 13:36 UTC (permalink / raw) To: Stefan Monnier; +Cc: Eli Zaretskii, cpitclaudel, alan, Pip Cet, emacs-devel Stefan Monnier <monnier@iro.umontreal.ca> 작성: >> The crucial problem is that we currently perform layout decisions one >> grapheme cluster at a time, whereas what HarfBuzz people say is that >> we should basically do that one screen line at a time. > > I wonder how it is supposed to work and how it works in other applications: I don’t know how much you know about text rendering, (I’m fairly confident that a previous Emacs maintainer knows more about this than me) but for people who are curious about this, I found the ’Text Rendering Hates You’[0] article which was very helpful for understanding the problem. [0]: https://gankra.github.io/blah/text-hates-you/ > Disregarding the theoretical question of whether a font can use > ligatures that involve the LF character (and hence affect the definition > of what is a line), I still see a chicken-and-egg problem: > How do you know where the current "screen line" ends if you don't know > how narrow/wide the font and its ligatures will render the text? > > Do current applications use a heuristic like "ligatures won't reduce the > size by more than a factor 2, so estimate the lower bound on the final > size to be at most half of what the font metrics say", so they will send > up to twice as much text to be shaped as needed, and then they throw > away the left overs? According to the article I mentioned, it’s just passing the total text repeatedly until it runs out of space. > You have to assume that your text fits on a single line and shape it > until you run out of space. 
At that point you can perform layout > operations and figure out where to break the text and start the next > line. Repeat until everything is shaped and laid out. > > Stefan ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-23 13:36 ` Ligatures 조성빈 @ 2020-05-23 14:15 ` Stefan Monnier 0 siblings, 0 replies; 145+ messages in thread From: Stefan Monnier @ 2020-05-23 14:15 UTC (permalink / raw) To: 조성빈 Cc: Eli Zaretskii, emacs-devel, cpitclaudel, Pip Cet, alan > According to the article I mentioned, it’s just passing the total text > repeatedly until it runs out of space. But wouldn't that inherently imply an O(N²) complexity? Stefan ^ permalink raw reply [flat|nested] 145+ messages in thread
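[Editorial sketch] The complexity question above depends on whether the shaper must process all remaining text for every line, or can stop once the line is full. The toy model below assumes the second case: toy_shape is a hypothetical lazy shaper that yields clusters on demand, which real shaping engines may not offer. Under that assumption each cluster is shaped once, plus one overflowing cluster per line that gets reshaped at the start of the next line, so the total work stays linear in the text length.

```python
def toy_shape(text, start):
    """Hypothetical lazy shaper: yield (n_chars, advance) clusters from start."""
    i = start
    while i < len(text):
        if text.startswith("ffi", i):
            yield 3, 20  # toy ligature, narrower than three plain glyphs
            i += 3
        else:
            yield 1, 10
            i += 1


def break_lines(text, line_width):
    """Shape until the line is full, break, repeat.

    Returns (lines, clusters_shaped) so the amount of shaper work is visible.
    """
    lines, pos, work = [], 0, 0
    while pos < len(text):
        used, end = 0, pos
        for n_chars, advance in toy_shape(text, pos):
            work += 1
            # Stop at the first cluster that does not fit, but always accept
            # at least one cluster per line so the loop makes progress.
            if used + advance > line_width and end > pos:
                break
            used += advance
            end += n_chars
        lines.append(text[pos:end])
        pos = end
    return lines, work
```

If the shaper instead had to consume everything from the break point to the end of the paragraph on each line, the same loop would do work proportional to the text length times the number of lines, which is where the quadratic worry comes from.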
* Re: Ligatures 2020-05-23 12:47 ` Ligatures Stefan Monnier 2020-05-23 13:10 ` Ligatures Eli Zaretskii 2020-05-23 13:36 ` Ligatures 조성빈 @ 2020-05-23 14:37 ` Pip Cet 2 siblings, 0 replies; 145+ messages in thread From: Pip Cet @ 2020-05-23 14:37 UTC (permalink / raw) To: Stefan Monnier; +Cc: Eli Zaretskii, emacs-devel, cpitclaudel, alan On Sat, May 23, 2020 at 12:48 PM Stefan Monnier <monnier@iro.umontreal.ca> wrote: > > The crucial problem is that we currently perform layout decisions one > > grapheme cluster at a time, whereas what HarfBuzz people say is that > > we should basically do that one screen line at a time. > > I wonder how it is supposed to work and how it works in other applications: That's why I'd like us to use a more advanced internal API rather than the limited HarfBuzz API, one that asynchronously requests information about preceding/following codepoints, incrementally informing us of the minimum width already reached, until it can reach a decision. It should be easy enough to put in some heuristics that work in practice until a better shaper comes along... > Do current applications use a heuristic like "ligatures won't reduce the > size by more than a factor 2, so estimate the lower bound on the final > size to be at most half of what the font metrics say", so they will send > up to twice as much text to be shaped as needed, and then they throw > away the left overs? I don't know. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-21 20:51 ` Clément Pit-Claudel 2020-05-21 21:16 ` Pip Cet @ 2020-05-22 11:44 ` Eli Zaretskii 2020-05-22 13:26 ` Clément Pit-Claudel 1 sibling, 1 reply; 145+ messages in thread From: Eli Zaretskii @ 2020-05-22 11:44 UTC (permalink / raw) To: Clément Pit-Claudel; +Cc: alan, pipcet, emacs-devel > Cc: alan@idiocy.org, emacs-devel@gnu.org > From: Clément Pit-Claudel <cpitclaudel@gmail.com> > Date: Thu, 21 May 2020 16:51:47 -0400 > > On 21/05/2020 15.08, Eli Zaretskii wrote: > > That would prevent Emacs from controlling what is and what isn't > > composed, leaving the shaper in charge. We currently allow Lisp to > > control that via composition-function-table, which provides a regexp > > that text around a character must match in order for the matching > > substring to be passed to the shaper. We never call the shaper unless > > composition-function-table tells us to do so. > > Does this mean that for each font we need to re-encode the font's logic for deciding whether to use a ligature? I don't think so, but I'm not yet sure I understand all the details of the use cases you have in mind. See also my responses to Pip Cet: perhaps they answer also your questions here. > Some concrete examples: in Iosevka (*, (**, (***, (**** etc are all displayed with the * character vertically centered relative to the (, but a lone * is not centered. In Fira Code, punctuation is context-aware, so the "+" in "A + B" is not the same as the "+" in "a + b". In both of these faces, arrows can be of any length, and in Fira Code you can even mix and match them (see https://raw.githubusercontent.com/tonsky/FiraCode/master/extras/arrows.png). How do you solve this in prettify-symbols-mode? In general, I envision that people would use the font they find acceptable for the ligatures they want/need in each mode or buffer where they need that. 
If for some reason different fonts could determine which ligatures you do NOT want to see, then I guess we will have to provide some easy-to-use UI for that, which would manipulate the relevant data structures under the hood. Alternatively each font could require a separate composition function to go with it. See, this is exactly part of the job that still awaits us: to figure out the various use cases for displaying ligatures in a buffer, and then provide the necessary user-facing features to adapt Emacs to each use case. The infrastructure for this already exists: it's the auto-composition-mode and composition-function-table that underlies it (although we may need to add something so that composition-function-table could be modified on a per-buffer basis), but we lack an easy-to-use UI and customization features that will allow users to use that machinery in practice. See the TODO item about ligatures; volunteers are welcome to work on that. > The documentation of Fira Code does recommend composition-function-table here: https://github.com/tonsky/FiraCode/wiki/Emacs-instructions, but it seems like a lot of extra work for each font, isn't it? That's for static compositions, not for automatic compositions. I was talking about the latter, and consider the former to be a semi-obsolete feature that we should eventually remove. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-22 11:44 ` Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) Eli Zaretskii @ 2020-05-22 13:26 ` Clément Pit-Claudel 2020-05-22 14:29 ` Eli Zaretskii 0 siblings, 1 reply; 145+ messages in thread From: Clément Pit-Claudel @ 2020-05-22 13:26 UTC (permalink / raw) To: Eli Zaretskii; +Cc: alan, pipcet, emacs-devel On 22/05/2020 07.44, Eli Zaretskii wrote: >> Cc: alan@idiocy.org, emacs-devel@gnu.org >> From: Clément Pit-Claudel <cpitclaudel@gmail.com> >> Date: Thu, 21 May 2020 16:51:47 -0400 >> >> On 21/05/2020 15.08, Eli Zaretskii wrote: >>> That would prevent Emacs from controlling what is and what isn't >>> composed, leaving the shaper in charge. We currently allow Lisp to >>> control that via composition-function-table, which provides a regexp >>> that text around a character must match in order for the matching >>> substring to be passed to the shaper. We never call the shaper unless >>> composition-function-table tells us to do so. >> >> Does this mean that for each font we need to re-encode the font's logic for deciding whether to use a ligature? > > I don't think so, but I'm not yet sure I understand all the details of > the use cases you have in mind. See also my responses to Pip Cet: > perhaps they answer also your questions here. > >> Some concrete examples: in Iosevka (*, (**, (***, (**** etc are all displayed with the * character vertically centered relative to the (, but a lone * is not centered. In Fira Code, punctuation is context-aware, so the "+" in "A + B" is not the same as the "+" in "a + b". In both of these faces, arrows can be of any length, and in Fira Code you can even mix and match them (see https://raw.githubusercontent.com/tonsky/FiraCode/master/extras/arrows.png). > > How do you solve this in prettify-symbols-mode? You don't, which is unfortunate. 
prettify-symbols-mode was extremely cool a few years ago when fonts with programming ligatures were mostly unheard of, and it's still extremely nice for things like prettifying lambda in λ, but for things like turning ascii arrows into pretty arrows it lags behind the more recent ligature stuff. > In general, I envision that people would use the font they find > acceptable for the ligatures they want/need in each mode or buffer > where they need that. If for some reason different fonts could > determine which ligatures you do NOT want to see, then I guess we will > have to provide some easy-to-use UI for that, which would manipulate > the relevant data structures under the hood. Alternatively each font > could require a separate composition function to go with it. It would be weird for Emacs to be the only program that requires re-encoding the entire ligature logic of each font it attempts to use. Different fonts offer different ligatures, and if I want to select a subset the font itself provides variants that let me do this. Meanwhile, I hope that we can make Emacs act like browsers or other editors in that if I select a font it will just, by default, use the ligatures that this font provides according to the logic embedded in the font. >> The documentation of Fira Code does recommend composition-function-table here: https://github.com/tonsky/FiraCode/wiki/Emacs-instructions, but it seems like a lot of extra work for each font, isn't it? > > That's for static compositions, not for automatic compositions. I was > talking about the latter, and consider the former to be a > semi-obsolete feature that we should eventually remove. I see. I need to read up on the difference. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-22 13:26 ` Clément Pit-Claudel @ 2020-05-22 14:29 ` Eli Zaretskii 2020-05-22 14:32 ` Clément Pit-Claudel 0 siblings, 1 reply; 145+ messages in thread From: Eli Zaretskii @ 2020-05-22 14:29 UTC (permalink / raw) To: Clément Pit-Claudel; +Cc: alan, pipcet, emacs-devel > Cc: pipcet@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > From: Clément Pit-Claudel <cpitclaudel@gmail.com> > Date: Fri, 22 May 2020 09:26:05 -0400 > > > In general, I envision that people would use the font they find > > acceptable for the ligatures they want/need in each mode or buffer > > where they need that. If for some reason different fonts could > > determine which ligatures you do NOT want to see, then I guess we will > > have to provide some easy-to-use UI for that, which would manipulate > > the relevant data structures under the hood. Alternatively each font > > could require a separate composition function to go with it. > > It would be weird for Emacs to be the only program that requires re-encoding the entire ligature logic of each font it attempts to use. Different fonts offer different ligatures, and if I want to select a subset the font itself provides variants that let me do this. Meanwhile, I hope that we can make Emacs act like browsers or other editors in that if I select a font it will just, by default, use the ligatures that this font provides according to the logic embedded in the font. If this is a real problem, it should be possible to have a function that will extract all the ligatures supported by a font, I think. But I don't think I agree with the "logic embedded in the font" part. I think we should let the user control which ligatures are really used. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-22 14:29 ` Eli Zaretskii @ 2020-05-22 14:32 ` Clément Pit-Claudel 2020-05-22 19:00 ` Eli Zaretskii 0 siblings, 1 reply; 145+ messages in thread From: Clément Pit-Claudel @ 2020-05-22 14:32 UTC (permalink / raw) To: Eli Zaretskii; +Cc: alan, pipcet, emacs-devel On 22/05/2020 10.29, Eli Zaretskii wrote: >> Cc: pipcet@gmail.com, alan@idiocy.org, emacs-devel@gnu.org >> From: Clément Pit-Claudel <cpitclaudel@gmail.com> >> Date: Fri, 22 May 2020 09:26:05 -0400 >> >>> In general, I envision that people would use the font they find >>> acceptable for the ligatures they want/need in each mode or buffer >>> where they need that. If for some reason different fonts could >>> determine which ligatures you do NOT want to see, then I guess we will >>> have to provide some easy-to-use UI for that, which would manipulate >>> the relevant data structures under the hood. Alternatively each font >>> could require a separate composition function to go with it. >> >> It would be weird for Emacs to be the only program that requires re-encoding the entire ligature logic of each font it attempts to use. Different fonts offer different ligatures, and if I want to select a subset the font itself provides variants that let me do this. Meanwhile, I hope that we can make Emacs act like browsers or other editors in that if I select a font it will just, by default, use the ligatures that this font provides according to the logic embedded in the font. > > If this is a real problem, it should be possible to have a function > that will extract all the ligatures supported by a font, I think. > > But I don't think I agree with the "logic embedded in the font" part. > I think we should let the user control which ligatures are really > used. I agree. We should let them control the logic, but that doesn't mean we have to force them to do so; which means we need a way to extract that logic, somehow. 
My understanding was that it could be quite complex, so there was no point in re-implementing it in ELisp. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-22 14:32 ` Clément Pit-Claudel @ 2020-05-22 19:00 ` Eli Zaretskii 0 siblings, 0 replies; 145+ messages in thread From: Eli Zaretskii @ 2020-05-22 19:00 UTC (permalink / raw) To: Clément Pit-Claudel; +Cc: alan, pipcet, emacs-devel > Cc: pipcet@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > From: Clément Pit-Claudel <cpitclaudel@gmail.com> > Date: Fri, 22 May 2020 10:32:30 -0400 > > > But I don't think I agree with the "logic embedded in the font" part. > > I think we should let the user control which ligatures are really > > used. > > I agree. We should let them control the logic, but that doesn't mean we have to force them to do so; which means we need a way to extract that logic, somehow. If we decide to enable only the ligatures that are supported by the default font, then yes, we should find a way of detecting which ones it supports. But if we find out that the list of the possible ligatures is fixed, we could by default enable all of them, and let the shaping engine deal with those that the font doesn't support. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-21 19:08 ` Eli Zaretskii 2020-05-21 20:51 ` Clément Pit-Claudel @ 2020-05-21 21:06 ` Pip Cet 2020-05-22 6:06 ` Eli Zaretskii 1 sibling, 1 reply; 145+ messages in thread From: Pip Cet @ 2020-05-21 21:06 UTC (permalink / raw) To: Eli Zaretskii; +Cc: cpitclaudel, alan, emacs-devel On Thu, May 21, 2020 at 7:08 PM Eli Zaretskii <eliz@gnu.org> wrote: > > From: Pip Cet <pipcet@gmail.com> > > Date: Thu, 21 May 2020 16:26:13 +0000 > > Cc: cpitclaudel@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > > > > On Thu, May 21, 2020 at 2:11 PM Eli Zaretskii <eliz@gnu.org> wrote: > > No. I didn't touch the "static compositions" part at all, except for > > passing an extra NULL pointer to an API I'd extended. (At least, > > that's what I intended, for all the changes to be in the IT_CHARACTER > > part). > > I mean this part: > > @@ -30433,8 +30483,9 @@ gui_produce_glyphs (struct it *it) > else > { > get_char_face_and_encoding (it->f, ch, face_id, > - &char2b, false); > - pcm = get_per_char_metric (font, &char2b); > + &char2b, false, > + make_context (it)); > + pcm = get_per_char_metric (font, &char2b, make_context (it)); > } > > This calls make_context and passes it to these functions. This code > handles static compositions only. Oops, sorry. 
You're right, that change was harmless but unintended; the relevant change is @@ -29989,9 +30033,11 @@ gui_produce_glyphs (struct it *it) it->descent = FONT_DESCENT (font) - boff; } - if (get_char_glyph_code (it->char_to_display, font, &char2b)) + context = make_context (it); + if (get_char_glyph_code (it->char_to_display, font, &char2b, + context)) { - pcm = get_per_char_metric (font, &char2b); + pcm = get_per_char_metric (font, &char2b, context); if (pcm->width == 0 && pcm->rbearing == 0 && pcm->lbearing == 0) pcm = NULL; > > > The "modern" way of composing text in Emacs uses automatic > > > compositions, which are controlled by data in > > > composition-function-table. This is where we call the shaping > > > engine to produce the glyphs according to rules stored in the > > > font. I don't see in your patch any changes that affect ligatures > > > created by automatic compositions; did I miss something? > > > > I don't think so; I went for a third route, that of leaving all > > compositions handling to the shaper and doing none of it in Emacs > > itself. > > But automatic compositions do work by calling the shaper. Yes, that observation is correct. What I'm doing is still very different from the (semi-)automatic compositions composite.c does. > > Perhaps I can digress a little and describe what I think the > > interaction with the shaper should be like: > > > > Emacs: I'd like to display codepoint 'f' > > Harfbuzz: you'll have to tell me the codepoint before that > > Emacs: 'f' > > Harfbuzz: and the one after those two > > Emacs: 'i' > > Harfbuzz: and the one before all of those > > Emacs: That's too expensive for me to compute / it's the beginning of > > paragraph / a bidi boundary / an object without an assigned codepoint > > / ... 
> > Harfbuzz: okay, display it as the middle slice of the "ffi" glyph > > > > I.e., I'd like Harfbuzz to be asynchronous, and request more > > information, parsimoniously, about the context of the codepoint we're > > describing, rather than working in one go from "complete" information > > to an indefinitely-long line of glyphs. And deal well with us deciding > > it's too expensive to perform that much look-back/look-ahead. (Because > > in real life, ligatures depend on knowing some amount of the context, > > but not all of it, or people could never start writing.) > > That would prevent Emacs from controlling what is and what isn't > composed, leaving the shaper in charge. Well, yes and no: the shaper is in charge, and I see absolutely nothing wrong with that. You can tell the shaper not to perform ligatures (or perform only some of them), or kerning, if you want to. > We currently allow Lisp to > control that via composition-function-table, which provides a regexp > that text around a character must match in order for the matching > substring to be passed to the shaper. And you're suggesting that regexp be set to, say, ".+"? Because that's the only way I've found of getting it to do kerning. > We never call the shaper unless > composition-function-table tells us to do so. ...whereas I want to call it every time, which is why having composition-function-table in the loop seemed wasteful. > I'm not sure I understand what problems do you see with this design. I meant the redisplay engine in general, not the way automatic compositions work. (That's not to say I'm happy with automatic compositions, but that's a different subject). ^ permalink raw reply [flat|nested] 145+ messages in thread
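The question-and-answer exchange imagined in this message (the shaper asking for one neighbouring codepoint at a time, and the caller being free to answer "not available") can be sketched as below. This is a toy under stated assumptions: it knows a single ligature, "ffi", and every name in it (`lazy_context`, `peek`, `ffi_slice`, `slice_for`, `NO_CODEPOINT`) is hypothetical. HarfBuzz does not offer an interface like this; that absence is the point of the proposal.

```c
#include <assert.h>
#include <string.h>

#define NO_CODEPOINT (-1)   /* "too expensive" / start of paragraph / ... */

/* Callback through which the toy shaper asks the caller for the
   codepoint at OFFSET relative to the one being displayed.  */
typedef int (*context_fn)(void *closure, int offset);

struct lazy_context {
    context_fn ask;
    void *closure;
    int cached[5];          /* answers for offsets -2 .. +2 */
    int known[5];
    int queries;            /* how many questions were actually asked */
};

/* Ask for a neighbour only once, and only when it is needed.  */
static int peek(struct lazy_context *lc, int offset)
{
    assert(offset >= -2 && offset <= 2);
    int slot = offset + 2;
    if (!lc->known[slot]) {
        lc->cached[slot] = lc->ask(lc->closure, offset);
        lc->known[slot] = 1;
        lc->queries++;
    }
    return lc->cached[slot];
}

/* Decide how to draw the codepoint at offset 0: return 0, 1 or 2 for
   the left, middle or right slice of the "ffi" ligature, or -1 for a
   plain glyph.  Stops asking as soon as the answer is determined.  */
int ffi_slice(struct lazy_context *lc)
{
    static const char lig[3] = { 'f', 'f', 'i' };

    for (int k = 0; k < 3; k++) {
        if (peek(lc, 0) != lig[k])
            continue;               /* we cannot be slice K */
        int ok = 1;
        for (int j = 0; j < 3 && ok; j++)
            if (j != k && peek(lc, j - k) != lig[j])
                ok = 0;
        if (ok)
            return k;
    }
    return -1;
}

/* Caller side: answer questions from a plain C string.  */
struct text_view { const char *text; int len; int pos; };

static int ask_text(void *closure, int offset)
{
    const struct text_view *tv = closure;
    int idx = tv->pos + offset;
    if (idx < 0 || idx >= tv->len)
        return NO_CODEPOINT;
    return (unsigned char) tv->text[idx];
}

int slice_for(const char *text, int pos, int *queries)
{
    struct text_view tv = { text, (int) strlen(text), pos };
    struct lazy_context lc = { ask_text, &tv, {0}, {0}, 0 };
    int slice = ffi_slice(&lc);
    if (queries)
        *queries = lc.queries;
    return slice;
}
```

For the second "f" of "office" the toy asks three questions (the codepoint itself, the one after, the one before) and answers "middle slice"; for the "o" it asks one question and stops. A caller that finds look-back too expensive would return `NO_CODEPOINT` from its callback, and the shaper would fall back to the plain glyph.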
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-21 21:06 ` Pip Cet @ 2020-05-22 6:06 ` Eli Zaretskii 2020-05-22 9:34 ` Pip Cet 0 siblings, 1 reply; 145+ messages in thread From: Eli Zaretskii @ 2020-05-22 6:06 UTC (permalink / raw) To: Pip Cet; +Cc: cpitclaudel, alan, emacs-devel > From: Pip Cet <pipcet@gmail.com> > Date: Thu, 21 May 2020 21:06:27 +0000 > Cc: cpitclaudel@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > > > But automatic compositions do work by calling the shaper. > > Yes, that observation is correct. What I'm doing is still very > different from the (semi-)automatic compositions composite.c does. For ligatures, I don't think I understand why the automatic compositions are not the way to go. > > That would prevent Emacs from controlling what is and what isn't > > composed, leaving the shaper in charge. > > Well, yes and no: the shaper is in charge, and I see absolutely > nothing wrong with that. You can tell the shaper not to perform > ligatures (or perform only some of them), or kerning, if you want to. Tell it how? by introducing new Lisp options and data structures? What would those new data structures be, and how will they be different from composition-function-table? > > We currently allow Lisp to > > control that via composition-function-table, which provides a regexp > > that text around a character must match in order for the matching > > substring to be passed to the shaper. > > And you're suggesting that regexp be set to, say, ".+"? Because that's > the only way I've found of getting it to do kerning. I'm not talking about the kerning. This discussion is about ligatures, AFAIU. For ligatures, the regexp should catch the sequences of characters that should be ligated. ".+" is definitely not right for ligatures, since it will significantly slow down redisplay for no good reason. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-22 6:06 ` Eli Zaretskii @ 2020-05-22 9:34 ` Pip Cet 2020-05-22 11:33 ` Eli Zaretskii 0 siblings, 1 reply; 145+ messages in thread From: Pip Cet @ 2020-05-22 9:34 UTC (permalink / raw) To: Eli Zaretskii; +Cc: cpitclaudel, alan, emacs-devel On Fri, May 22, 2020 at 6:06 AM Eli Zaretskii <eliz@gnu.org> wrote: > > From: Pip Cet <pipcet@gmail.com> > > Date: Thu, 21 May 2020 21:06:27 +0000 > > Cc: cpitclaudel@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > > > > > But automatic compositions do work by calling the shaper. > > > > Yes, that observation is correct. What I'm doing is still very > > different from the (semi-)automatic compositions composite.c does. > > For ligatures, I don't think I understand why the automatic > compositions are not the way to go. I don't think I've concluded they're not, though I'm strongly leaning that way. I didn't use them in the first patch, but that's probably easy enough to change. (Playing around with composite.c, I noticed it's very easy to get into an unquittable infinite loop by specifying invalid values in composition-function-table. That should probably be fixed). > > > That would prevent Emacs from controlling what is and what isn't > > > composed, leaving the shaper in charge. > > > > Well, yes and no: the shaper is in charge, and I see absolutely > > nothing wrong with that. You can tell the shaper not to perform > > ligatures (or perform only some of them), or kerning, if you want to. > > Tell it how? by introducing new Lisp options and data structures? Yes. A buffer option to disable ligatures/kerning would probably suffice, because it would essentially only be used to work around buggy fonts. > What would those new data structures be, and how will they be > different from composition-function-table? 
> > > We currently allow Lisp to > > > control that via composition-function-table, which provides a regexp > > > that text around a character must match in order for the matching > > > substring to be passed to the shaper. > > > > And you're suggesting that regexp be set to, say, ".+"? Because that's > > the only way I've found of getting it to do kerning. > > I'm not talking about the kerning. This discussion is about > ligatures, AFAIU. Oh. I understood it differently, because kerning is an important problem to solve in order to use variable-pitch fonts for English text. > For ligatures, the regexp should catch the > sequences of characters that should be ligated. I have to know that before using auto-composition-mode? How do I work it out? Do I have to disassemble the font and reimplement the relevant tables? > ".+" is definitely > not right for ligatures, since it will significantly slow down > redisplay So that's another argument against auto-composition-mode: it's too slow unless you know in advance which ligatures you want. Right? > for no good reason. I think "because I want the ligatures the font provides, and I don't care to work out in advance which ones those are" is a pretty good reason. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-22 9:34 ` Pip Cet @ 2020-05-22 11:33 ` Eli Zaretskii 0 siblings, 0 replies; 145+ messages in thread From: Eli Zaretskii @ 2020-05-22 11:33 UTC (permalink / raw) To: Pip Cet; +Cc: cpitclaudel, alan, emacs-devel > From: Pip Cet <pipcet@gmail.com> > Date: Fri, 22 May 2020 09:34:54 +0000 > Cc: cpitclaudel@gmail.com, alan@idiocy.org, emacs-devel@gnu.org > > > > Well, yes and no: the shaper is in charge, and I see absolutely > > > nothing wrong with that. You can tell the shaper not to perform > > > ligatures (or perform only some of them), or kerning, if you want to. > > > > Tell it how? by introducing new Lisp options and data structures? > > Yes. A buffer option to disable ligatures/kerning would probably > suffice, because it would essentially only be used to work around > buggy fonts. That option already exists: disable auto-composition-mode in a buffer where you don't want that. If you want to disable only some compositions, like only ligatures, or only some of the ligatures, you can do that in two ways: . modify composition-function-table (although this currently cannot be done only for a single buffer, I think: something to fix for better ligature support) . provide your own composition function to be used in composition-function-table, which could then be programmed to decide which ligatures to allow and which not to allow > > I'm not talking about the kerning. This discussion is about > > ligatures, AFAIU. > > Oh. I understood it differently, because kerning is an important > problem to solve in order to use variable-pitch fonts for English > text. Perhaps so, but let's discuss the kerning issue separately. It's a separate problem, AFAIU. > > For ligatures, the regexp should catch the > > sequences of characters that should be ligated. > > I have to know that before using auto-composition-mode? How do I work > it out? 
I tried to answer this in my previous message in this thread. > > ".+" is definitely > > not right for ligatures, since it will significantly slow down > > redisplay > > So that's another argument against auto-composition-mode: it's too > slow unless you know in advance which ligatures you want. Right? It's too slow if we have too many ligatures, or, more generally, too many characters to compose. Character composition works by calling Lisp (so as to allow the flexibility we need, see the other messages), and calling Lisp for too many characters during redisplay will make redisplay slower. This is one reason why we don't run every buffer substring through the shaper, although the HarfBuzz developers told me long ago they thought this was a flaw in our design. > > for no good reason. > > I think "because I want the ligatures the font provides, and I don't > care to work out in advance which ones those are" is a pretty good > reason. Let's see if I succeeded in convincing you that we have better solutions. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-19 13:56 ` Eli Zaretskii 2020-05-19 14:39 ` Clément Pit-Claudel @ 2020-05-19 20:26 ` Alan Third 1 sibling, 0 replies; 145+ messages in thread From: Alan Third @ 2020-05-19 20:26 UTC (permalink / raw) To: Eli Zaretskii; +Cc: cpitclaudel, emacs-devel On Tue, May 19, 2020 at 04:56:32PM +0300, Eli Zaretskii wrote: > > Date: Mon, 18 May 2020 23:59:11 +0200 (CEST) > > From: Alan Third <alan@idiocy.org> > > Cc: Eli Zaretskii <eliz@gnu.org>, emacs-devel@gnu.org > > > > In case anyone's interested, I've attached a screenshot of Apple's > > Pages.app displaying the word Zapfino with the cursor after the "a". > > I don't see anything on or after "a", I see a thin vertical line on > the "Z". is that what is actually displayed? If so, how do people > know the cursor is after "a"?? Yep, that's what's displayed. The vertical line is the cursor. The only reason I know it's after the a is because I hit the right arrow twice to get there from the left of the glyph. -- Alan Third ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-18 17:31 ` Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) Clément Pit-Claudel 2020-05-18 17:39 ` Eli Zaretskii @ 2020-05-19 10:09 ` Trevor Spiteri 2020-05-19 14:22 ` Eli Zaretskii 1 sibling, 1 reply; 145+ messages in thread From: Trevor Spiteri @ 2020-05-19 10:09 UTC (permalink / raw) To: emacs-devel [-- Attachment #1: Type: text/plain, Size: 1506 bytes --] On 18/05/2020 19:31, Clément Pit-Claudel wrote: > On 18/05/2020 12.08, Eli Zaretskii wrote: >> On second thought, I think I misunderstood you. If the font that is >> used shows "ffi" as a _single_ glyph ffi, and LibreOffice indeed >> highlights parts of this glyph, then I'd like to know how it does >> that, and how far does this capability extend. I mean, what does it >> do with ligatures like ae, displayed as æ -- does it highlight the >> common vertical stroke for both parts? And what about "st", displayed >> as st -- this has a curved "hand" connecting s and t -- to which of the >> 2 does it belong for the purposes of highlighting? There's also "hv" >> displayed as ƕ, let alone "fs" displayed as ẞ and "fz" displayed as >> ß. > I've attached a screenshot with a few examples, though I couldn't find a font that displays ae as æ. > > Firefox does the same as LibreOffice (try it here, for example: https://developer.mozilla.org/en-US/docs/Web/CSS/font-variant-ligatures). Since Firefox uses HarfBuzz, I think there's a good chance we can support that feature too :) For what it's worth, LibreOffice does it differently. I think what it does is place the cursor on the position it would be if any following text was missing. So moving after the second f in ffi would move the cursor to the same position as after ff if the i was missing. This is evident from fraction ligatures; in the screenshot I'm attaching, "63" is selected and the selection matches the 63 in the bottom line. 
[-- Attachment #2: fraction.png --] [-- Type: image/png, Size: 5505 bytes --] ^ permalink raw reply [flat|nested] 145+ messages in thread
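The cursor placement guessed at in this message (put the cursor where it would be if the following text were missing) amounts to shaping only the prefix before the cursor and using its width. A minimal sketch, assuming a toy shaper with three hard-coded ligatures and made-up advances; `toy_shaped_width` and `caret_x` are hypothetical names, and nothing here is taken from LibreOffice's code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy font: a few ligatures with invented advances, longest first so
   that the greedy match below prefers "ffi" over "ff".  */
struct toy_ligature { const char *chars; int advance; };

static const struct toy_ligature toy_ligatures[] = {
    { "ffi", 14 }, { "ff", 11 }, { "fi", 9 },
};

static int toy_char_advance(char c)
{
    return c == 'i' ? 4 : c == 'f' ? 6 : 8;
}

/* Width of TEXT[0..LEN) after toy shaping (greedy ligature matching).  */
int toy_shaped_width(const char *text, size_t len)
{
    int width = 0;
    size_t i = 0;

    while (i < len) {
        size_t matched = 0;
        for (size_t l = 0;
             l < sizeof toy_ligatures / sizeof toy_ligatures[0]; l++) {
            size_t n = strlen(toy_ligatures[l].chars);
            if (n <= len - i
                && memcmp(text + i, toy_ligatures[l].chars, n) == 0) {
                width += toy_ligatures[l].advance;
                matched = n;
                break;
            }
        }
        if (!matched) {
            width += toy_char_advance(text[i]);
            matched = 1;
        }
        i += matched;
    }
    return width;
}

/* X position of a cursor placed before character CARET: shape only the
   text that precedes the cursor, as if the rest were missing.  */
int caret_x(const char *text, size_t len, size_t caret)
{
    assert(caret <= len);
    return toy_shaped_width(text, caret);
}
```

In this toy font the cursor after the second f of "ffi" lands at x = 11 (the width of "ff" alone) inside a ligature that is 14 wide. The result is only sensible when the prefix's standalone shape resembles the corresponding part of the full ligature, which is exactly the condition the scheme depends on.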
* Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) 2020-05-19 10:09 ` Trevor Spiteri @ 2020-05-19 14:22 ` Eli Zaretskii 0 siblings, 0 replies; 145+ messages in thread From: Eli Zaretskii @ 2020-05-19 14:22 UTC (permalink / raw) To: Trevor Spiteri; +Cc: emacs-devel > From: Trevor Spiteri <tspiteri@ieee.org> > Date: Tue, 19 May 2020 12:09:32 +0200 > > For what it's worth, LibreOffice does it differently. I think what it > does is place the cursor on the position it would be if any following > text was missing. So moving after the second f in ffi would move the > cursor to the same position as after ff if the i was missing. This is only possible if the metrics of a sole f and f inside the ligature are identical or sufficiently close. That is not generally true in ligatures, not even in Latin ligatures. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-18 16:08 ` Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) Eli Zaretskii ` (2 preceding siblings ...) 2020-05-18 17:31 ` Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) Clément Pit-Claudel @ 2020-05-19 5:43 ` ASSI 2020-05-19 7:22 ` Ligatures tomas 2020-05-19 14:18 ` Ligatures Eli Zaretskii 3 siblings, 2 replies; 145+ messages in thread From: ASSI @ 2020-05-19 5:43 UTC (permalink / raw) To: Eli Zaretskii; +Cc: pipcet, emacs-devel Eli Zaretskii writes: > On second thought, I think I misunderstood you. If the font that is > used shows "ffi" as a _single_ glyph ffi, and LibreOffice indeed > highlights parts of this glyph, then I'd like to know how it does > that, and how far does this capability extend. I mean, what does it > do with ligatures like ae, displayed as æ -- does it highlight the > common vertical stroke for both parts? The only program I ever used that I remember doing this (a WYSIWYG TeX editor for DOS, natch) temporarily broke the ligature while you were moving the cursor inside. It looked a bit strange and was slightly distracting if you were just moving the cursor without trying to edit it, but otherwise did the job well. I expect that fonts that make extensive use of ligatures have information on where the ligatures can be broken and exactly how to display the parts in that case, although I wouldn't be surprised if that information is not very reliable even when just considering latin family scripts. > And what about "st", displayed as st -- this has a curved "hand" > connecting s and t -- to which of the 2 does it belong for the > purposes of highlighting? There's also "hv" displayed as ƕ, let alone > "fs" displayed as ẞ and "fz" displayed as ß. 
The origin of this ligature has no general consensus AFAIK, but if you read older (facsimile) printed literature from around 1800 it becomes pretty obvious that the typeface evolved from a combination of long s (mainly used inside a word) and round s (used at the end). The origin of "sz" in that place is even more complicated to figure out, but it seems (to me anyway) that this was driven by a desire to preserve the distinction to double s / "ss" when using typefaces that didn't have the proper glyphs for the various types of "s" previously available in Fraktur. Neither "fs" nor "fz" should ligature into "ß" (which is a proper glyph these days and no longer a ligature, although you are still allowed to break it into either "ss" or "sz" when using typefaces that don't support it, like most versalia). Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ Samples for the Waldorf Blofeld: http://Synth.Stromeko.net/Downloads.html#BlofeldSamplesExtra ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-19 5:43 ` Ligatures ASSI @ 2020-05-19 7:22 ` tomas 2020-05-19 7:55 ` Ligatures Joost Kremers 2020-05-19 14:18 ` Ligatures Eli Zaretskii 1 sibling, 1 reply; 145+ messages in thread From: tomas @ 2020-05-19 7:22 UTC (permalink / raw) To: emacs-devel [-- Attachment #1: Type: text/plain, Size: 1013 bytes --] On Tue, May 19, 2020 at 07:43:00AM +0200, ASSI wrote: [...] > [...] Neither "fs" nor "fz" should ligature into "ß" (which is a > proper glyph these days and no longer a ligature, although you are still > allowed to break it into either "ss" or "sz" when using typefaces that > don't support it, like most versalia). Definitely. These "long" and "short" variants of s were in use in Germany early in the twentieth century, in Fraktur and also in handwriting [1]. These two forms of "s" (one for terminal position) still exist in Greek. The ß "ligature" (which isn't perceived as such nowadays) evolved from "ss", the first s being a non-terminal (yeah, looks a bit like an "f" to the untrained eye). In the German speaking part of Switzerland, "ß" is always replaced by "ss". There's no capital version of "ß", you use "SS" (thus breaking bijectivity of upper- and lowercase). Writing is human. Human is messy :-/ Cheers [1] https://de.wikipedia.org/wiki/S%C3%BCtterlinschrift -- tomás [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-19 7:22 ` Ligatures tomas @ 2020-05-19 7:55 ` Joost Kremers 2020-05-19 8:07 ` Ligatures tomas 0 siblings, 1 reply; 145+ messages in thread From: Joost Kremers @ 2020-05-19 7:55 UTC (permalink / raw) To: tomas; +Cc: emacs-devel On Tue, May 19 2020, tomas@tuxteam.de wrote: > There's no capital version of "ß", you use "SS" (thus breaking > bijectivity of upper- and lowercase). Actually, uppercase ẞ was accepted into the official German spelling in 2017: https://en.wikipedia.org/wiki/Capital_%E1%BA%9E (cf. last line of Section "History"). -- Joost Kremers Life has its moments ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-19 7:55 ` Ligatures Joost Kremers @ 2020-05-19 8:07 ` tomas 2020-05-19 10:17 ` Ligatures Yuri Khan 2020-05-19 10:43 ` Ligatures Werner LEMBERG 0 siblings, 2 replies; 145+ messages in thread From: tomas @ 2020-05-19 8:07 UTC (permalink / raw) To: Joost Kremers; +Cc: emacs-devel [-- Attachment #1: Type: text/plain, Size: 1054 bytes --] On Tue, May 19, 2020 at 09:55:25AM +0200, Joost Kremers wrote: > > On Tue, May 19 2020, tomas@tuxteam.de wrote: > >There's no capital version of "ß", you use "SS" (thus breaking > >bijectivity of upper- and lowercase). > > Actually, uppercase ẞ was accepted into the official German spelling > in 2017: > > https://en.wikipedia.org/wiki/Capital_%E1%BA%9E (cf. last line of > Section "History"). Yes, officially. Nearly nobody uses it. If I had to bet, I'd expect 'ß' to disappear and be replaced by 'ss', as the Swiss do, before uppercase ß has a chance :-) But we digress: I was just trying to highlight how much cultural bias there is in one's view of seemingly technical things. When talking about ligatures, one should try to first understand what crazy stuff other languages have to take care of. I wish I could say a thing or two about Devanagari or Hangul [1], but my knowledge is just too limited. Cheers [1] https://en.wikipedia.org/wiki/Hangul for another example where you stack stuff in two dimensions -- t [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-19 8:07 ` Ligatures tomas @ 2020-05-19 10:17 ` Yuri Khan 2020-05-19 14:26 ` Ligatures Eli Zaretskii 2020-05-19 10:43 ` Ligatures Werner LEMBERG 1 sibling, 1 reply; 145+ messages in thread From: Yuri Khan @ 2020-05-19 10:17 UTC (permalink / raw) To: tomas; +Cc: Joost Kremers, Emacs developers On Tue, 19 May 2020 at 15:11, <tomas@tuxteam.de> wrote: > [1] https://en.wikipedia.org/wiki/Hangul > for another example where you stack stuff in two dimensions An example of character combining other than side-by-side stacking is much closer than that: Combining diacritics. Sure, you can delete an acute accent from á by pressing Backspace, but you cannot put point between the ‘a’ and the accent if you want to put a different diacritic between them. (And putting multiple diacritics over a single base character in various orders is a thing, it is the subject of the Unicode Canonical Order subsection in Unicode standard.) ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-19 10:17 ` Ligatures Yuri Khan @ 2020-05-19 14:26 ` Eli Zaretskii 2020-05-19 19:00 ` Ligatures Yuri Khan 0 siblings, 1 reply; 145+ messages in thread From: Eli Zaretskii @ 2020-05-19 14:26 UTC (permalink / raw) To: Yuri Khan; +Cc: joostkremers, tomas, emacs-devel > From: Yuri Khan <yuri.v.khan@gmail.com> > Date: Tue, 19 May 2020 17:17:25 +0700 > Cc: Joost Kremers <joostkremers@fastmail.fm>, > Emacs developers <emacs-devel@gnu.org> > > An example of character combining other than side-by-side stacking is > much closer than that: Combining diacritics. Sure, you can delete an > acute accent from á by pressing Backspace, but you cannot put point > between the ‘a’ and the accent if you want to put a different > diacritic between them. Well, you can (this is Emacs, right?): just disable automatic composition with "M-x auto-composition-mode", and you can do any editing you want. Then re-enable the mode again. > (And putting multiple diacritics over a single base character in > various orders is a thing, it is the subject of the Unicode > Canonical Order subsection in Unicode standard.) Canonical order of diacritics is indeed important for jobs such as comparison, searching, etc. But we are talking about display, and for display there's a requirement that the order should not matter as long as the base character comes first. AFAIR, HarfBuzz supports that requirement, but not every other shaping engine does. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-19 14:26 ` Ligatures Eli Zaretskii @ 2020-05-19 19:00 ` Yuri Khan 0 siblings, 0 replies; 145+ messages in thread From: Yuri Khan @ 2020-05-19 19:00 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Joost Kremers, tomas, Emacs developers > > (And putting multiple diacritics over a single base character in > > various orders is a thing, it is the subject of the Unicode > > Canonical Order subsection in Unicode standard.) > > Canonical order of diacritics is indeed important for jobs such as > comparison, searching, etc. But we are talking about display, and for > display there's a requirement that the order should not matter as long > as the base character comes first. AFAIR, HarfBuzz supports that > requirement, but not every other shaping engine does. I meant, the Canonical Order spec could be a lot simpler (“just sort all diacritics according to their codepoint value” rather than “take great care to only swap two adjacent diacritics if their combining classes differ and they are ordered wrongly”) if diacritics order did not matter. But it does; <a> <acute> <diaeresis> is different from <a> <diaeresis> <acute>, so the use case of putting point between the base character and its following diacritic in order to insert a different one is somewhat important. Indeed, toggling auto-composition-mode solves that. ^ permalink raw reply [flat|nested] 145+ messages in thread
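The point about combining classes can be illustrated with a short sketch, assuming Python's standard `unicodedata` module as the source of Unicode data; the constant names and the `nfd` helper are chosen here for illustration only. Acute and diaeresis share a combining class, so normalization must keep their relative order, whereas acute and dot-below have different classes and are reordered into one canonical sequence.

```python
import unicodedata

ACUTE = "\u0301"      # COMBINING ACUTE ACCENT, drawn above the base
DIAERESIS = "\u0308"  # COMBINING DIAERESIS, drawn above the base
DOT_BELOW = "\u0323"  # COMBINING DOT BELOW, drawn below the base


def nfd(text):
    """Canonical decomposition, which includes canonical reordering."""
    return unicodedata.normalize("NFD", text)


# Canonical combining classes decide whether two marks may be swapped.
for name, mark in (("acute", ACUTE), ("diaeresis", DIAERESIS),
                   ("dot below", DOT_BELOW)):
    print(name, unicodedata.combining(mark))

# Same class: the two orders are distinct and stay distinct.
print(nfd("a" + ACUTE + DIAERESIS) == nfd("a" + DIAERESIS + ACUTE))

# Different classes: both orders normalize to the same sequence.
print(nfd("a" + ACUTE + DOT_BELOW) == nfd("a" + DOT_BELOW + ACUTE))
```

This is why sorting marks by code point would be wrong: it would merge sequences such as <a> <acute> <diaeresis> and <a> <diaeresis> <acute>, which are not canonically equivalent.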
* Re: Ligatures 2020-05-19 8:07 ` Ligatures tomas 2020-05-19 10:17 ` Ligatures Yuri Khan @ 2020-05-19 10:43 ` Werner LEMBERG 2020-05-19 10:48 ` Ligatures tomas 1 sibling, 1 reply; 145+ messages in thread From: Werner LEMBERG @ 2020-05-19 10:43 UTC (permalink / raw) To: tomas; +Cc: joostkremers, emacs-devel >> >There's no capital version of "ß", you use "SS" (thus breaking >> >bijectivity of upper- and lowercase). >> >> Actually, uppercase ẞ was accepted into the official German >> spelling in 2017: >> >> https://en.wikipedia.org/wiki/Capital_%E1%BA%9E (cf. last line of >> Section "History"). > > Yes, Officially. Nearly nobody uses it. If I had to bet, I'd expect > 'ß' to disappear and be replaced by 'ss', as the Swiss do before > uppercase ß has a chance :-) Well, if your family name is 'Dreßen', you don't want to see your name written as 'DRESSEN' in your passport (which usually requires uppercase for family names): All German speakers would pronounce the first 'e' as a short vowel instead of the correct long one. Exactly for this situation – and for hardly anything else – you should write 'DREẞEN'. Werner ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-19 10:43 ` Ligatures Werner LEMBERG @ 2020-05-19 10:48 ` tomas 0 siblings, 0 replies; 145+ messages in thread From: tomas @ 2020-05-19 10:48 UTC (permalink / raw) To: Werner LEMBERG; +Cc: joostkremers, emacs-devel [-- Attachment #1: Type: text/plain, Size: 861 bytes --] On Tue, May 19, 2020 at 12:43:06PM +0200, Werner LEMBERG wrote: [...] > Well, if your family name is 'Dreßen', you don't want to see your name > written as 'DRESSEN' in your passport (which usually requires > uppercase for family names): All German speakers would pronounce the > first 'e' as a short vowel instead of the correct long one. Exactly > for this situation – and for hardly anything else – you should write > 'DREẞEN'. Yes, I know -- that's why such things change slowly. But the Swiss prove that it works. We're used to having things which are written the same and pronounced differently, anyway. One more wouldn't change things. Note that I'm not advocating [1] for dropping the 'ß'. I'm just betting that it might happen rather sooner than later. Cheers [1] I've enough to do advocating free software ;-D -- t [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-19 5:43 ` Ligatures ASSI 2020-05-19 7:22 ` Ligatures tomas @ 2020-05-19 14:18 ` Eli Zaretskii 2020-05-19 14:52 ` Ligatures Eli Zaretskii 1 sibling, 1 reply; 145+ messages in thread From: Eli Zaretskii @ 2020-05-19 14:18 UTC (permalink / raw) To: ASSI; +Cc: pipcet, emacs-devel > From: ASSI <Stromeko@nexgo.de> > Cc: pipcet@gmail.com, emacs-devel@gnu.org > Date: Tue, 19 May 2020 07:43:00 +0200 > > The only program I ever used that I remember doing this (a WYSIWYG TeX > editor for DOS, natch) temporarily broke the ligature while you were > moving the cursor inside. It looked a bit strange and was slightly > distracting if you were just moving the cursor without trying to edit > it, but otherwise did the job well. That's what I had in mind (although I never used such an editor). > The origin of this ligature has no general consensus AFAIK, but if you > read older (facsimile) printed literature from around 1800 it becomes > pretty obvious that the typeface evolved from a combination of long s > (mainly used inside a word) and round s (used at the end). The origin > of "sz" in that place is even more complicated to figure out, but it > seems (to me anyway) that this was driven by a desire to preserve the > distinction to double s / "ss" when using typefaces that didn't have the > proper glyphs for the various types of "s" previously available in > Fraktur. Neither "fs" nor "fz" should ligature into "ß" (which is a > proper glyph these days and no longer a ligature, although you are still > allowed to break it into either "ss" or "sz" when using typefaces that > don't support it, like most versalia). I think we should support these unusual ligatures for those who'd like to see them, probably as an opt-in feature. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-19 14:18 ` Ligatures Eli Zaretskii @ 2020-05-19 14:52 ` Eli Zaretskii 2020-05-19 15:11 ` Ligatures Pip Cet 0 siblings, 1 reply; 145+ messages in thread From: Eli Zaretskii @ 2020-05-19 14:52 UTC (permalink / raw) To: Stromeko; +Cc: pipcet, emacs-devel > Date: Tue, 19 May 2020 17:18:41 +0300 > From: Eli Zaretskii <eliz@gnu.org> > Cc: pipcet@gmail.com, emacs-devel@gnu.org > > > From: ASSI <Stromeko@nexgo.de> > > Cc: pipcet@gmail.com, emacs-devel@gnu.org > > Date: Tue, 19 May 2020 07:43:00 +0200 > > > > The only program I ever used that I remember doing this (a WYSIWYG TeX > > editor for DOS, natch) temporarily broke the ligature while you were > > moving the cursor inside. It looked a bit strange and was slightly > > distracting if you were just moving the cursor without trying to edit > > it, but otherwise did the job well. > > That's what I had in mind (although I never used such an editor). Btw, there's one subtle issue that will need to be resolved if we are to have this feature of "sub-glyph" cursor movement inside composed characters. The way we currently display the default block cursor is by simply redrawing the glyph at point in reverse video. So we don't have a way of displaying a cursor that "covers" only part of a glyph. To make this happen, we'd probably need to draw the cursor as part of drawing the glyph foreground and/or background, which is against the current flow of the display code: we generally first completely draw the background and foreground of the entire text that needs to be redrawn, and only then draw the cursor where it should be placed. Something to figure out by that "Someone" who'd volunteer for the job. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-19 14:52 ` Ligatures Eli Zaretskii @ 2020-05-19 15:11 ` Pip Cet 2020-05-19 15:36 ` Ligatures Eli Zaretskii 0 siblings, 1 reply; 145+ messages in thread From: Pip Cet @ 2020-05-19 15:11 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Stromeko, emacs-devel On Tue, May 19, 2020 at 2:52 PM Eli Zaretskii <eliz@gnu.org> wrote: > Btw, there's one subtle issue that will need to be resolved if we are > to have this feature of "sub-glyph" cursor movement inside composed > characters. The way we currently display the default block cursor is > by simply redrawing the glyph at point in reverse video. So we don't > have a way of displaying a cursor that "covers" only part of a glyph. I thought that was what glyph_row->clip was for. > To make this happen, we'd probably need to draw the cursor as part of > drawing the glyph foreground and/or background, which is against the I believe that's a change we should make anyway: late cursor drawing makes sense on TTYs with physical cursors, but on GUI backends, we should simply use a special face for drawing the struct glyph a cursor is on, IMHO. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-19 15:11 ` Ligatures Pip Cet @ 2020-05-19 15:36 ` Eli Zaretskii 2020-05-19 16:16 ` Ligatures Pip Cet 0 siblings, 1 reply; 145+ messages in thread From: Eli Zaretskii @ 2020-05-19 15:36 UTC (permalink / raw) To: Pip Cet; +Cc: Stromeko, emacs-devel > From: Pip Cet <pipcet@gmail.com> > Date: Tue, 19 May 2020 15:11:27 +0000 > Cc: Stromeko@nexgo.de, emacs-devel@gnu.org > > On Tue, May 19, 2020 at 2:52 PM Eli Zaretskii <eliz@gnu.org> wrote: > > Btw, there's one subtle issue that will need to be resolved if we are > > to have this feature of "sub-glyph" cursor movement inside composed > > characters. The way we currently display the default block cursor is > > by simply redrawing the glyph at point in reverse video. So we don't > > have a way of displaying a cursor that "covers" only part of a glyph. > > I thought that was what glyph_row->clip was for. We could use that, but that's not the main problem. After all, clipping while drawing is simple and doesn't need any special help. The problem is that we need to change how the cursor is drawn, from the control flow POV. We'd need to audit the code and see that the information required for drawing the cursor is available when we are drawing the text. And then there's the popular use case where nothing changes except the cursor position, in which case no text is redrawn at all. > > To make this happen, we'd probably need to draw the cursor as part of > > drawing the glyph foreground and/or background, which is against the > > I believe that's a change we should make anyway: late cursor drawing > makes sense on TTYs with physical cursors, but on GUI backends, we > should simply use a special face for drawing the struct glyph a cursor > is on, IMHO. It cannot be a single face, because the "thing under cursor" can be anything, and can have different colors. We will need to merge faces, which is slower than the current simple but effective method, which completely sidesteps the issue. 
In any case, using a face doesn't solve the main problem, as we'd still need to draw the glyph with partial colors. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-19 15:36 ` Ligatures Eli Zaretskii @ 2020-05-19 16:16 ` Pip Cet 2020-05-19 16:41 ` Ligatures Eli Zaretskii 2020-05-19 17:00 ` Ligatures Eli Zaretskii 0 siblings, 2 replies; 145+ messages in thread From: Pip Cet @ 2020-05-19 16:16 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Stromeko, emacs-devel On Tue, May 19, 2020 at 3:36 PM Eli Zaretskii <eliz@gnu.org> wrote: > > From: Pip Cet <pipcet@gmail.com> > > Date: Tue, 19 May 2020 15:11:27 +0000 > > Cc: Stromeko@nexgo.de, emacs-devel@gnu.org > > > > On Tue, May 19, 2020 at 2:52 PM Eli Zaretskii <eliz@gnu.org> wrote: > > > Btw, there's one subtle issue that will need to be resolved if we are > > > to have this feature of "sub-glyph" cursor movement inside composed > > > characters. The way we currently display the default block cursor is > > > by simply redrawing the glyph at point in reverse video. So we don't > > > have a way of displaying a cursor that "covers" only part of a glyph. > > > > I thought that was what glyph_row->clip was for. > > We could use that, but that's not the main problem. Sorry, I genuinely don't understand what the problem is. draw_glyphs is called by draw_phys_cursor_glyph, so all we need is a line or two of extra code in draw_phys_cursor_glyphs to set row->clip to the rectangle surrounding the subglyph the cursor is on. No further change of the display engine is required for that, is it? > The problem is that we need to change how the cursor is drawn, from > the control flow POV. That's a separate thing that, yes, we need to do. Because optimizing for TTYs is no longer appropriate. But I don't see why we need to perform this large change before performing the little one that makes things work for subglyphs. > We'd need to audit the code and see that the > information required for drawing the cursor is available when we are > drawing the text. 
And then there's the popular use case where nothing > changes except the cursor position, in which case no text is redrawn > at all. Except for the glyphs the cursor is on, right? Those are redrawn by draw_phys_cursor_glyph, or am I missing something here? > > > To make this happen, we'd probably need to draw the cursor as part of > > > drawing the glyph foreground and/or background, which is against the > > > > I believe that's a change we should make anyway: late cursor drawing > > makes sense on TTYs with physical cursors, but on GUI backends, we > > should simply use a special face for drawing the struct glyph a cursor > > is on, IMHO. > > It cannot be a single face, because the "thing under cursor" can be > anything, and can have different colors. Agreed. > We will need to merge faces, > which is slower than the current simple but effective method, which > completely sidesteps the issue. I believe performance concerns are an entirely different subject (put briefly, my opinion is that we've painted ourselves into a corner by micro-optimizing fast loops over an essentially inefficient basic design). > in any case, using a face doesn't solve the main problem, as we'd > still need to draw the glyph with partial colors. Which we can do by setting glyph_row->clip? I don't see how there's any problem here at all. Again, I see three totally separate problems here: 1. draw a box cursor over a partial glyph 2. improve the display engine to handle cursor(s) like other highlighting on graphical terminals 3. identify and counteract actual performance problems in the redisplay engine I still don't see how (1) depends on (2), and I think I disagree with you on the subject of (3), because I think we need to fix the design first, moving a lot of C code out to Lisp, then see where things actually chafe and maybe move some special code back to C. ^ permalink raw reply [flat|nested] 145+ messages in thread
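Problem (1) above, a box cursor over part of a glyph, comes down to computing a clip rectangle for one character of a ligature. The following is a rough sketch of that geometry only, not Emacs code: `Rect` and `subglyph_clip` are hypothetical names, and splitting the glyph's width into equal slices is an assumption made for simplicity (a font may supply better caret positions for its ligatures).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Pixel rectangle; a hypothetical stand-in for a clip rectangle."""
    x: int
    y: int
    width: int
    height: int


def subglyph_clip(glyph_box, n_chars, index):
    """Return the slice of `glyph_box` covering character `index`.

    Assumes the ligature's width is shared equally by its `n_chars`
    characters. Integer edges are computed from the left so that the
    slices tile the glyph box exactly, with no gaps or overlaps.
    """
    if n_chars <= 0 or not 0 <= index < n_chars:
        raise ValueError("index must be within the composed characters")
    left = glyph_box.x + glyph_box.width * index // n_chars
    right = glyph_box.x + glyph_box.width * (index + 1) // n_chars
    return Rect(left, glyph_box.y, right - left, glyph_box.height)


# A 20-pixel-wide "ffi" ligature at x=100, cursor on each character in turn.
ligature_box = Rect(x=100, y=0, width=20, height=16)
for i in range(3):
    print(subglyph_clip(ligature_box, 3, i))
```

With such a rectangle in hand, the cursor glyph would be redrawn in the cursor colors while clipped to the slice, leaving the rest of the ligature as it was.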
* Re: Ligatures 2020-05-19 16:16 ` Ligatures Pip Cet @ 2020-05-19 16:41 ` Eli Zaretskii 2020-05-19 17:00 ` Ligatures Eli Zaretskii 1 sibling, 0 replies; 145+ messages in thread From: Eli Zaretskii @ 2020-05-19 16:41 UTC (permalink / raw) To: Pip Cet; +Cc: Stromeko, emacs-devel > From: Pip Cet <pipcet@gmail.com> > Date: Tue, 19 May 2020 16:16:53 +0000 > Cc: Stromeko@nexgo.de, emacs-devel@gnu.org > > I think we need to fix the design first, moving a lot of C code out > to Lisp No, we don't need to fix the design of the display engine. We need to design a new and different display engine, based on ideas more flexible and powerful than the current rectangular array of glyphs. You (or someone else) is more than welcome to work on such a new design, present it here, discuss ideas, etc. If I can help, I will. I will reserve my judgment on the "move to Lisp" part until I see the overall design of this new engine, and at least some of the implementation ideas, including how not to lose existing display features. By contrast, "fixing the design" of the current display engine, let alone moving parts of it to Lisp, is IMNSHO a waste of effort. It simply cannot be fixed, it's already stretched beyond limit. We can (and do) make small adjustments, but that's all. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Ligatures 2020-05-19 16:16 ` Ligatures Pip Cet 2020-05-19 16:41 ` Ligatures Eli Zaretskii @ 2020-05-19 17:00 ` Eli Zaretskii 1 sibling, 0 replies; 145+ messages in thread From: Eli Zaretskii @ 2020-05-19 17:00 UTC (permalink / raw) To: Pip Cet; +Cc: Stromeko, emacs-devel > From: Pip Cet <pipcet@gmail.com> > Date: Tue, 19 May 2020 16:16:53 +0000 > Cc: Stromeko@nexgo.de, emacs-devel@gnu.org > > Sorry, I genuinely don't understand what the problem is. There's no need to argue. There's a TODO item regarding ligature support, and I just updated it with the ideas from this discussion. You, or anyone else, are welcome to work on some or all of that. I think good ligature support in Emacs is long overdue; that is one of the reasons we added HarfBuzz support and are steadily moving towards making it the default font backend. Any advances in the direction of letting Emacs use advanced features of modern fonts are welcome. > draw_glyphs is called by draw_phys_cursor_glyph, so all we need is a > line or two of extra code in draw_phys_cursor_glyphs to set > row->clip to the rectangle surrounding the subglyph the cursor is > on. No further change of the display engine is required for that, is > it? Feel free to ignore me. I may be completely wrong about this. Please disregard what I said and just code away what you think is needed to implement this. > > And then there's the popular use case where nothing > > changes except the cursor position, in which case no text is redrawn > > at all. > > Except for the glyphs the cursor is on, right? Those are redrawn by > draw_phys_cursor_glyph, or am I missing something here? Basically, yes, draw_phys_cursor_glyph. But there are other functions related to that, and which ones need to be changed for this "partial" cursor drawing to work, I really don't know/remember, sorry. You need to read the code. 
> > We will need to merge faces, > > which is slower than the current simple but effective method, which > > completely sidesteps the issue. > > I believe performance concerns are an entirely different subject (put > briefly, my opinion is that we've painted ourselves into a corner by > micro-optimizing fast loops over an essentially inefficient basic > design). The current design is that faces are realized lazily and cached for subsequent use, because realizing a face is expensive. It makes no sense to realize a face each time we blink the cursor. No matter what you think about the current design, code which does unnecessary calculations is bad code. Gerd Moellmann, who designed and implemented the current display engine, isn't stupid or incompetent, quite the contrary. ^ permalink raw reply [flat|nested] 145+ messages in thread
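The idea of realizing faces lazily and caching them can be sketched in a few lines. This is a generic illustration under stated assumptions, not the Emacs face cache: `FaceCache`, `cursor_attrs`, and the attribute dictionaries are all hypothetical, and "realizing" is reduced to calling a supplied function once per distinct attribute set.

```python
class FaceCache:
    """Hypothetical cache: realize each distinct attribute set only once."""

    def __init__(self, realize):
        self._realize = realize   # the expensive step, supplied by the caller
        self._cache = {}
        self.realizations = 0     # counts how often the expensive step ran

    def lookup(self, attrs):
        # Attribute dicts are unhashable, so use a frozen view as the key.
        key = frozenset(attrs.items())
        if key not in self._cache:
            self._cache[key] = self._realize(attrs)
            self.realizations += 1
        return self._cache[key]


def cursor_attrs(base):
    """Derive cursor attributes by swapping foreground and background."""
    return {**base,
            "foreground": base["background"],
            "background": base["foreground"]}


cache = FaceCache(realize=lambda attrs: dict(attrs))
text_face = {"foreground": "black", "background": "white", "weight": "bold"}

# Ten cursor blinks over the same text: the merged face is realized once.
for _ in range(10):
    cache.lookup(cursor_attrs(text_face))
print(cache.realizations)
```

In this toy model only the first blink pays for realization; whether merging faces for the cursor would stay cheap in the real display engine is exactly the question debated above.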
* Re: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY) 2020-05-17 15:55 ` Eli Zaretskii 2020-05-17 16:28 ` Pip Cet @ 2020-05-17 18:28 ` Julius Pfrommer 2020-05-17 18:45 ` Eli Zaretskii ` (2 more replies) 1 sibling, 3 replies; 145+ messages in thread From: Julius Pfrommer @ 2020-05-17 18:28 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel > First, we need to establish that this is a solution, and for what > problem(s). It is important to realize that the GUI backends we use > handle much more than just drawing text, they need to be able to > display GUI widgets, frame and window decorations (menu bar, tool bar, > scroll bars, the frame title, etc.), and much more. I am quite supportive of the native GUI toolkits. Cairo is a vector-drawing library and only responsible for the "glass" of each frame (called the "canvas" in other communities). All the event-handling logic, menu-drawing, etc. is untouched by it. > Next, please be aware that we already made the decision to use > HarfBuzz as our main text-shaping engine. X and w32 already use it; Very good to see Emacs settle on HarfBuzz! Text-shaping touches into the very core, as the glyph rendering impacts line-breaking, redisplay, and so on. > I don't think the answer will be full and definitive until "Someone" > walks through all the APIs we implement in x/w32/ns/fns.c and > x/w32/ns/term.c, and makes sure they all can be covered. Looking at xterm.c, it is littered with #ifdef USE_CAIRO. A first step could be to assume Cairo on X-based platforms and remove duplicate code. The second step could be to decouple the "glass" from the tookit "chrome" more thoroughly in xterm.c. That is easier to do when a Cairo-canvas can be assumed for drawing. Then, that entire "glass" could be reused by other platforms once they have a Cairo-canvas for drawing as well. (Modulo the XWidget support that depends on GTK.) 
Once a switchover is in reach, it can live separately to the existing platform-specific "glass" until all the kinks are worked out. Regards, Julius On Sun, 17 May 2020 18:55:23 +0300, Eli Zaretskii <eliz@gnu.org> wrote: > > Date: Sun, 17 May 2020 16:59:53 +0200 > > From: Julius Pfrommer <julius.pfrommer@web.de> > > Cc: emacs-devel@gnu.org > > > > Let me phrase the question differently: Would it be okay to have a > > hard dependency on the Cairo+FreeType+Harfbuzz (CFH) libraries, as > > they are available everywhere? > > First, we need to establish that this is a solution, and for what > problem(s). It is important to realize that the GUI backends we use > handle much more than just drawing text, they need to be able to > display GUI widgets, frame and window decorations (menu bar, tool bar, > scroll bars, the frame title, etc.), and much more. Is the > configuration you propose capable of doing all that? I don't think > the answer will be full and definitive until "Someone" walks through > all the APIs we implement in x/w32/ns/fns.c and x/w32/ns/term.c, and > makes sure they all can be covered. > > Next, please be aware that we already made the decision to use > HarfBuzz as our main text-shaping engine. X and w32 already use it; > for NS someone has to write the code (and they are not very likely to > do so because macOS users consider the native text shaping more > feature-rich). Dropping the other font backends is a matter of time, > but it could take a long time. > > In any case, the font backend is not the main issue here; in > particular, the likes of FreeType are hardly even seen except on very > low level of the code. It's the other aspects of GUI code that > bothers me much more. > > > Even on Linux, this would unlock quite a few simplifications. I > > count at least three font handling "backends" here. > > Down to 2 and one deprecated one on master. But again, font backends > are a relatively easy problem, and it is being dealt with. 
^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY) 2020-05-17 18:28 ` Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY) Julius Pfrommer @ 2020-05-17 18:45 ` Eli Zaretskii 2020-05-17 22:28 ` chad 2020-05-18 22:08 ` Alan Third 2 siblings, 0 replies; 145+ messages in thread From: Eli Zaretskii @ 2020-05-17 18:45 UTC (permalink / raw) To: Julius Pfrommer; +Cc: emacs-devel > Date: Sun, 17 May 2020 20:28:02 +0200 > From: Julius Pfrommer <julius.pfrommer@web.de> > Cc: emacs-devel@gnu.org > > Cairo is a vector-drawing library and only responsible for the "glass" > of each frame (called the "canvas" in other communities). All the > event-handling logic, menu-drawing, etc. is untouched by it. Which is what I said. So Cairo alone will be unable to provide all the GUI features we need, we will need something else. And that something is done different on different platforms. > Looking at xterm.c, it is littered with #ifdef USE_CAIRO. Yes, because Cairo and Xlib are two quite different ways of doing GUI display. > A first step could be to assume Cairo on X-based platforms and remove > duplicate code. We are going there, but it takes time. We've just made Cairo the default build on master; it couldn't be that previously because the Cairo code had several grave bugs which took us time to fix. > The second step could be to decouple the "glass" from > the tookit "chrome" more thoroughly in xterm.c. That is easier to do > when a Cairo-canvas can be assumed for drawing. > > Then, that entire "glass" could be reused by other platforms once they > have a Cairo-canvas for drawing as well. (Modulo the XWidget support > that depends on GTK.) > > Once a switchover is in reach, it can live separately to the existing > platform-specific "glass" until all the kinks are worked out. Sounds like a good plan for several years, maybe more, of extensive development on several platforms. Can I interest you in doing this? 
And meanwhile, we also need to come up with enough new features every 2 - 3 years to keep our users engaged and attract new ones. ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY) 2020-05-17 18:28 ` Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY) Julius Pfrommer 2020-05-17 18:45 ` Eli Zaretskii @ 2020-05-17 22:28 ` chad 2020-05-18 22:08 ` Alan Third 2 siblings, 0 replies; 145+ messages in thread From: chad @ 2020-05-17 22:28 UTC (permalink / raw) To: Julius Pfrommer; +Cc: Eli Zaretskii, EMACS development team [-- Attachment #1: Type: text/plain, Size: 727 bytes --] On Sun, May 17, 2020 at 11:30 AM Julius Pfrommer <julius.pfrommer@web.de> wrote: > A first step could be to assume Cairo on X-based platforms and remove > duplicate code. The second step could be to decouple the "glass" from > the tookit "chrome" more thoroughly in xterm.c. That is easier to do > when a Cairo-canvas can be assumed for drawing. > > Then, that entire "glass" could be reused by other platforms once they > have a Cairo-canvas for drawing as well. (Modulo the XWidget support > that depends on GTK.) > FWIW, there exists code to bring xwidgets and webkit into macOS without GTK: https://github.com/veshboo/emacs I haven't tried it since my macOS machine died ~1.5 years ago, but it worked once upon a time. [-- Attachment #2: Type: text/html, Size: 1210 bytes --] ^ permalink raw reply [flat|nested] 145+ messages in thread
* Re: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY) 2020-05-17 18:28 ` Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY) Julius Pfrommer 2020-05-17 18:45 ` Eli Zaretskii 2020-05-17 22:28 ` chad @ 2020-05-18 22:08 ` Alan Third 2 siblings, 0 replies; 145+ messages in thread From: Alan Third @ 2020-05-18 22:08 UTC (permalink / raw) To: Julius Pfrommer; +Cc: Eli Zaretskii, emacs-devel On Sun, May 17, 2020 at 08:28:02PM +0200, Julius Pfrommer wrote: > > I don't think the answer will be full and definitive until "Someone" > > walks through all the APIs we implement in x/w32/ns/fns.c and > > x/w32/ns/term.c, and makes sure they all can be covered. > > Looking at xterm.c, it is littered with #ifdef USE_CAIRO. > > A first step could be to assume Cairo on X-based platforms and remove > duplicate code. The second step could be to decouple the "glass" from > the tookit "chrome" more thoroughly in xterm.c. That is easier to do > when a Cairo-canvas can be assumed for drawing. > > Then, that entire "glass" could be reused by other platforms once they > have a Cairo-canvas for drawing as well. (Modulo the XWidget support > that depends on GTK.) > > Once a switchover is in reach, it can live separately to the existing > platform-specific "glass" until all the kinks are worked out. It may be worth your while looking into the PGTK port that some people are working on: https://github.com/masm11/emacs I believe it will be using pure Cairo rendering which may make this project a bit easier. -- Alan Third ^ permalink raw reply [flat|nested] 145+ messages in thread
Zaretskii 2020-05-19 15:11 ` Ligatures Pip Cet 2020-05-19 15:36 ` Ligatures Eli Zaretskii 2020-05-19 16:16 ` Ligatures Pip Cet 2020-05-19 16:41 ` Ligatures Eli Zaretskii 2020-05-19 17:00 ` Ligatures Eli Zaretskii 2020-05-17 18:28 ` Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY) Julius Pfrommer 2020-05-17 18:45 ` Eli Zaretskii 2020-05-17 22:28 ` chad 2020-05-18 22:08 ` Alan Third