From: Pip Cet <pipcet@gmail.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: cpitclaudel@gmail.com, alan@idiocy.org, emacs-devel@gnu.org
Subject: Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY))
Date: Sat, 23 May 2020 15:13:38 +0000
Message-ID: <CAOqdjBfUCvv2QbxtmqGkYMOh5Rep9WC4mvAWgdGRXm3a_ES9=Q@mail.gmail.com>
In-Reply-To: <83mu5yzquj.fsf@gnu.org>

On Sat, May 23, 2020 at 2:08 PM Eli Zaretskii <eliz@gnu.org> wrote:
> > From: Pip Cet <pipcet@gmail.com>
> > Date: Sat, 23 May 2020 12:36:56 +0000
> > Cc: cpitclaudel@gmail.com, alan@idiocy.org, emacs-devel@gnu.org
> >
> > > > You write: "(b) is not really feasible without redesigning the entire
> > > > Emacs display engine". I don't see how that's true at all. All we need
> > > > is some limited look-ahead.
> > >
> > > We already have look-ahead: that's what the regexp part of the
> > > composition rules is about.  That is not the crucial problem.
> >
> > But it's the only problem I see!
>
> Then maybe I don't understand what you mean by look-ahead.  Is that
> the decision how to choose those 32 characters of "context"?

Yes.

> Then why
> not use the current regexp-based approach, which is already much
> smarter than just blindly taking a fixed amount of surrounding text?

Because I do not know the regexp to use?

> > When you see an IT_CHARACTER, you get some context, hand it to
> > HarfBuzz, slice up the relevant glyphs, and display them.
>
> The problem is, of course, in the "some context" part.  Your patch
> used an arbitrary 32-character chunk of text around the character to
> shape, which is of course not what the shaping engines want: they want
> _all_ of the surrounding text, the entire paragraph.

Which is clearly too expensive to actually give them; I didn't think
that even needed spelling out.

> Your patch also invokes the shaper twice, on the same 32 characters,
> once in encode_char method and again in the text_extents method, which
> is another waste.  The code in composite.c caches the composed
> characters to avoid that, but you bypass it.

Absolutely.
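
To make the double-shaping point concrete, here is a minimal Python
sketch of memoizing a shaper call so that two consumers of the same
chunk share one result. It is not the cache in composite.c and not the
patch under discussion: shape_chunk is a hypothetical stand-in for the
shaper, the fixed advance of 10 is made up, and encode_char and
text_extents only borrow the names of the two methods mentioned above.

```python
from functools import lru_cache

# Counts how often the (pretend) expensive shaper actually runs.
CALLS = {"count": 0}


@lru_cache(maxsize=1024)
def shape_chunk(font: str, chunk: str) -> tuple:
    """Hypothetical stand-in for a shaper call.

    Returns a tuple of (glyph_name, advance) pairs, one per character.
    A real shaper may return fewer or more glyphs than characters.
    """
    CALLS["count"] += 1
    return tuple((f"{font}:{ch}", 10) for ch in chunk)


def encode_char(font: str, chunk: str, index: int) -> str:
    """Glyph name for the character at INDEX, taken from the cached result."""
    return shape_chunk(font, chunk)[index][0]


def text_extents(font: str, chunk: str) -> int:
    """Total advance of the chunk, taken from the same cached result."""
    return sum(advance for _, advance in shape_chunk(font, chunk))
```

Calling encode_char and then text_extents on the same font and chunk
runs the stand-in shaper once; the second call is served from the
cache.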

> This is okay for showing the concept, but we cannot use this in
> production.  There are too many arbitrary decisions and inefficient
> expensive operations.

I agree, of course! In fact, the 32-character limit was chosen as a
reminder to myself that things would inherently be inefficient.

> > It doesn't involve composite.c at all, and that's good, because for
> > those tricky special cases composite.c does a better job than standard
> > shaping, and we need to keep that feature. It just shouldn't be the
> > regular route.
>
> Of course, you never tell how to distinguish between the "tricky
> special cases" for which we still need to use composite.c and friends,
> and the other kind.

The tricky special cases get handled as before, and come in with the
iterator .what set to IT_COMPOSITION. The standard cases come in with
.what set to IT_CHARACTER.
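
As a rough illustration of that split, the following Python sketch
dispatches on the kind of display element. The enum and both callbacks
are hypothetical names invented for the example; the real iterator
lives in C in xdisp.c and has more element kinds than these two.

```python
from enum import Enum, auto


class What(Enum):
    """Toy model of the iterator's .what field (only two kinds shown)."""
    CHARACTER = auto()
    COMPOSITION = auto()


def produce_glyphs(what, text, shape_with_context, compose_with_rules):
    """Route one display element to the matching glyph producer.

    compose_with_rules: the existing rule-based composition path.
    shape_with_context: the proposed path that hands context to the shaper.
    Both are caller-supplied callables taking the text and returning glyphs.
    """
    if what is What.COMPOSITION:
        return compose_with_rules(text)
    return shape_with_context(text)
```

The point of the sketch is only that the two paths never overlap: an
element already claimed by a composition rule does not reach the
context-based shaper.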

> Moreover, the HarfBuzz guys clearly say that what we do now is wrong
> for those "tricky" cases as well, so if we are going to fix that, why
> fix it only for ligatures made out of ASCII characters?

There's no such limitation, but, yes, ideally people would find they
don't need automatic compositions anymore...

> > > The crucial problem is that we currently perform layout decisions one
> > > grapheme cluster at a time, whereas what HarfBuzz people say is that
> > > we should basically do that one screen line at a time.
> >
> > I think we're going to have to compromise: that's why my patch used a
> > 32-character context rather than an entire line or just a single
> > character.
>
> If we are going to compromise, then why not compromise on what we
> already have, which is much less than 32 characters?

0 characters?

> Why should we
> enormously complicate and slow down our code without actually solving
> the problem?

We shouldn't.

> Did you ever see ligatures that are 32 characters long?

"Zapfino" is the longest I've seen.

> > Ideally, of course, in most real cases we'd use whitespace-delimited
> > words as chunks. That's mere optimization, though.
>
> That'd be the wrong optimization, AFAIK.

Sure, but since it is purely an optimization, performance
considerations alone will decide whether it is the right one.

> E.g., some scripts don't
> have whitespace separated words at all, and still need shaping.

Thus "most".

> And
> what exactly is whitespace for this purpose? e.g., does it include
> Unicode control characters such as ZWJ?

Thankfully, that doesn't matter much: it's just a question of what we
optimize for, not one of what the results will look like.

So I'd say " ", "\t", and "\n" are enough, which is what the display
engine already handles specially.
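
A minimal Python sketch of that chunking, assuming the three
delimiters above and a cap of 32 characters on either side of the
position for text that has no whitespace at all. chunk_bounds and its
parameters are hypothetical names, not anything in the Emacs sources.

```python
CHUNK_DELIMITERS = " \t\n"


def chunk_bounds(text: str, pos: int, max_len: int = 32):
    """Return (start, end) of the chunk to hand to the shaper for text[pos].

    The chunk is the delimiter-free run around POS, extended by at most
    MAX_LEN characters to the left of POS and at most MAX_LEN characters
    (counting POS itself) to the right.  A delimiter is its own chunk.
    """
    if text[pos] in CHUNK_DELIMITERS:
        return pos, pos + 1
    start = pos
    while (start > 0
           and text[start - 1] not in CHUNK_DELIMITERS
           and pos - start < max_len):
        start -= 1
    end = pos + 1
    while (end < len(text)
           and text[end] not in CHUNK_DELIMITERS
           and end - pos < max_len):
        end += 1
    return start, end
```

For example, with text = "if a != b then", chunk_bounds(text, 5)
covers exactly "!=". Moving the boundaries changes how much text each
shaper call sees and therefore the cost, which is the performance
question mentioned above.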

> > > A secondary (but important) problem is that character composition
> > > involves calls to Lisp, which is relatively slow.  This precludes
> > > calling the shaper for too many characters at once, too many times for
> > > each redisplay cycle of a window.
> >
> > I agree we shouldn't go through Lisp. My patch didn't.
>
> Your patch hard-codes arbitrary numbers without any way to control
> that from Lisp.

Yes.

> Such code will never fly in Emacs.

Of course not.

> > Calling the shaper less often is an important optimization, too. For
> > whitespace-delimited words, we only need to call it once.
>
> This doesn't work when the produced sequence of glyphs doesn't fit on
> the screen line.

> What the current layout code does in this case won't
> work well when you need to break a long sequence of glyphs in the
> middle and then continue on the next line from where you left off on
> this one.

You mean in visual-line-mode? Because what the current layout code
does by default is to break along any glyph boundary, and I don't see
how that's broken in any way.

> The longer the sequence of glyphs you get from the shaper
> in one go, the higher the probability of hitting this issue.

You break between the glyphs. It doesn't depend on whether you have
two or 20 or 100.
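
Here is a small Python sketch of what I mean by breaking between
glyphs, under the assumption that each shaped glyph (a ligature
included) is an indivisible box with a known advance. The function
name and the plain-integer advances are made up for the example; it
does not model word wrap or continuation glyphs.

```python
def break_into_lines(advances, line_width):
    """Split glyph indices into screen lines, breaking only between glyphs.

    ADVANCES is the list of glyph advances in display order.  A glyph
    wider than LINE_WIDTH still gets a line of its own.
    """
    lines, current, used = [], [], 0
    for index, advance in enumerate(advances):
        # Start a new line when this glyph would overflow a non-empty one.
        if current and used + advance > line_width:
            lines.append(current)
            current, used = [], 0
        current.append(index)
        used += advance
    if current:
        lines.append(current)
    return lines
```

The loop is the same whether the shaper returned the advances one at a
time or a hundred at once; only the list it walks is longer.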

> The bottom line of this is that I think you will find very quickly
> that the basic assumptions of the current design -- that we produce
> single glyphs or very short sequences of them for each call to the
> shaper -- that these assumptions bite you on every step, because the
> code which deals with layout implicitly assumes this.

The shaper interface I described would actually return a single glyph
for each top-level call, with a number of callbacks to provide
context. So that assumption would hold up very well indeed...

> In short, I really don't see how this could ever work, except in a
> very limited set of simple use cases.  E.g., what do you do with
> bidirectional text? ignore it?

A bidi boundary is a hard boundary for HarfBuzz, and no shaping
happens across it. Is that what you mean by "ignore it"?
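
To illustrate treating a direction change as a hard boundary, this
Python sketch splits text into same-direction runs before any shaping
would happen. It is a deliberate simplification and not the Unicode
Bidirectional Algorithm as implemented in bidi.c: it looks only at
strong character classes, lets neutral and weak characters join the
current run, and assumes a left-to-right default. directional_runs is
a hypothetical name.

```python
import unicodedata


def directional_runs(text: str):
    """Split TEXT into (direction, substring) runs.

    Strong right-to-left classes (R, AL) start or extend an "rtl" run,
    strong left-to-right (L) an "ltr" run.  Every other class stays in
    the run it follows, or counts as "ltr" at the start of the text.
    """
    runs = []
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):
            direction = "rtl"
        elif bidi_class == "L":
            direction = "ltr"
        else:
            direction = runs[-1][0] if runs else "ltr"
        if runs and runs[-1][0] == direction:
            runs[-1] = (direction, runs[-1][1] + ch)
        else:
            runs.append((direction, ch))
    return runs
```

Each run would then be shaped on its own, so no ligature or kerning
pair could span the boundary between two runs.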

> > > I don't think there's any disagreements on this high and abstract
> > > level.
> >
> > I think there are: if we treat fonts as programs, we need to let them
> > do their job, which involves kerning, substitutions, ligatures, and
> > even crazy stuff like randomizing the glyph used for each character to
> > get a more hand-written appearance. We don't need to know about
> > ligatures, we just let the font do it. No Lisp callbacks, just a call
> > to harfbuzz.
>
> I think this is a simplistic view of how the display engine works,

Quite possibly :-)

> and
> I don't see how it could work in production while supporting all the
> use cases we already do.

It only comes in for use cases not handled otherwise, i.e. those where
the iterator is at an IT_CHARACTER. All other use cases are
unaffected, because they mean we're overriding the font decision
anyway.

As I said, the problem I have is to get look-ahead working, which you
think isn't a problem. I've got an idea for it, but it doesn't work
(yet); my theory is the bidi.c code fails to keep its state in the
iterator and can't deal with multiple parallel iterators.

> I could be wrong, though, so I'm looking
> forward to seeing you present a series of patches that do support the
> existing use cases and the ligatures as well, and don't cause any
> slowdown in redisplay.

As I said, what's stopping me is the look-ahead problem, and in
particular some code in bidi.c that doesn't play along well with
look-ahead.
