From: Eli Zaretskii <eliz@gnu.org>
To: Robert Pluim <rpluim@gmail.com>
Cc: emacs-devel@gnu.org, kevin.legouguec@gmail.com
Subject: Re: Better emoji support
Date: Mon, 20 Sep 2021 22:42:29 +0300
Message-ID: <835yuv11ay.fsf@gnu.org>
In-Reply-To: <87pmt3hwoq.fsf@gmail.com> (message from Robert Pluim on Mon, 20 Sep 2021 21:30:13 +0200)
> From: Robert Pluim <rpluim@gmail.com>
> Cc: kevin.legouguec@gmail.com, emacs-devel@gnu.org
> Date: Mon, 20 Sep 2021 21:30:13 +0200
>
> >>>>> On Mon, 20 Sep 2021 21:54:57 +0300, Eli Zaretskii <eliz@gnu.org> said:
>
> Eli> for Emoji sequences in composition-function-table should be anchored
> Eli> on the VS-n codepoints (which I think is a good idea regardless).
> >>
> >> Weʼd have to raise the lookback limit for composition-function-table
> >> rules higher than 3 (maybe only to 4).
>
> Eli> Examples? Not that it's a catastrophe.
>
> From emoji-zwj-sequences.txt:
>
> 1F468 1F3FB 200D 2764 FE0F 200D 1F468 1F3FB ; RGI_Emoji_ZWJ_Sequence
> ; couple with heart: man, man, light skin tone #
> E13.1 [1] (👨🏻❤️👨🏻)
>
> With the current limit you'd get no further than the 1F3FB if you
> anchored at FE0F, and miss the 1F468.
Ah, that's a misunderstanding. I meant what I said only for sequences
that start with a non-emoji character. When the first character is
from the emoji script, we don't need anything special to have the
right font used.
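
To make the arithmetic in the quoted example concrete, here is a small
Python sketch. It is only an illustration: it treats "lookback" as a plain
index offset from the anchor character, which is an assumption and not a
description of how Emacs actually matches composition-function-table rules.
The helper names are hypothetical. It only applies if a rule anchored at
FE0F had to reach back to the start of such a sequence, which, as said
above, is not needed when the first character is from the emoji script.

```python
# Illustration only: "lookback" is modelled as a simple index offset
# from the anchor character (an assumption, not Emacs's real matcher).

# Codepoints of the quoted "couple with heart" ZWJ sequence.
SEQ = [0x1F468, 0x1F3FB, 0x200D, 0x2764, 0xFE0F, 0x200D, 0x1F468, 0x1F3FB]
VS16 = 0xFE0F


def earliest_reachable(seq, anchor, lookback):
    """Earliest codepoint reachable from the first ANCHOR in SEQ
    when at most LOOKBACK preceding characters may be examined."""
    anchor_index = seq.index(anchor)
    return seq[max(0, anchor_index - lookback)]


def min_lookback(seq, anchor):
    """Lookback needed to reach the start of SEQ from the first ANCHOR."""
    return seq.index(anchor)


if __name__ == "__main__":
    print(hex(earliest_reachable(SEQ, VS16, 3)))  # 0x1f3fb
    print(hex(earliest_reachable(SEQ, VS16, 4)))  # 0x1f468
    print(min_lookback(SEQ, VS16))                # 4
```

Under that simplified model, a lookback of 3 stops at 1F3FB and a lookback
of 4 reaches the leading 1F468, which matches the numbers in the quoted
text.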
> >> I guess it reduces the number of entries in
> >> composition-function-table, but then you end up with a lot of rules
> >> for eg VS-16.
>
> Eli> Why do you think we need to have a lot of such rules? What kind of
> Eli> rules did you think about?
>
> For whatever reason, a lot of the sequences in emoji-zwj-sequences.txt
> contain codepoints with Emoji_Presentation = No, hence theyʼre
> followed by VS-16. As a result, anchoring to VS-16 would produce a
> lot of rules for VS-16.
We don't need a separate rule for every sequence; we can use a regular
expression with character sets. We can even have regexps that match
more than emoji-zwj-sequences.txt specifies, since the font and the
shaping engine will sort that out and return a failure indication for
sequences that the font doesn't support.
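
A rough sketch of the "one regexp with character sets" idea, written in
Python rather than as an actual composition-function-table entry. The
character ranges below are loose approximations chosen for illustration,
not the real Unicode emoji property sets, and the names ELEMENT,
ZWJ_SEQUENCE and cps are hypothetical. The point is only that a single
pattern can cover a listed sequence and will also accept sequences that
are probably not in emoji-zwj-sequences.txt, leaving it to the font and
the shaping engine to reject what they don't support.

```python
import re

# One "element" of a ZWJ sequence: a base character from a broad,
# approximate emoji range, an optional skin-tone modifier
# (U+1F3FB..U+1F3FF), and an optional VS-16 (U+FE0F).
# These ranges are assumptions made for the sake of the example.
ELEMENT = r"[\u2600-\u27BF\U0001F300-\U0001FAFF][\U0001F3FB-\U0001F3FF]?\uFE0F?"

# Two or more elements joined by ZWJ (U+200D).
ZWJ_SEQUENCE = re.compile(rf"{ELEMENT}(?:\u200D{ELEMENT})+")


def cps(*codepoints):
    """Build a string from a list of codepoints."""
    return "".join(chr(cp) for cp in codepoints)


# The sequence quoted from emoji-zwj-sequences.txt.
listed = cps(0x1F468, 0x1F3FB, 0x200D, 0x2764, 0xFE0F,
             0x200D, 0x1F468, 0x1F3FB)

# A made-up sequence (fire ZWJ fire), assumed not to be in the data file.
made_up = cps(0x1F525, 0x200D, 0x1F525)

if __name__ == "__main__":
    print(bool(ZWJ_SEQUENCE.fullmatch(listed)))        # True
    print(bool(ZWJ_SEQUENCE.fullmatch(made_up)))       # True
    print(bool(ZWJ_SEQUENCE.fullmatch(cps(0x1F468))))  # False
```

The pattern deliberately over-matches; in this sketch nothing plays the
role of the font or the shaper, so the made-up sequence is accepted here
even though a real font would be expected to fail on it.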
> Anyway, we can measure the difference, if any, once we have the base
> implementation and Someone™ implements the VS-16 anchored version (it
> would only be a dozen lines of awk, I think).
Let's cross that bridge when we get to it.