unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
@ 2023-05-26  3:18 Steven Allen
  2023-05-26  6:41 ` Eli Zaretskii
  0 siblings, 1 reply; 61+ messages in thread
From: Steven Allen @ 2023-05-26  3:18 UTC (permalink / raw)
  To: 63731

[-- Attachment #1: Type: text/plain, Size: 639 bytes --]


This patch imports the full list from unicode.org instead of
special-casing a few characters as was done previously.
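
To illustrate the data format being imported, here is a minimal Python
sketch (not part of the patch, and not the logic of emoji-zwj.awk) that
extracts the base code points having an "emoji style" FE0F sequence. The
function name `emoji_style_bases` and the inline `SAMPLE` excerpt are
assumptions made for the example; the sample lines are copied from
emoji-variation-sequences.txt.

```python
# Hypothetical helper: collect base code points that have an FE0F sequence.
SAMPLE = """\
# Emoji Variation Sequences for UTS #51
0023 FE0E  ; text style;  # (1.1) NUMBER SIGN
0023 FE0F  ; emoji style; # (1.1) NUMBER SIGN
2714 FE0E  ; text style;  # (1.1) HEAVY CHECK MARK
2714 FE0F  ; emoji style; # (1.1) HEAVY CHECK MARK
1F44D FE0E ; text style;  # (6.0) THUMBS UP SIGN
1F44D FE0F ; emoji style; # (6.0) THUMBS UP SIGN
"""


def emoji_style_bases(text):
    """Return the base code points (as ints) that are followed by FE0F."""
    bases = []
    for line in text.splitlines():
        # Everything after the first '#' is a comment.
        data = line.split("#", 1)[0].strip()
        if not data:
            continue
        # First ';'-separated field holds the space-separated code points.
        codepoints = data.split(";")[0].split()
        if len(codepoints) == 2 and codepoints[1].upper() == "FE0F":
            bases.append(int(codepoints[0], 16))
    return bases


if __name__ == "__main__":
    for base in emoji_style_bases(SAMPLE):
        print(f"{base:04X} FE0F")
```

The FE0E ("text style") lines are skipped here because only the FE0F
sequences are relevant to emoji presentation.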

With this patch, '👍️' (1F44D FE0F) should look the same as '👍' (1F44D).
Without it, it will look like '👍‌️'.

As a simple regression test, '✔' (2714) should still display as "text"
while '✔️' (2714 FE0F) should still display as an emoji.
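
For anyone who wants to build these test strings by code point rather than
by pasting them, a minimal Python sketch follows. It only constructs and
describes the sequences; how they render depends on the display engine and
fonts. The helper names `with_vs16` and `describe` are assumptions for
illustration.

```python
import unicodedata

VS16 = "\uFE0F"  # VARIATION SELECTOR-16: requests emoji presentation


def with_vs16(base):
    """Return the base character followed by VARIATION SELECTOR-16."""
    return chr(base) + VS16


def describe(s):
    """Render a string as space-separated hexadecimal code points."""
    return " ".join(f"{ord(c):04X}" for c in s)


if __name__ == "__main__":
    for base in (0x1F44D, 0x2714):
        seq = with_vs16(base)
        print(describe(chr(base)), "vs", describe(seq),
              "-", unicodedata.name(chr(base)))
```

Inserting the resulting strings into a buffer with and without the patch
applied gives a quick visual comparison of the two cases described above.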

Fixes https://github.com/alphapapa/ement.el/issues/137

NOTE: I'm not a Unicode expert, nor do I understand how Emacs handles
Unicode (beyond what was required to implement this patch). But this
patch appears to work and I can't find any regressions.


[-- Attachment #2: 0001-Support-Emoji-Variation-Sequence-16-FE0F-where-appro.patch --]
[-- Type: text/x-patch, Size: 42713 bytes --]

From 7049a28f200739260f567f27b2a68b6a040b4a89 Mon Sep 17 00:00:00 2001
From: Steven Allen <steven@stebalien.com>
Date: Thu, 25 May 2023 18:30:14 -0700
Subject: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Import the full list from unicode.org instead of special-casing a few
characters.

With this patch, '👍️' (1F44D FE0F) should look the same as
'👍' (1F44D). Without it, it will look like '👍‌️'.

* admin/unidata/emoji-variation-sequences.txt: import the variation
sequences from unicode.org.
* admin/unidata/README: document the new file.
* admin/unidata/Makefile.in:
* admin/unidata/emoji-zwj.awk: parse FE0F sequences from
emoji-variation-sequences.txt
---
 admin/unidata/Makefile.in                   |   2 +-
 admin/unidata/README                        |   4 +
 admin/unidata/emoji-variation-sequences.txt | 723 ++++++++++++++++++++
 admin/unidata/emoji-zwj.awk                 |  33 +-
 4 files changed, 739 insertions(+), 23 deletions(-)
 create mode 100644 admin/unidata/emoji-variation-sequences.txt

diff --git a/admin/unidata/Makefile.in b/admin/unidata/Makefile.in
index cccd85213f1..7c9482504ab 100644
--- a/admin/unidata/Makefile.in
+++ b/admin/unidata/Makefile.in
@@ -116,7 +116,7 @@ emoji-zwj.el: ${unidir}/emoji-zwj.el
 
 zwj = ${srcdir}/emoji-zwj.awk
 
-${unidir}/emoji-zwj.el: ${srcdir}/emoji-zwj-sequences.txt $(srcdir)/emoji-sequences.txt ${zwj}
+${unidir}/emoji-zwj.el: ${srcdir}/emoji-zwj-sequences.txt ${srcdir}/emoji-sequences.txt ${srcdir}/emoji-variation-sequences.txt ${zwj}
 	$(AM_V_GEN)$(AWK) -f ${zwj} $^ > $@
 
 .PHONY: clean bootstrap-clean distclean maintainer-clean gen-clean
diff --git a/admin/unidata/README b/admin/unidata/README
index 2d421dfb6bf..a9743e18bea 100644
--- a/admin/unidata/README
+++ b/admin/unidata/README
@@ -64,3 +64,7 @@ https://www.unicode.org/Public/UCD/latest/ucd/PropertyValueAliases.txt
 IdnaMappingTable.txt
 https://www.unicode.org/Public/idna/latest/IdnaMappingTable.txt
 2022-01-18
+
+emoji-variation-sequences.txt
+https://www.unicode.org/Public/14.0.0/ucd/emoji/emoji-variation-sequences.txt
+2023-05-25
\ No newline at end of file
diff --git a/admin/unidata/emoji-variation-sequences.txt b/admin/unidata/emoji-variation-sequences.txt
new file mode 100644
index 00000000000..942377bf264
--- /dev/null
+++ b/admin/unidata/emoji-variation-sequences.txt
@@ -0,0 +1,723 @@
+# emoji-variation-sequences-14.0.0.txt
+# Date: 2021-06-08, 05:19:16 GMT
+# © 2021 Unicode®, Inc.
+# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
+# For terms of use, see http://www.unicode.org/terms_of_use.html
+#
+# Emoji Variation Sequences for UTS #51
+# Used with Emoji Version 14.0 and subsequent minor revisions (if any)
+#
+# For documentation and usage, see http://www.unicode.org/reports/tr51
+#
+0023 FE0E  ; text style;  # (1.1) NUMBER SIGN
+0023 FE0F  ; emoji style; # (1.1) NUMBER SIGN
+002A FE0E  ; text style;  # (1.1) ASTERISK
+002A FE0F  ; emoji style; # (1.1) ASTERISK
+0030 FE0E  ; text style;  # (1.1) DIGIT ZERO
+0030 FE0F  ; emoji style; # (1.1) DIGIT ZERO
+0031 FE0E  ; text style;  # (1.1) DIGIT ONE
+0031 FE0F  ; emoji style; # (1.1) DIGIT ONE
+0032 FE0E  ; text style;  # (1.1) DIGIT TWO
+0032 FE0F  ; emoji style; # (1.1) DIGIT TWO
+0033 FE0E  ; text style;  # (1.1) DIGIT THREE
+0033 FE0F  ; emoji style; # (1.1) DIGIT THREE
+0034 FE0E  ; text style;  # (1.1) DIGIT FOUR
+0034 FE0F  ; emoji style; # (1.1) DIGIT FOUR
+0035 FE0E  ; text style;  # (1.1) DIGIT FIVE
+0035 FE0F  ; emoji style; # (1.1) DIGIT FIVE
+0036 FE0E  ; text style;  # (1.1) DIGIT SIX
+0036 FE0F  ; emoji style; # (1.1) DIGIT SIX
+0037 FE0E  ; text style;  # (1.1) DIGIT SEVEN
+0037 FE0F  ; emoji style; # (1.1) DIGIT SEVEN
+0038 FE0E  ; text style;  # (1.1) DIGIT EIGHT
+0038 FE0F  ; emoji style; # (1.1) DIGIT EIGHT
+0039 FE0E  ; text style;  # (1.1) DIGIT NINE
+0039 FE0F  ; emoji style; # (1.1) DIGIT NINE
+00A9 FE0E  ; text style;  # (1.1) COPYRIGHT SIGN
+00A9 FE0F  ; emoji style; # (1.1) COPYRIGHT SIGN
+00AE FE0E  ; text style;  # (1.1) REGISTERED SIGN
+00AE FE0F  ; emoji style; # (1.1) REGISTERED SIGN
+203C FE0E  ; text style;  # (1.1) DOUBLE EXCLAMATION MARK
+203C FE0F  ; emoji style; # (1.1) DOUBLE EXCLAMATION MARK
+2049 FE0E  ; text style;  # (3.0) EXCLAMATION QUESTION MARK
+2049 FE0F  ; emoji style; # (3.0) EXCLAMATION QUESTION MARK
+2122 FE0E  ; text style;  # (1.1) TRADE MARK SIGN
+2122 FE0F  ; emoji style; # (1.1) TRADE MARK SIGN
+2139 FE0E  ; text style;  # (3.0) INFORMATION SOURCE
+2139 FE0F  ; emoji style; # (3.0) INFORMATION SOURCE
+2194 FE0E  ; text style;  # (1.1) LEFT RIGHT ARROW
+2194 FE0F  ; emoji style; # (1.1) LEFT RIGHT ARROW
+2195 FE0E  ; text style;  # (1.1) UP DOWN ARROW
+2195 FE0F  ; emoji style; # (1.1) UP DOWN ARROW
+2196 FE0E  ; text style;  # (1.1) NORTH WEST ARROW
+2196 FE0F  ; emoji style; # (1.1) NORTH WEST ARROW
+2197 FE0E  ; text style;  # (1.1) NORTH EAST ARROW
+2197 FE0F  ; emoji style; # (1.1) NORTH EAST ARROW
+2198 FE0E  ; text style;  # (1.1) SOUTH EAST ARROW
+2198 FE0F  ; emoji style; # (1.1) SOUTH EAST ARROW
+2199 FE0E  ; text style;  # (1.1) SOUTH WEST ARROW
+2199 FE0F  ; emoji style; # (1.1) SOUTH WEST ARROW
+21A9 FE0E  ; text style;  # (1.1) LEFTWARDS ARROW WITH HOOK
+21A9 FE0F  ; emoji style; # (1.1) LEFTWARDS ARROW WITH HOOK
+21AA FE0E  ; text style;  # (1.1) RIGHTWARDS ARROW WITH HOOK
+21AA FE0F  ; emoji style; # (1.1) RIGHTWARDS ARROW WITH HOOK
+231A FE0E  ; text style;  # (1.1) WATCH
+231A FE0F  ; emoji style; # (1.1) WATCH
+231B FE0E  ; text style;  # (1.1) HOURGLASS
+231B FE0F  ; emoji style; # (1.1) HOURGLASS
+2328 FE0E  ; text style;  # (1.1) KEYBOARD
+2328 FE0F  ; emoji style; # (1.1) KEYBOARD
+23CF FE0E  ; text style;  # (4.0) EJECT SYMBOL
+23CF FE0F  ; emoji style; # (4.0) EJECT SYMBOL
+23E9 FE0E  ; text style;  # (6.0) BLACK RIGHT-POINTING DOUBLE TRIANGLE
+23E9 FE0F  ; emoji style; # (6.0) BLACK RIGHT-POINTING DOUBLE TRIANGLE
+23EA FE0E  ; text style;  # (6.0) BLACK LEFT-POINTING DOUBLE TRIANGLE
+23EA FE0F  ; emoji style; # (6.0) BLACK LEFT-POINTING DOUBLE TRIANGLE
+23ED FE0E  ; text style;  # (6.0) BLACK RIGHT-POINTING DOUBLE TRIANGLE WITH VERTICAL BAR
+23ED FE0F  ; emoji style; # (6.0) BLACK RIGHT-POINTING DOUBLE TRIANGLE WITH VERTICAL BAR
+23EE FE0E  ; text style;  # (6.0) BLACK LEFT-POINTING DOUBLE TRIANGLE WITH VERTICAL BAR
+23EE FE0F  ; emoji style; # (6.0) BLACK LEFT-POINTING DOUBLE TRIANGLE WITH VERTICAL BAR
+23EF FE0E  ; text style;  # (6.0) BLACK RIGHT-POINTING TRIANGLE WITH DOUBLE VERTICAL BAR
+23EF FE0F  ; emoji style; # (6.0) BLACK RIGHT-POINTING TRIANGLE WITH DOUBLE VERTICAL BAR
+23F1 FE0E  ; text style;  # (6.0) STOPWATCH
+23F1 FE0F  ; emoji style; # (6.0) STOPWATCH
+23F2 FE0E  ; text style;  # (6.0) TIMER CLOCK
+23F2 FE0F  ; emoji style; # (6.0) TIMER CLOCK
+23F3 FE0E  ; text style;  # (6.0) HOURGLASS WITH FLOWING SAND
+23F3 FE0F  ; emoji style; # (6.0) HOURGLASS WITH FLOWING SAND
+23F8 FE0E  ; text style;  # (7.0) DOUBLE VERTICAL BAR
+23F8 FE0F  ; emoji style; # (7.0) DOUBLE VERTICAL BAR
+23F9 FE0E  ; text style;  # (7.0) BLACK SQUARE FOR STOP
+23F9 FE0F  ; emoji style; # (7.0) BLACK SQUARE FOR STOP
+23FA FE0E  ; text style;  # (7.0) BLACK CIRCLE FOR RECORD
+23FA FE0F  ; emoji style; # (7.0) BLACK CIRCLE FOR RECORD
+24C2 FE0E  ; text style;  # (1.1) CIRCLED LATIN CAPITAL LETTER M
+24C2 FE0F  ; emoji style; # (1.1) CIRCLED LATIN CAPITAL LETTER M
+25AA FE0E  ; text style;  # (1.1) BLACK SMALL SQUARE
+25AA FE0F  ; emoji style; # (1.1) BLACK SMALL SQUARE
+25AB FE0E  ; text style;  # (1.1) WHITE SMALL SQUARE
+25AB FE0F  ; emoji style; # (1.1) WHITE SMALL SQUARE
+25B6 FE0E  ; text style;  # (1.1) BLACK RIGHT-POINTING TRIANGLE
+25B6 FE0F  ; emoji style; # (1.1) BLACK RIGHT-POINTING TRIANGLE
+25C0 FE0E  ; text style;  # (1.1) BLACK LEFT-POINTING TRIANGLE
+25C0 FE0F  ; emoji style; # (1.1) BLACK LEFT-POINTING TRIANGLE
+25FB FE0E  ; text style;  # (3.2) WHITE MEDIUM SQUARE
+25FB FE0F  ; emoji style; # (3.2) WHITE MEDIUM SQUARE
+25FC FE0E  ; text style;  # (3.2) BLACK MEDIUM SQUARE
+25FC FE0F  ; emoji style; # (3.2) BLACK MEDIUM SQUARE
+25FD FE0E  ; text style;  # (3.2) WHITE MEDIUM SMALL SQUARE
+25FD FE0F  ; emoji style; # (3.2) WHITE MEDIUM SMALL SQUARE
+25FE FE0E  ; text style;  # (3.2) BLACK MEDIUM SMALL SQUARE
+25FE FE0F  ; emoji style; # (3.2) BLACK MEDIUM SMALL SQUARE
+2600 FE0E  ; text style;  # (1.1) BLACK SUN WITH RAYS
+2600 FE0F  ; emoji style; # (1.1) BLACK SUN WITH RAYS
+2601 FE0E  ; text style;  # (1.1) CLOUD
+2601 FE0F  ; emoji style; # (1.1) CLOUD
+2602 FE0E  ; text style;  # (1.1) UMBRELLA
+2602 FE0F  ; emoji style; # (1.1) UMBRELLA
+2603 FE0E  ; text style;  # (1.1) SNOWMAN
+2603 FE0F  ; emoji style; # (1.1) SNOWMAN
+2604 FE0E  ; text style;  # (1.1) COMET
+2604 FE0F  ; emoji style; # (1.1) COMET
+260E FE0E  ; text style;  # (1.1) BLACK TELEPHONE
+260E FE0F  ; emoji style; # (1.1) BLACK TELEPHONE
+2611 FE0E  ; text style;  # (1.1) BALLOT BOX WITH CHECK
+2611 FE0F  ; emoji style; # (1.1) BALLOT BOX WITH CHECK
+2614 FE0E  ; text style;  # (4.0) UMBRELLA WITH RAIN DROPS
+2614 FE0F  ; emoji style; # (4.0) UMBRELLA WITH RAIN DROPS
+2615 FE0E  ; text style;  # (4.0) HOT BEVERAGE
+2615 FE0F  ; emoji style; # (4.0) HOT BEVERAGE
+2618 FE0E  ; text style;  # (4.1) SHAMROCK
+2618 FE0F  ; emoji style; # (4.1) SHAMROCK
+261D FE0E  ; text style;  # (1.1) WHITE UP POINTING INDEX
+261D FE0F  ; emoji style; # (1.1) WHITE UP POINTING INDEX
+2620 FE0E  ; text style;  # (1.1) SKULL AND CROSSBONES
+2620 FE0F  ; emoji style; # (1.1) SKULL AND CROSSBONES
+2622 FE0E  ; text style;  # (1.1) RADIOACTIVE SIGN
+2622 FE0F  ; emoji style; # (1.1) RADIOACTIVE SIGN
+2623 FE0E  ; text style;  # (1.1) BIOHAZARD SIGN
+2623 FE0F  ; emoji style; # (1.1) BIOHAZARD SIGN
+2626 FE0E  ; text style;  # (1.1) ORTHODOX CROSS
+2626 FE0F  ; emoji style; # (1.1) ORTHODOX CROSS
+262A FE0E  ; text style;  # (1.1) STAR AND CRESCENT
+262A FE0F  ; emoji style; # (1.1) STAR AND CRESCENT
+262E FE0E  ; text style;  # (1.1) PEACE SYMBOL
+262E FE0F  ; emoji style; # (1.1) PEACE SYMBOL
+262F FE0E  ; text style;  # (1.1) YIN YANG
+262F FE0F  ; emoji style; # (1.1) YIN YANG
+2638 FE0E  ; text style;  # (1.1) WHEEL OF DHARMA
+2638 FE0F  ; emoji style; # (1.1) WHEEL OF DHARMA
+2639 FE0E  ; text style;  # (1.1) WHITE FROWNING FACE
+2639 FE0F  ; emoji style; # (1.1) WHITE FROWNING FACE
+263A FE0E  ; text style;  # (1.1) WHITE SMILING FACE
+263A FE0F  ; emoji style; # (1.1) WHITE SMILING FACE
+2640 FE0E  ; text style;  # (1.1) FEMALE SIGN
+2640 FE0F  ; emoji style; # (1.1) FEMALE SIGN
+2642 FE0E  ; text style;  # (1.1) MALE SIGN
+2642 FE0F  ; emoji style; # (1.1) MALE SIGN
+2648 FE0E  ; text style;  # (1.1) ARIES
+2648 FE0F  ; emoji style; # (1.1) ARIES
+2649 FE0E  ; text style;  # (1.1) TAURUS
+2649 FE0F  ; emoji style; # (1.1) TAURUS
+264A FE0E  ; text style;  # (1.1) GEMINI
+264A FE0F  ; emoji style; # (1.1) GEMINI
+264B FE0E  ; text style;  # (1.1) CANCER
+264B FE0F  ; emoji style; # (1.1) CANCER
+264C FE0E  ; text style;  # (1.1) LEO
+264C FE0F  ; emoji style; # (1.1) LEO
+264D FE0E  ; text style;  # (1.1) VIRGO
+264D FE0F  ; emoji style; # (1.1) VIRGO
+264E FE0E  ; text style;  # (1.1) LIBRA
+264E FE0F  ; emoji style; # (1.1) LIBRA
+264F FE0E  ; text style;  # (1.1) SCORPIUS
+264F FE0F  ; emoji style; # (1.1) SCORPIUS
+2650 FE0E  ; text style;  # (1.1) SAGITTARIUS
+2650 FE0F  ; emoji style; # (1.1) SAGITTARIUS
+2651 FE0E  ; text style;  # (1.1) CAPRICORN
+2651 FE0F  ; emoji style; # (1.1) CAPRICORN
+2652 FE0E  ; text style;  # (1.1) AQUARIUS
+2652 FE0F  ; emoji style; # (1.1) AQUARIUS
+2653 FE0E  ; text style;  # (1.1) PISCES
+2653 FE0F  ; emoji style; # (1.1) PISCES
+265F FE0E  ; text style;  # (1.1) BLACK CHESS PAWN
+265F FE0F  ; emoji style; # (1.1) BLACK CHESS PAWN
+2660 FE0E  ; text style;  # (1.1) BLACK SPADE SUIT
+2660 FE0F  ; emoji style; # (1.1) BLACK SPADE SUIT
+2663 FE0E  ; text style;  # (1.1) BLACK CLUB SUIT
+2663 FE0F  ; emoji style; # (1.1) BLACK CLUB SUIT
+2665 FE0E  ; text style;  # (1.1) BLACK HEART SUIT
+2665 FE0F  ; emoji style; # (1.1) BLACK HEART SUIT
+2666 FE0E  ; text style;  # (1.1) BLACK DIAMOND SUIT
+2666 FE0F  ; emoji style; # (1.1) BLACK DIAMOND SUIT
+2668 FE0E  ; text style;  # (1.1) HOT SPRINGS
+2668 FE0F  ; emoji style; # (1.1) HOT SPRINGS
+267B FE0E  ; text style;  # (3.2) BLACK UNIVERSAL RECYCLING SYMBOL
+267B FE0F  ; emoji style; # (3.2) BLACK UNIVERSAL RECYCLING SYMBOL
+267E FE0E  ; text style;  # (4.1) PERMANENT PAPER SIGN
+267E FE0F  ; emoji style; # (4.1) PERMANENT PAPER SIGN
+267F FE0E  ; text style;  # (4.1) WHEELCHAIR SYMBOL
+267F FE0F  ; emoji style; # (4.1) WHEELCHAIR SYMBOL
+2692 FE0E  ; text style;  # (4.1) HAMMER AND PICK
+2692 FE0F  ; emoji style; # (4.1) HAMMER AND PICK
+2693 FE0E  ; text style;  # (4.1) ANCHOR
+2693 FE0F  ; emoji style; # (4.1) ANCHOR
+2694 FE0E  ; text style;  # (4.1) CROSSED SWORDS
+2694 FE0F  ; emoji style; # (4.1) CROSSED SWORDS
+2695 FE0E  ; text style;  # (4.1) STAFF OF AESCULAPIUS
+2695 FE0F  ; emoji style; # (4.1) STAFF OF AESCULAPIUS
+2696 FE0E  ; text style;  # (4.1) SCALES
+2696 FE0F  ; emoji style; # (4.1) SCALES
+2697 FE0E  ; text style;  # (4.1) ALEMBIC
+2697 FE0F  ; emoji style; # (4.1) ALEMBIC
+2699 FE0E  ; text style;  # (4.1) GEAR
+2699 FE0F  ; emoji style; # (4.1) GEAR
+269B FE0E  ; text style;  # (4.1) ATOM SYMBOL
+269B FE0F  ; emoji style; # (4.1) ATOM SYMBOL
+269C FE0E  ; text style;  # (4.1) FLEUR-DE-LIS
+269C FE0F  ; emoji style; # (4.1) FLEUR-DE-LIS
+26A0 FE0E  ; text style;  # (4.0) WARNING SIGN
+26A0 FE0F  ; emoji style; # (4.0) WARNING SIGN
+26A1 FE0E  ; text style;  # (4.0) HIGH VOLTAGE SIGN
+26A1 FE0F  ; emoji style; # (4.0) HIGH VOLTAGE SIGN
+26A7 FE0E  ; text style;  # (4.1) MALE WITH STROKE AND MALE AND FEMALE SIGN
+26A7 FE0F  ; emoji style; # (4.1) MALE WITH STROKE AND MALE AND FEMALE SIGN
+26AA FE0E  ; text style;  # (4.1) MEDIUM WHITE CIRCLE
+26AA FE0F  ; emoji style; # (4.1) MEDIUM WHITE CIRCLE
+26AB FE0E  ; text style;  # (4.1) MEDIUM BLACK CIRCLE
+26AB FE0F  ; emoji style; # (4.1) MEDIUM BLACK CIRCLE
+26B0 FE0E  ; text style;  # (4.1) COFFIN
+26B0 FE0F  ; emoji style; # (4.1) COFFIN
+26B1 FE0E  ; text style;  # (4.1) FUNERAL URN
+26B1 FE0F  ; emoji style; # (4.1) FUNERAL URN
+26BD FE0E  ; text style;  # (5.2) SOCCER BALL
+26BD FE0F  ; emoji style; # (5.2) SOCCER BALL
+26BE FE0E  ; text style;  # (5.2) BASEBALL
+26BE FE0F  ; emoji style; # (5.2) BASEBALL
+26C4 FE0E  ; text style;  # (5.2) SNOWMAN WITHOUT SNOW
+26C4 FE0F  ; emoji style; # (5.2) SNOWMAN WITHOUT SNOW
+26C5 FE0E  ; text style;  # (5.2) SUN BEHIND CLOUD
+26C5 FE0F  ; emoji style; # (5.2) SUN BEHIND CLOUD
+26C8 FE0E  ; text style;  # (5.2) THUNDER CLOUD AND RAIN
+26C8 FE0F  ; emoji style; # (5.2) THUNDER CLOUD AND RAIN
+26CF FE0E  ; text style;  # (5.2) PICK
+26CF FE0F  ; emoji style; # (5.2) PICK
+26D1 FE0E  ; text style;  # (5.2) HELMET WITH WHITE CROSS
+26D1 FE0F  ; emoji style; # (5.2) HELMET WITH WHITE CROSS
+26D3 FE0E  ; text style;  # (5.2) CHAINS
+26D3 FE0F  ; emoji style; # (5.2) CHAINS
+26D4 FE0E  ; text style;  # (5.2) NO ENTRY
+26D4 FE0F  ; emoji style; # (5.2) NO ENTRY
+26E9 FE0E  ; text style;  # (5.2) SHINTO SHRINE
+26E9 FE0F  ; emoji style; # (5.2) SHINTO SHRINE
+26EA FE0E  ; text style;  # (5.2) CHURCH
+26EA FE0F  ; emoji style; # (5.2) CHURCH
+26F0 FE0E  ; text style;  # (5.2) MOUNTAIN
+26F0 FE0F  ; emoji style; # (5.2) MOUNTAIN
+26F1 FE0E  ; text style;  # (5.2) UMBRELLA ON GROUND
+26F1 FE0F  ; emoji style; # (5.2) UMBRELLA ON GROUND
+26F2 FE0E  ; text style;  # (5.2) FOUNTAIN
+26F2 FE0F  ; emoji style; # (5.2) FOUNTAIN
+26F3 FE0E  ; text style;  # (5.2) FLAG IN HOLE
+26F3 FE0F  ; emoji style; # (5.2) FLAG IN HOLE
+26F4 FE0E  ; text style;  # (5.2) FERRY
+26F4 FE0F  ; emoji style; # (5.2) FERRY
+26F5 FE0E  ; text style;  # (5.2) SAILBOAT
+26F5 FE0F  ; emoji style; # (5.2) SAILBOAT
+26F7 FE0E  ; text style;  # (5.2) SKIER
+26F7 FE0F  ; emoji style; # (5.2) SKIER
+26F8 FE0E  ; text style;  # (5.2) ICE SKATE
+26F8 FE0F  ; emoji style; # (5.2) ICE SKATE
+26F9 FE0E  ; text style;  # (5.2) PERSON WITH BALL
+26F9 FE0F  ; emoji style; # (5.2) PERSON WITH BALL
+26FA FE0E  ; text style;  # (5.2) TENT
+26FA FE0F  ; emoji style; # (5.2) TENT
+26FD FE0E  ; text style;  # (5.2) FUEL PUMP
+26FD FE0F  ; emoji style; # (5.2) FUEL PUMP
+2702 FE0E  ; text style;  # (1.1) BLACK SCISSORS
+2702 FE0F  ; emoji style; # (1.1) BLACK SCISSORS
+2708 FE0E  ; text style;  # (1.1) AIRPLANE
+2708 FE0F  ; emoji style; # (1.1) AIRPLANE
+2709 FE0E  ; text style;  # (1.1) ENVELOPE
+2709 FE0F  ; emoji style; # (1.1) ENVELOPE
+270C FE0E  ; text style;  # (1.1) VICTORY HAND
+270C FE0F  ; emoji style; # (1.1) VICTORY HAND
+270D FE0E  ; text style;  # (1.1) WRITING HAND
+270D FE0F  ; emoji style; # (1.1) WRITING HAND
+270F FE0E  ; text style;  # (1.1) PENCIL
+270F FE0F  ; emoji style; # (1.1) PENCIL
+2712 FE0E  ; text style;  # (1.1) BLACK NIB
+2712 FE0F  ; emoji style; # (1.1) BLACK NIB
+2714 FE0E  ; text style;  # (1.1) HEAVY CHECK MARK
+2714 FE0F  ; emoji style; # (1.1) HEAVY CHECK MARK
+2716 FE0E  ; text style;  # (1.1) HEAVY MULTIPLICATION X
+2716 FE0F  ; emoji style; # (1.1) HEAVY MULTIPLICATION X
+271D FE0E  ; text style;  # (1.1) LATIN CROSS
+271D FE0F  ; emoji style; # (1.1) LATIN CROSS
+2721 FE0E  ; text style;  # (1.1) STAR OF DAVID
+2721 FE0F  ; emoji style; # (1.1) STAR OF DAVID
+2733 FE0E  ; text style;  # (1.1) EIGHT SPOKED ASTERISK
+2733 FE0F  ; emoji style; # (1.1) EIGHT SPOKED ASTERISK
+2734 FE0E  ; text style;  # (1.1) EIGHT POINTED BLACK STAR
+2734 FE0F  ; emoji style; # (1.1) EIGHT POINTED BLACK STAR
+2744 FE0E  ; text style;  # (1.1) SNOWFLAKE
+2744 FE0F  ; emoji style; # (1.1) SNOWFLAKE
+2747 FE0E  ; text style;  # (1.1) SPARKLE
+2747 FE0F  ; emoji style; # (1.1) SPARKLE
+2753 FE0E  ; text style;  # (6.0) BLACK QUESTION MARK ORNAMENT
+2753 FE0F  ; emoji style; # (6.0) BLACK QUESTION MARK ORNAMENT
+2757 FE0E  ; text style;  # (5.2) HEAVY EXCLAMATION MARK SYMBOL
+2757 FE0F  ; emoji style; # (5.2) HEAVY EXCLAMATION MARK SYMBOL
+2763 FE0E  ; text style;  # (1.1) HEAVY HEART EXCLAMATION MARK ORNAMENT
+2763 FE0F  ; emoji style; # (1.1) HEAVY HEART EXCLAMATION MARK ORNAMENT
+2764 FE0E  ; text style;  # (1.1) HEAVY BLACK HEART
+2764 FE0F  ; emoji style; # (1.1) HEAVY BLACK HEART
+27A1 FE0E  ; text style;  # (1.1) BLACK RIGHTWARDS ARROW
+27A1 FE0F  ; emoji style; # (1.1) BLACK RIGHTWARDS ARROW
+2934 FE0E  ; text style;  # (3.2) ARROW POINTING RIGHTWARDS THEN CURVING UPWARDS
+2934 FE0F  ; emoji style; # (3.2) ARROW POINTING RIGHTWARDS THEN CURVING UPWARDS
+2935 FE0E  ; text style;  # (3.2) ARROW POINTING RIGHTWARDS THEN CURVING DOWNWARDS
+2935 FE0F  ; emoji style; # (3.2) ARROW POINTING RIGHTWARDS THEN CURVING DOWNWARDS
+2B05 FE0E  ; text style;  # (4.0) LEFTWARDS BLACK ARROW
+2B05 FE0F  ; emoji style; # (4.0) LEFTWARDS BLACK ARROW
+2B06 FE0E  ; text style;  # (4.0) UPWARDS BLACK ARROW
+2B06 FE0F  ; emoji style; # (4.0) UPWARDS BLACK ARROW
+2B07 FE0E  ; text style;  # (4.0) DOWNWARDS BLACK ARROW
+2B07 FE0F  ; emoji style; # (4.0) DOWNWARDS BLACK ARROW
+2B1B FE0E  ; text style;  # (5.1) BLACK LARGE SQUARE
+2B1B FE0F  ; emoji style; # (5.1) BLACK LARGE SQUARE
+2B1C FE0E  ; text style;  # (5.1) WHITE LARGE SQUARE
+2B1C FE0F  ; emoji style; # (5.1) WHITE LARGE SQUARE
+2B50 FE0E  ; text style;  # (5.1) WHITE MEDIUM STAR
+2B50 FE0F  ; emoji style; # (5.1) WHITE MEDIUM STAR
+2B55 FE0E  ; text style;  # (5.2) HEAVY LARGE CIRCLE
+2B55 FE0F  ; emoji style; # (5.2) HEAVY LARGE CIRCLE
+3030 FE0E  ; text style;  # (1.1) WAVY DASH
+3030 FE0F  ; emoji style; # (1.1) WAVY DASH
+303D FE0E  ; text style;  # (3.2) PART ALTERNATION MARK
+303D FE0F  ; emoji style; # (3.2) PART ALTERNATION MARK
+3297 FE0E  ; text style;  # (1.1) CIRCLED IDEOGRAPH CONGRATULATION
+3297 FE0F  ; emoji style; # (1.1) CIRCLED IDEOGRAPH CONGRATULATION
+3299 FE0E  ; text style;  # (1.1) CIRCLED IDEOGRAPH SECRET
+3299 FE0F  ; emoji style; # (1.1) CIRCLED IDEOGRAPH SECRET
+1F004 FE0E ; text style;  # (5.1) MAHJONG TILE RED DRAGON
+1F004 FE0F ; emoji style; # (5.1) MAHJONG TILE RED DRAGON
+1F170 FE0E ; text style;  # (6.0) NEGATIVE SQUARED LATIN CAPITAL LETTER A
+1F170 FE0F ; emoji style; # (6.0) NEGATIVE SQUARED LATIN CAPITAL LETTER A
+1F171 FE0E ; text style;  # (6.0) NEGATIVE SQUARED LATIN CAPITAL LETTER B
+1F171 FE0F ; emoji style; # (6.0) NEGATIVE SQUARED LATIN CAPITAL LETTER B
+1F17E FE0E ; text style;  # (6.0) NEGATIVE SQUARED LATIN CAPITAL LETTER O
+1F17E FE0F ; emoji style; # (6.0) NEGATIVE SQUARED LATIN CAPITAL LETTER O
+1F17F FE0E ; text style;  # (5.2) NEGATIVE SQUARED LATIN CAPITAL LETTER P
+1F17F FE0F ; emoji style; # (5.2) NEGATIVE SQUARED LATIN CAPITAL LETTER P
+1F202 FE0E ; text style;  # (6.0) SQUARED KATAKANA SA
+1F202 FE0F ; emoji style; # (6.0) SQUARED KATAKANA SA
+1F21A FE0E ; text style;  # (5.2) SQUARED CJK UNIFIED IDEOGRAPH-7121
+1F21A FE0F ; emoji style; # (5.2) SQUARED CJK UNIFIED IDEOGRAPH-7121
+1F22F FE0E ; text style;  # (5.2) SQUARED CJK UNIFIED IDEOGRAPH-6307
+1F22F FE0F ; emoji style; # (5.2) SQUARED CJK UNIFIED IDEOGRAPH-6307
+1F237 FE0E ; text style;  # (6.0) SQUARED CJK UNIFIED IDEOGRAPH-6708
+1F237 FE0F ; emoji style; # (6.0) SQUARED CJK UNIFIED IDEOGRAPH-6708
+1F30D FE0E ; text style;  # (6.0) EARTH GLOBE EUROPE-AFRICA
+1F30D FE0F ; emoji style; # (6.0) EARTH GLOBE EUROPE-AFRICA
+1F30E FE0E ; text style;  # (6.0) EARTH GLOBE AMERICAS
+1F30E FE0F ; emoji style; # (6.0) EARTH GLOBE AMERICAS
+1F30F FE0E ; text style;  # (6.0) EARTH GLOBE ASIA-AUSTRALIA
+1F30F FE0F ; emoji style; # (6.0) EARTH GLOBE ASIA-AUSTRALIA
+1F315 FE0E ; text style;  # (6.0) FULL MOON SYMBOL
+1F315 FE0F ; emoji style; # (6.0) FULL MOON SYMBOL
+1F31C FE0E ; text style;  # (6.0) LAST QUARTER MOON WITH FACE
+1F31C FE0F ; emoji style; # (6.0) LAST QUARTER MOON WITH FACE
+1F321 FE0E ; text style;  # (7.0) THERMOMETER
+1F321 FE0F ; emoji style; # (7.0) THERMOMETER
+1F324 FE0E ; text style;  # (7.0) WHITE SUN WITH SMALL CLOUD
+1F324 FE0F ; emoji style; # (7.0) WHITE SUN WITH SMALL CLOUD
+1F325 FE0E ; text style;  # (7.0) WHITE SUN BEHIND CLOUD
+1F325 FE0F ; emoji style; # (7.0) WHITE SUN BEHIND CLOUD
+1F326 FE0E ; text style;  # (7.0) WHITE SUN BEHIND CLOUD WITH RAIN
+1F326 FE0F ; emoji style; # (7.0) WHITE SUN BEHIND CLOUD WITH RAIN
+1F327 FE0E ; text style;  # (7.0) CLOUD WITH RAIN
+1F327 FE0F ; emoji style; # (7.0) CLOUD WITH RAIN
+1F328 FE0E ; text style;  # (7.0) CLOUD WITH SNOW
+1F328 FE0F ; emoji style; # (7.0) CLOUD WITH SNOW
+1F329 FE0E ; text style;  # (7.0) CLOUD WITH LIGHTNING
+1F329 FE0F ; emoji style; # (7.0) CLOUD WITH LIGHTNING
+1F32A FE0E ; text style;  # (7.0) CLOUD WITH TORNADO
+1F32A FE0F ; emoji style; # (7.0) CLOUD WITH TORNADO
+1F32B FE0E ; text style;  # (7.0) FOG
+1F32B FE0F ; emoji style; # (7.0) FOG
+1F32C FE0E ; text style;  # (7.0) WIND BLOWING FACE
+1F32C FE0F ; emoji style; # (7.0) WIND BLOWING FACE
+1F336 FE0E ; text style;  # (7.0) HOT PEPPER
+1F336 FE0F ; emoji style; # (7.0) HOT PEPPER
+1F378 FE0E ; text style;  # (6.0) COCKTAIL GLASS
+1F378 FE0F ; emoji style; # (6.0) COCKTAIL GLASS
+1F37D FE0E ; text style;  # (7.0) FORK AND KNIFE WITH PLATE
+1F37D FE0F ; emoji style; # (7.0) FORK AND KNIFE WITH PLATE
+1F393 FE0E ; text style;  # (6.0) GRADUATION CAP
+1F393 FE0F ; emoji style; # (6.0) GRADUATION CAP
+1F396 FE0E ; text style;  # (7.0) MILITARY MEDAL
+1F396 FE0F ; emoji style; # (7.0) MILITARY MEDAL
+1F397 FE0E ; text style;  # (7.0) REMINDER RIBBON
+1F397 FE0F ; emoji style; # (7.0) REMINDER RIBBON
+1F399 FE0E ; text style;  # (7.0) STUDIO MICROPHONE
+1F399 FE0F ; emoji style; # (7.0) STUDIO MICROPHONE
+1F39A FE0E ; text style;  # (7.0) LEVEL SLIDER
+1F39A FE0F ; emoji style; # (7.0) LEVEL SLIDER
+1F39B FE0E ; text style;  # (7.0) CONTROL KNOBS
+1F39B FE0F ; emoji style; # (7.0) CONTROL KNOBS
+1F39E FE0E ; text style;  # (7.0) FILM FRAMES
+1F39E FE0F ; emoji style; # (7.0) FILM FRAMES
+1F39F FE0E ; text style;  # (7.0) ADMISSION TICKETS
+1F39F FE0F ; emoji style; # (7.0) ADMISSION TICKETS
+1F3A7 FE0E ; text style;  # (6.0) HEADPHONE
+1F3A7 FE0F ; emoji style; # (6.0) HEADPHONE
+1F3AC FE0E ; text style;  # (6.0) CLAPPER BOARD
+1F3AC FE0F ; emoji style; # (6.0) CLAPPER BOARD
+1F3AD FE0E ; text style;  # (6.0) PERFORMING ARTS
+1F3AD FE0F ; emoji style; # (6.0) PERFORMING ARTS
+1F3AE FE0E ; text style;  # (6.0) VIDEO GAME
+1F3AE FE0F ; emoji style; # (6.0) VIDEO GAME
+1F3C2 FE0E ; text style;  # (6.0) SNOWBOARDER
+1F3C2 FE0F ; emoji style; # (6.0) SNOWBOARDER
+1F3C4 FE0E ; text style;  # (6.0) SURFER
+1F3C4 FE0F ; emoji style; # (6.0) SURFER
+1F3C6 FE0E ; text style;  # (6.0) TROPHY
+1F3C6 FE0F ; emoji style; # (6.0) TROPHY
+1F3CA FE0E ; text style;  # (6.0) SWIMMER
+1F3CA FE0F ; emoji style; # (6.0) SWIMMER
+1F3CB FE0E ; text style;  # (7.0) WEIGHT LIFTER
+1F3CB FE0F ; emoji style; # (7.0) WEIGHT LIFTER
+1F3CC FE0E ; text style;  # (7.0) GOLFER
+1F3CC FE0F ; emoji style; # (7.0) GOLFER
+1F3CD FE0E ; text style;  # (7.0) RACING MOTORCYCLE
+1F3CD FE0F ; emoji style; # (7.0) RACING MOTORCYCLE
+1F3CE FE0E ; text style;  # (7.0) RACING CAR
+1F3CE FE0F ; emoji style; # (7.0) RACING CAR
+1F3D4 FE0E ; text style;  # (7.0) SNOW CAPPED MOUNTAIN
+1F3D4 FE0F ; emoji style; # (7.0) SNOW CAPPED MOUNTAIN
+1F3D5 FE0E ; text style;  # (7.0) CAMPING
+1F3D5 FE0F ; emoji style; # (7.0) CAMPING
+1F3D6 FE0E ; text style;  # (7.0) BEACH WITH UMBRELLA
+1F3D6 FE0F ; emoji style; # (7.0) BEACH WITH UMBRELLA
+1F3D7 FE0E ; text style;  # (7.0) BUILDING CONSTRUCTION
+1F3D7 FE0F ; emoji style; # (7.0) BUILDING CONSTRUCTION
+1F3D8 FE0E ; text style;  # (7.0) HOUSE BUILDINGS
+1F3D8 FE0F ; emoji style; # (7.0) HOUSE BUILDINGS
+1F3D9 FE0E ; text style;  # (7.0) CITYSCAPE
+1F3D9 FE0F ; emoji style; # (7.0) CITYSCAPE
+1F3DA FE0E ; text style;  # (7.0) DERELICT HOUSE BUILDING
+1F3DA FE0F ; emoji style; # (7.0) DERELICT HOUSE BUILDING
+1F3DB FE0E ; text style;  # (7.0) CLASSICAL BUILDING
+1F3DB FE0F ; emoji style; # (7.0) CLASSICAL BUILDING
+1F3DC FE0E ; text style;  # (7.0) DESERT
+1F3DC FE0F ; emoji style; # (7.0) DESERT
+1F3DD FE0E ; text style;  # (7.0) DESERT ISLAND
+1F3DD FE0F ; emoji style; # (7.0) DESERT ISLAND
+1F3DE FE0E ; text style;  # (7.0) NATIONAL PARK
+1F3DE FE0F ; emoji style; # (7.0) NATIONAL PARK
+1F3DF FE0E ; text style;  # (7.0) STADIUM
+1F3DF FE0F ; emoji style; # (7.0) STADIUM
+1F3E0 FE0E ; text style;  # (6.0) HOUSE BUILDING
+1F3E0 FE0F ; emoji style; # (6.0) HOUSE BUILDING
+1F3ED FE0E ; text style;  # (6.0) FACTORY
+1F3ED FE0F ; emoji style; # (6.0) FACTORY
+1F3F3 FE0E ; text style;  # (7.0) WAVING WHITE FLAG
+1F3F3 FE0F ; emoji style; # (7.0) WAVING WHITE FLAG
+1F3F5 FE0E ; text style;  # (7.0) ROSETTE
+1F3F5 FE0F ; emoji style; # (7.0) ROSETTE
+1F3F7 FE0E ; text style;  # (7.0) LABEL
+1F3F7 FE0F ; emoji style; # (7.0) LABEL
+1F408 FE0E ; text style;  # (6.0) CAT
+1F408 FE0F ; emoji style; # (6.0) CAT
+1F415 FE0E ; text style;  # (6.0) DOG
+1F415 FE0F ; emoji style; # (6.0) DOG
+1F41F FE0E ; text style;  # (6.0) FISH
+1F41F FE0F ; emoji style; # (6.0) FISH
+1F426 FE0E ; text style;  # (6.0) BIRD
+1F426 FE0F ; emoji style; # (6.0) BIRD
+1F43F FE0E ; text style;  # (7.0) CHIPMUNK
+1F43F FE0F ; emoji style; # (7.0) CHIPMUNK
+1F441 FE0E ; text style;  # (7.0) EYE
+1F441 FE0F ; emoji style; # (7.0) EYE
+1F442 FE0E ; text style;  # (6.0) EAR
+1F442 FE0F ; emoji style; # (6.0) EAR
+1F446 FE0E ; text style;  # (6.0) WHITE UP POINTING BACKHAND INDEX
+1F446 FE0F ; emoji style; # (6.0) WHITE UP POINTING BACKHAND INDEX
+1F447 FE0E ; text style;  # (6.0) WHITE DOWN POINTING BACKHAND INDEX
+1F447 FE0F ; emoji style; # (6.0) WHITE DOWN POINTING BACKHAND INDEX
+1F448 FE0E ; text style;  # (6.0) WHITE LEFT POINTING BACKHAND INDEX
+1F448 FE0F ; emoji style; # (6.0) WHITE LEFT POINTING BACKHAND INDEX
+1F449 FE0E ; text style;  # (6.0) WHITE RIGHT POINTING BACKHAND INDEX
+1F449 FE0F ; emoji style; # (6.0) WHITE RIGHT POINTING BACKHAND INDEX
+1F44D FE0E ; text style;  # (6.0) THUMBS UP SIGN
+1F44D FE0F ; emoji style; # (6.0) THUMBS UP SIGN
+1F44E FE0E ; text style;  # (6.0) THUMBS DOWN SIGN
+1F44E FE0F ; emoji style; # (6.0) THUMBS DOWN SIGN
+1F453 FE0E ; text style;  # (6.0) EYEGLASSES
+1F453 FE0F ; emoji style; # (6.0) EYEGLASSES
+1F46A FE0E ; text style;  # (6.0) FAMILY
+1F46A FE0F ; emoji style; # (6.0) FAMILY
+1F47D FE0E ; text style;  # (6.0) EXTRATERRESTRIAL ALIEN
+1F47D FE0F ; emoji style; # (6.0) EXTRATERRESTRIAL ALIEN
+1F4A3 FE0E ; text style;  # (6.0) BOMB
+1F4A3 FE0F ; emoji style; # (6.0) BOMB
+1F4B0 FE0E ; text style;  # (6.0) MONEY BAG
+1F4B0 FE0F ; emoji style; # (6.0) MONEY BAG
+1F4B3 FE0E ; text style;  # (6.0) CREDIT CARD
+1F4B3 FE0F ; emoji style; # (6.0) CREDIT CARD
+1F4BB FE0E ; text style;  # (6.0) PERSONAL COMPUTER
+1F4BB FE0F ; emoji style; # (6.0) PERSONAL COMPUTER
+1F4BF FE0E ; text style;  # (6.0) OPTICAL DISC
+1F4BF FE0F ; emoji style; # (6.0) OPTICAL DISC
+1F4CB FE0E ; text style;  # (6.0) CLIPBOARD
+1F4CB FE0F ; emoji style; # (6.0) CLIPBOARD
+1F4DA FE0E ; text style;  # (6.0) BOOKS
+1F4DA FE0F ; emoji style; # (6.0) BOOKS
+1F4DF FE0E ; text style;  # (6.0) PAGER
+1F4DF FE0F ; emoji style; # (6.0) PAGER
+1F4E4 FE0E ; text style;  # (6.0) OUTBOX TRAY
+1F4E4 FE0F ; emoji style; # (6.0) OUTBOX TRAY
+1F4E5 FE0E ; text style;  # (6.0) INBOX TRAY
+1F4E5 FE0F ; emoji style; # (6.0) INBOX TRAY
+1F4E6 FE0E ; text style;  # (6.0) PACKAGE
+1F4E6 FE0F ; emoji style; # (6.0) PACKAGE
+1F4EA FE0E ; text style;  # (6.0) CLOSED MAILBOX WITH LOWERED FLAG
+1F4EA FE0F ; emoji style; # (6.0) CLOSED MAILBOX WITH LOWERED FLAG
+1F4EB FE0E ; text style;  # (6.0) CLOSED MAILBOX WITH RAISED FLAG
+1F4EB FE0F ; emoji style; # (6.0) CLOSED MAILBOX WITH RAISED FLAG
+1F4EC FE0E ; text style;  # (6.0) OPEN MAILBOX WITH RAISED FLAG
+1F4EC FE0F ; emoji style; # (6.0) OPEN MAILBOX WITH RAISED FLAG
+1F4ED FE0E ; text style;  # (6.0) OPEN MAILBOX WITH LOWERED FLAG
+1F4ED FE0F ; emoji style; # (6.0) OPEN MAILBOX WITH LOWERED FLAG
+1F4F7 FE0E ; text style;  # (6.0) CAMERA
+1F4F7 FE0F ; emoji style; # (6.0) CAMERA
+1F4F9 FE0E ; text style;  # (6.0) VIDEO CAMERA
+1F4F9 FE0F ; emoji style; # (6.0) VIDEO CAMERA
+1F4FA FE0E ; text style;  # (6.0) TELEVISION
+1F4FA FE0F ; emoji style; # (6.0) TELEVISION
+1F4FB FE0E ; text style;  # (6.0) RADIO
+1F4FB FE0F ; emoji style; # (6.0) RADIO
+1F4FD FE0E ; text style;  # (7.0) FILM PROJECTOR
+1F4FD FE0F ; emoji style; # (7.0) FILM PROJECTOR
+1F508 FE0E ; text style;  # (6.0) SPEAKER
+1F508 FE0F ; emoji style; # (6.0) SPEAKER
+1F50D FE0E ; text style;  # (6.0) LEFT-POINTING MAGNIFYING GLASS
+1F50D FE0F ; emoji style; # (6.0) LEFT-POINTING MAGNIFYING GLASS
+1F512 FE0E ; text style;  # (6.0) LOCK
+1F512 FE0F ; emoji style; # (6.0) LOCK
+1F513 FE0E ; text style;  # (6.0) OPEN LOCK
+1F513 FE0F ; emoji style; # (6.0) OPEN LOCK
+1F549 FE0E ; text style;  # (7.0) OM SYMBOL
+1F549 FE0F ; emoji style; # (7.0) OM SYMBOL
+1F54A FE0E ; text style;  # (7.0) DOVE OF PEACE
+1F54A FE0F ; emoji style; # (7.0) DOVE OF PEACE
+1F550 FE0E ; text style;  # (6.0) CLOCK FACE ONE OCLOCK
+1F550 FE0F ; emoji style; # (6.0) CLOCK FACE ONE OCLOCK
+1F551 FE0E ; text style;  # (6.0) CLOCK FACE TWO OCLOCK
+1F551 FE0F ; emoji style; # (6.0) CLOCK FACE TWO OCLOCK
+1F552 FE0E ; text style;  # (6.0) CLOCK FACE THREE OCLOCK
+1F552 FE0F ; emoji style; # (6.0) CLOCK FACE THREE OCLOCK
+1F553 FE0E ; text style;  # (6.0) CLOCK FACE FOUR OCLOCK
+1F553 FE0F ; emoji style; # (6.0) CLOCK FACE FOUR OCLOCK
+1F554 FE0E ; text style;  # (6.0) CLOCK FACE FIVE OCLOCK
+1F554 FE0F ; emoji style; # (6.0) CLOCK FACE FIVE OCLOCK
+1F555 FE0E ; text style;  # (6.0) CLOCK FACE SIX OCLOCK
+1F555 FE0F ; emoji style; # (6.0) CLOCK FACE SIX OCLOCK
+1F556 FE0E ; text style;  # (6.0) CLOCK FACE SEVEN OCLOCK
+1F556 FE0F ; emoji style; # (6.0) CLOCK FACE SEVEN OCLOCK
+1F557 FE0E ; text style;  # (6.0) CLOCK FACE EIGHT OCLOCK
+1F557 FE0F ; emoji style; # (6.0) CLOCK FACE EIGHT OCLOCK
+1F558 FE0E ; text style;  # (6.0) CLOCK FACE NINE OCLOCK
+1F558 FE0F ; emoji style; # (6.0) CLOCK FACE NINE OCLOCK
+1F559 FE0E ; text style;  # (6.0) CLOCK FACE TEN OCLOCK
+1F559 FE0F ; emoji style; # (6.0) CLOCK FACE TEN OCLOCK
+1F55A FE0E ; text style;  # (6.0) CLOCK FACE ELEVEN OCLOCK
+1F55A FE0F ; emoji style; # (6.0) CLOCK FACE ELEVEN OCLOCK
+1F55B FE0E ; text style;  # (6.0) CLOCK FACE TWELVE OCLOCK
+1F55B FE0F ; emoji style; # (6.0) CLOCK FACE TWELVE OCLOCK
+1F55C FE0E ; text style;  # (6.0) CLOCK FACE ONE-THIRTY
+1F55C FE0F ; emoji style; # (6.0) CLOCK FACE ONE-THIRTY
+1F55D FE0E ; text style;  # (6.0) CLOCK FACE TWO-THIRTY
+1F55D FE0F ; emoji style; # (6.0) CLOCK FACE TWO-THIRTY
+1F55E FE0E ; text style;  # (6.0) CLOCK FACE THREE-THIRTY
+1F55E FE0F ; emoji style; # (6.0) CLOCK FACE THREE-THIRTY
+1F55F FE0E ; text style;  # (6.0) CLOCK FACE FOUR-THIRTY
+1F55F FE0F ; emoji style; # (6.0) CLOCK FACE FOUR-THIRTY
+1F560 FE0E ; text style;  # (6.0) CLOCK FACE FIVE-THIRTY
+1F560 FE0F ; emoji style; # (6.0) CLOCK FACE FIVE-THIRTY
+1F561 FE0E ; text style;  # (6.0) CLOCK FACE SIX-THIRTY
+1F561 FE0F ; emoji style; # (6.0) CLOCK FACE SIX-THIRTY
+1F562 FE0E ; text style;  # (6.0) CLOCK FACE SEVEN-THIRTY
+1F562 FE0F ; emoji style; # (6.0) CLOCK FACE SEVEN-THIRTY
+1F563 FE0E ; text style;  # (6.0) CLOCK FACE EIGHT-THIRTY
+1F563 FE0F ; emoji style; # (6.0) CLOCK FACE EIGHT-THIRTY
+1F564 FE0E ; text style;  # (6.0) CLOCK FACE NINE-THIRTY
+1F564 FE0F ; emoji style; # (6.0) CLOCK FACE NINE-THIRTY
+1F565 FE0E ; text style;  # (6.0) CLOCK FACE TEN-THIRTY
+1F565 FE0F ; emoji style; # (6.0) CLOCK FACE TEN-THIRTY
+1F566 FE0E ; text style;  # (6.0) CLOCK FACE ELEVEN-THIRTY
+1F566 FE0F ; emoji style; # (6.0) CLOCK FACE ELEVEN-THIRTY
+1F567 FE0E ; text style;  # (6.0) CLOCK FACE TWELVE-THIRTY
+1F567 FE0F ; emoji style; # (6.0) CLOCK FACE TWELVE-THIRTY
+1F56F FE0E ; text style;  # (7.0) CANDLE
+1F56F FE0F ; emoji style; # (7.0) CANDLE
+1F570 FE0E ; text style;  # (7.0) MANTELPIECE CLOCK
+1F570 FE0F ; emoji style; # (7.0) MANTELPIECE CLOCK
+1F573 FE0E ; text style;  # (7.0) HOLE
+1F573 FE0F ; emoji style; # (7.0) HOLE
+1F574 FE0E ; text style;  # (7.0) MAN IN BUSINESS SUIT LEVITATING
+1F574 FE0F ; emoji style; # (7.0) MAN IN BUSINESS SUIT LEVITATING
+1F575 FE0E ; text style;  # (7.0) SLEUTH OR SPY
+1F575 FE0F ; emoji style; # (7.0) SLEUTH OR SPY
+1F576 FE0E ; text style;  # (7.0) DARK SUNGLASSES
+1F576 FE0F ; emoji style; # (7.0) DARK SUNGLASSES
+1F577 FE0E ; text style;  # (7.0) SPIDER
+1F577 FE0F ; emoji style; # (7.0) SPIDER
+1F578 FE0E ; text style;  # (7.0) SPIDER WEB
+1F578 FE0F ; emoji style; # (7.0) SPIDER WEB
+1F579 FE0E ; text style;  # (7.0) JOYSTICK
+1F579 FE0F ; emoji style; # (7.0) JOYSTICK
+1F587 FE0E ; text style;  # (7.0) LINKED PAPERCLIPS
+1F587 FE0F ; emoji style; # (7.0) LINKED PAPERCLIPS
+1F58A FE0E ; text style;  # (7.0) LOWER LEFT BALLPOINT PEN
+1F58A FE0F ; emoji style; # (7.0) LOWER LEFT BALLPOINT PEN
+1F58B FE0E ; text style;  # (7.0) LOWER LEFT FOUNTAIN PEN
+1F58B FE0F ; emoji style; # (7.0) LOWER LEFT FOUNTAIN PEN
+1F58C FE0E ; text style;  # (7.0) LOWER LEFT PAINTBRUSH
+1F58C FE0F ; emoji style; # (7.0) LOWER LEFT PAINTBRUSH
+1F58D FE0E ; text style;  # (7.0) LOWER LEFT CRAYON
+1F58D FE0F ; emoji style; # (7.0) LOWER LEFT CRAYON
+1F590 FE0E ; text style;  # (7.0) RAISED HAND WITH FINGERS SPLAYED
+1F590 FE0F ; emoji style; # (7.0) RAISED HAND WITH FINGERS SPLAYED
+1F5A5 FE0E ; text style;  # (7.0) DESKTOP COMPUTER
+1F5A5 FE0F ; emoji style; # (7.0) DESKTOP COMPUTER
+1F5A8 FE0E ; text style;  # (7.0) PRINTER
+1F5A8 FE0F ; emoji style; # (7.0) PRINTER
+1F5B1 FE0E ; text style;  # (7.0) THREE BUTTON MOUSE
+1F5B1 FE0F ; emoji style; # (7.0) THREE BUTTON MOUSE
+1F5B2 FE0E ; text style;  # (7.0) TRACKBALL
+1F5B2 FE0F ; emoji style; # (7.0) TRACKBALL
+1F5BC FE0E ; text style;  # (7.0) FRAME WITH PICTURE
+1F5BC FE0F ; emoji style; # (7.0) FRAME WITH PICTURE
+1F5C2 FE0E ; text style;  # (7.0) CARD INDEX DIVIDERS
+1F5C2 FE0F ; emoji style; # (7.0) CARD INDEX DIVIDERS
+1F5C3 FE0E ; text style;  # (7.0) CARD FILE BOX
+1F5C3 FE0F ; emoji style; # (7.0) CARD FILE BOX
+1F5C4 FE0E ; text style;  # (7.0) FILE CABINET
+1F5C4 FE0F ; emoji style; # (7.0) FILE CABINET
+1F5D1 FE0E ; text style;  # (7.0) WASTEBASKET
+1F5D1 FE0F ; emoji style; # (7.0) WASTEBASKET
+1F5D2 FE0E ; text style;  # (7.0) SPIRAL NOTE PAD
+1F5D2 FE0F ; emoji style; # (7.0) SPIRAL NOTE PAD
+1F5D3 FE0E ; text style;  # (7.0) SPIRAL CALENDAR PAD
+1F5D3 FE0F ; emoji style; # (7.0) SPIRAL CALENDAR PAD
+1F5DC FE0E ; text style;  # (7.0) COMPRESSION
+1F5DC FE0F ; emoji style; # (7.0) COMPRESSION
+1F5DD FE0E ; text style;  # (7.0) OLD KEY
+1F5DD FE0F ; emoji style; # (7.0) OLD KEY
+1F5DE FE0E ; text style;  # (7.0) ROLLED-UP NEWSPAPER
+1F5DE FE0F ; emoji style; # (7.0) ROLLED-UP NEWSPAPER
+1F5E1 FE0E ; text style;  # (7.0) DAGGER KNIFE
+1F5E1 FE0F ; emoji style; # (7.0) DAGGER KNIFE
+1F5E3 FE0E ; text style;  # (7.0) SPEAKING HEAD IN SILHOUETTE
+1F5E3 FE0F ; emoji style; # (7.0) SPEAKING HEAD IN SILHOUETTE
+1F5E8 FE0E ; text style;  # (7.0) LEFT SPEECH BUBBLE
+1F5E8 FE0F ; emoji style; # (7.0) LEFT SPEECH BUBBLE
+1F5EF FE0E ; text style;  # (7.0) RIGHT ANGER BUBBLE
+1F5EF FE0F ; emoji style; # (7.0) RIGHT ANGER BUBBLE
+1F5F3 FE0E ; text style;  # (7.0) BALLOT BOX WITH BALLOT
+1F5F3 FE0F ; emoji style; # (7.0) BALLOT BOX WITH BALLOT
+1F5FA FE0E ; text style;  # (7.0) WORLD MAP
+1F5FA FE0F ; emoji style; # (7.0) WORLD MAP
+1F610 FE0E ; text style;  # (6.0) NEUTRAL FACE
+1F610 FE0F ; emoji style; # (6.0) NEUTRAL FACE
+1F687 FE0E ; text style;  # (6.0) METRO
+1F687 FE0F ; emoji style; # (6.0) METRO
+1F68D FE0E ; text style;  # (6.0) ONCOMING BUS
+1F68D FE0F ; emoji style; # (6.0) ONCOMING BUS
+1F691 FE0E ; text style;  # (6.0) AMBULANCE
+1F691 FE0F ; emoji style; # (6.0) AMBULANCE
+1F694 FE0E ; text style;  # (6.0) ONCOMING POLICE CAR
+1F694 FE0F ; emoji style; # (6.0) ONCOMING POLICE CAR
+1F698 FE0E ; text style;  # (6.0) ONCOMING AUTOMOBILE
+1F698 FE0F ; emoji style; # (6.0) ONCOMING AUTOMOBILE
+1F6AD FE0E ; text style;  # (6.0) NO SMOKING SYMBOL
+1F6AD FE0F ; emoji style; # (6.0) NO SMOKING SYMBOL
+1F6B2 FE0E ; text style;  # (6.0) BICYCLE
+1F6B2 FE0F ; emoji style; # (6.0) BICYCLE
+1F6B9 FE0E ; text style;  # (6.0) MENS SYMBOL
+1F6B9 FE0F ; emoji style; # (6.0) MENS SYMBOL
+1F6BA FE0E ; text style;  # (6.0) WOMENS SYMBOL
+1F6BA FE0F ; emoji style; # (6.0) WOMENS SYMBOL
+1F6BC FE0E ; text style;  # (6.0) BABY SYMBOL
+1F6BC FE0F ; emoji style; # (6.0) BABY SYMBOL
+1F6CB FE0E ; text style;  # (7.0) COUCH AND LAMP
+1F6CB FE0F ; emoji style; # (7.0) COUCH AND LAMP
+1F6CD FE0E ; text style;  # (7.0) SHOPPING BAGS
+1F6CD FE0F ; emoji style; # (7.0) SHOPPING BAGS
+1F6CE FE0E ; text style;  # (7.0) BELLHOP BELL
+1F6CE FE0F ; emoji style; # (7.0) BELLHOP BELL
+1F6CF FE0E ; text style;  # (7.0) BED
+1F6CF FE0F ; emoji style; # (7.0) BED
+1F6E0 FE0E ; text style;  # (7.0) HAMMER AND WRENCH
+1F6E0 FE0F ; emoji style; # (7.0) HAMMER AND WRENCH
+1F6E1 FE0E ; text style;  # (7.0) SHIELD
+1F6E1 FE0F ; emoji style; # (7.0) SHIELD
+1F6E2 FE0E ; text style;  # (7.0) OIL DRUM
+1F6E2 FE0F ; emoji style; # (7.0) OIL DRUM
+1F6E3 FE0E ; text style;  # (7.0) MOTORWAY
+1F6E3 FE0F ; emoji style; # (7.0) MOTORWAY
+1F6E4 FE0E ; text style;  # (7.0) RAILWAY TRACK
+1F6E4 FE0F ; emoji style; # (7.0) RAILWAY TRACK
+1F6E5 FE0E ; text style;  # (7.0) MOTOR BOAT
+1F6E5 FE0F ; emoji style; # (7.0) MOTOR BOAT
+1F6E9 FE0E ; text style;  # (7.0) SMALL AIRPLANE
+1F6E9 FE0F ; emoji style; # (7.0) SMALL AIRPLANE
+1F6F0 FE0E ; text style;  # (7.0) SATELLITE
+1F6F0 FE0F ; emoji style; # (7.0) SATELLITE
+1F6F3 FE0E ; text style;  # (7.0) PASSENGER SHIP
+1F6F3 FE0F ; emoji style; # (7.0) PASSENGER SHIP
+
+#Total sequences: 354
+
+#EOF
diff --git a/admin/unidata/emoji-zwj.awk b/admin/unidata/emoji-zwj.awk
index 7d2ff6cb900..c2ee3f2118e 100644
--- a/admin/unidata/emoji-zwj.awk
+++ b/admin/unidata/emoji-zwj.awk
@@ -60,35 +60,25 @@
     vec[elts[1]] = vec[elts[1]] "\""
 }
 
+# The following codepoints may or may not be emoji, but they are part
+# of emoji sequences.  We have code in font.c:font_range that will try
+# to display them with the emoji font anyway.
+/^[0-9A-F]+ FE0F *; emoji style;/ {
+    sub(/ *FE0F .*/, "", $0)
+    trigger_codepoints[$0] = $0
+}
+
 END {
      print ";;; emoji-zwj.el --- emoji zwj character composition table  -*- lexical-binding:t -*-"
      print ";;; Automatically generated from admin/unidata/emoji-{zwj-,}sequences.txt"
      print "(eval-when-compile (require 'regexp-opt))"
 
-     # The following codepoints are not emoji, but they are part of
-     # emoji sequences.  We have code in font.c:font_range that will
-     # try to display them with the emoji font anyway.
-
-     trigger_codepoints[1] = "261D"
-     trigger_codepoints[2] = "26F9"
-     trigger_codepoints[3] = "270C"
-     trigger_codepoints[4] = "270D"
-     trigger_codepoints[5] = "2764"
-     trigger_codepoints[6] = "1F3CB"
-     trigger_codepoints[7] = "1F3CC"
-     trigger_codepoints[8] = "1F3F3"
-     trigger_codepoints[9] = "1F3F4"
-     trigger_codepoints[10] = "1F441"
-     trigger_codepoints[11] = "1F574"
-     trigger_codepoints[12] = "1F575"
-     trigger_codepoints[13] = "1F590"
-
      printf "(setq auto-composition-emoji-eligible-codepoints\n"
      printf "'("
 
      for (trig in trigger_codepoints)
      {
-         printf("\n?\\N{U+%s}", trigger_codepoints[trig])
+         printf("\n?\\N{U+%s}", trig)
      }
      printf "\n))\n\n"
 
@@ -97,9 +87,8 @@ END {
 
      for (trig in trigger_codepoints)
      {
-         codepoint = trigger_codepoints[trig]
-         c = sprintf("\\N{U+%s}", codepoint)
-         vec[codepoint] = vec[codepoint] "\n\"" c "\\N{U+FE0F}\""
+         c = sprintf("\\N{U+%s}", trig)
+         vec[trig] = vec[trig] "\n\"" c "\\N{U+FE0F}\""
      }
 
      print "(dolist (elt `("
-- 
2.40.1
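
For readers who do not read awk, the new rule in emoji-zwj.awk can be sketched roughly as below. This is only an illustration in Python, not part of the patch: the function name `trigger_codepoints` and the sample list are hypothetical, and the input lines are assumed to look like the entries of emoji-variation-sequences.txt (without the leading "+" of the diff).

```python
import re

# Mirrors the awk pattern /^[0-9A-F]+ FE0F *; emoji style;/ from the patch.
EMOJI_STYLE = re.compile(r"^([0-9A-F]+) FE0F *; emoji style;")


def trigger_codepoints(lines):
    """Return the base codepoints that have an FE0F (emoji style) entry."""
    found = set()
    for line in lines:
        m = EMOJI_STYLE.match(line)
        if m:
            # Keep only the base codepoint, as the awk sub() call does.
            found.add(m.group(1))
    # Sort numerically so the result is deterministic.
    return sorted(found, key=lambda cp: int(cp, 16))


sample = [
    "1F4BF FE0E ; text style;  # (6.0) OPTICAL DISC",
    "1F4BF FE0F ; emoji style; # (6.0) OPTICAL DISC",
    "1F590 FE0E ; text style;  # (7.0) RAISED HAND WITH FINGERS SPLAYED",
    "1F590 FE0F ; emoji style; # (7.0) RAISED HAND WITH FINGERS SPLAYED",
    "#Total sequences: 354",
]
print(trigger_codepoints(sample))
```

Only the "emoji style" lines contribute, so each base codepoint is collected once even though it appears on two lines of the data file.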



* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-26  3:18 bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate Steven Allen
@ 2023-05-26  6:41 ` Eli Zaretskii
  2023-05-26  8:34   ` Robert Pluim
  2023-05-26 15:06   ` Steven Allen
  0 siblings, 2 replies; 61+ messages in thread
From: Eli Zaretskii @ 2023-05-26  6:41 UTC (permalink / raw)
  To: Steven Allen, Robert Pluim; +Cc: 63731

> From: Steven Allen <steven@stebalien.com>
> Date: Thu, 25 May 2023 20:18:02 -0700
> 
> This patch imports the full list from unicode.org instead of
> special-casing a few characters as was done previously.
> 
> With this patch, '👍️' (1F44D FE0F) should look the same as '👍' (1F44D).
> Without it, it will look like '👍‌️'.
> 
> As a simple regression test, '✔' (2714) should still display as "text" while '✔️'
> (2714 FE0F) should still display as an emoji.
> 
> Fixes https://github.com/alphapapa/ement.el/issues/137
> 
> NOTE: I'm not a Unicode expert, nor do I understand how Emacs handles
> Unicode (beyond what was required to implement this patch). But this
> patch appears to work and I can't find any regressions.

AFAIU, this change will populate composition-function-table for many
"normal" characters, including ASCII digits and symbol/punctuation
characters from the 0x2xxx blocks.  E.g., after you build Emacs with
this patch, what do the following evaluations yield:

  M-: (aref composition-function-table ?0) RET
  M-: (aref composition-function-table #x2122) RET

If they yield non-nil values, it could mean dramatic slowdown of
redisplay with these characters.  Which is precisely what we wanted to
avoid when we made the decision which parts of the Unicode-defined
Emoji sequences to support in Emacs, and how to arrange for that
support to work.

The issue you cite is strange: according to the "C-u C-x =" display
there, Emacs did compose #x1f44d with VS-16 using the Noto Color Emoji
font, so I don't quite understand why VS-16 is then also shown as an
empty rectangle.  On my system Noto Color Emoji doesn't work, and "C-u
C-x =" says this instead:

  Composed with the following character(s) "️" using this font:
    harfbuzz:-outline-Noto Emoji-regular-normal-normal-mono-15-*-*-*-c-*-iso10646-1
  by these glyphs:
    [0 1 128077 422 19 2 17 14 2 nil]
    [0 1 65039 3 19 0 1 0 1 [0 0 0]]
  with these character(s):
    ️ (#xfe0f) VARIATION SELECTOR-16

which explains why I see two glyphs and not one.  But in the display
shown in the above issue, I see

  Composed with the following character(s) "️" using this font:
    ftcrhb:-GOOG-Noto Color Emoji-regular-normal-normal-*-18-*-*-*-m-0-iso10646-1
  by these glyphs:
    [0 1 128077 569 22 0 23 17 5 [0 0 136]]
  with these character(s):
    ️ (#xfe0f) VARIATION SELECTOR-16

which describes only one glyph, not two.  So the result ought to be
what you expect.
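
As a side note, the two glyph dumps above can be compared mechanically. The sketch below is in Python purely for illustration; it assumes the glyph vector layout given in the `composition-get-gstring` doc string, [FROM-IDX TO-IDX C CODE WIDTH LBEARING RBEARING ASCENT DESCENT ADJUSTMENT], where the third field is the character. The helper name `glyph_chars` is hypothetical.

```python
def glyph_chars(glyphs):
    """Return the character (field C, index 2) of each glyph as U+XXXX."""
    return [f"U+{g[2]:04X}" for g in glyphs]


# Glyphs reported with the Noto Emoji font (nil written as None).
noto_emoji = [
    [0, 1, 128077, 422, 19, 2, 17, 14, 2, None],
    [0, 1, 65039, 3, 19, 0, 1, 0, 1, [0, 0, 0]],
]
# Glyphs reported with the Noto Color Emoji font.
noto_color_emoji = [
    [0, 1, 128077, 569, 22, 0, 23, 17, 5, [0, 0, 136]],
]

print(glyph_chars(noto_emoji))        # two glyphs: the base and VS-16
print(glyph_chars(noto_color_emoji))  # one glyph: the base only
```

Decimal 128077 is #x1f44d and 65039 is #xfe0f, so the first dump keeps a separate glyph for VS-16 while the second does not.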

Robert, what am I missing here?






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-26  6:41 ` Eli Zaretskii
@ 2023-05-26  8:34   ` Robert Pluim
  2023-05-26  8:46     ` Eli Zaretskii
  2023-05-26 15:06   ` Steven Allen
  1 sibling, 1 reply; 61+ messages in thread
From: Robert Pluim @ 2023-05-26  8:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 63731, Steven Allen


Disclaimer: I havenʼt looked at the patch yet

>>>>> On Fri, 26 May 2023 09:41:42 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> From: Steven Allen <steven@stebalien.com>
    >> Date: Thu, 25 May 2023 20:18:02 -0700
    >> 
    >> This patch imports the full list from unicode.org instead of
    >> special-casing a few characters as was done previously.
    >> 
    >> With this patch, '👍️' (1F44D FE0F) should look the same as '👍' (1F44D).
    >> Without it, it will look like '👍‌️'.
    >> 
    >> As a simple regression test, '✔' (2714) should still display as "text" while '✔️'
    >> (2714 FE0F) should still display as an emoji.
    >> 
    >> Fixes https://github.com/alphapapa/ement.el/issues/137
    >> 
    >> NOTE: I'm not a Unicode expert, nor do I understand how Emacs handles
    >> Unicode (beyond what was required to implement this patch). But this
    >> patch appears to work and I can't find any regressions.

    Eli> AFAIU, this change will populate composition-function-table for many
    Eli> "normal" characters, including ASCII digits and symbol/punctuation
    Eli> characters from the 0x2xxx blocks.  E.g., after you build Emacs with
    Eli> this patch, what do the following evaluations yield:

    Eli>   M-: (aref composition-function-table ?0) RET
    Eli>   M-: (aref composition-function-table #x2122) RET

    Eli> If they yield non-nil values, it could mean dramatic slowdown of
    Eli> redisplay with these characters.  Which is precisely what we wanted to
    Eli> avoid when we made the decision which parts of the Unicode-defined
    Eli> Emoji sequences to support in Emacs, and how to arrange for that
    Eli> support to work.

Yes. We donʼt want to do composition checks for ASCII if we can avoid it.

    Eli> The issue you cite is strange: according to the "C-u C-x =" display
    Eli> there, Emacs did compose #x1f44d with VS-16 using the Noto Color Emoji
    Eli> font, so I don't quite understand why VS-16 is then also shown as an
    Eli> empty rectangle.  On my system Noto Color Emoji doesn't work, and "C-u
    Eli> C-x =" says this instead:

    Eli>   Composed with the following character(s) "️" using this font:
    Eli>     harfbuzz:-outline-Noto Emoji-regular-normal-normal-mono-15-*-*-*-c-*-iso10646-1
    Eli>   by these glyphs:
    Eli>     [0 1 128077 422 19 2 17 14 2 nil]
    Eli>     [0 1 65039 3 19 0 1 0 1 [0 0 0]]
    Eli>   with these character(s):
    Eli>     ️ (#xfe0f) VARIATION SELECTOR-16

    Eli> which explains why I see two glyphs and not 1.  But in the display
    Eli> shown in the above issue, I see

    Eli>   Composed with the following character(s) "️" using this font:
    Eli>     ftcrhb:-GOOG-Noto Color Emoji-regular-normal-normal-*-18-*-*-*-m-0-iso10646-1
    Eli>   by these glyphs:
    Eli>     [0 1 128077 569 22 0 23 17 5 [0 0 136]]
    Eli>   with these character(s):
    Eli>     ️ (#xfe0f) VARIATION SELECTOR-16

    Eli> which describes only one glyph, not two.  So the result ought to be
    Eli> what you expect.

I see the emoji followed by a blank box with Noto Color Emoji here. I
donʼt yet understand why.

    Eli> Robert, what am I missing here?

1F44D FE0F is a valid sequence according to tr51

(aref composition-function-table #x1f44d)
=> (["\\(?:👍[🏻-🏿]\\)" 0 compose-gstring-for-graphic])

which means that the composition is being triggered by this entry:

(aref composition-function-table #xfe0f)
=> (["\\c.\\c^+" 1 compose-gstring-for-graphic] [nil 0 compose-gstring-for-graphic])

(time passes)

Ugh. The following fixes it for me:

diff --git a/lisp/composite.el b/lisp/composite.el
index fb8b76114f4..af86d1436d3 100644
--- a/lisp/composite.el
+++ b/lisp/composite.el
@@ -756,7 +756,7 @@ compose-gstring-for-dotted-circle
 ;; Allow for bootstrapping without uni-*.el.
 (when unicode-category-table
   (let ((elt `([,(purecopy "\\c.\\c^+") 1 compose-gstring-for-graphic]
-	       [nil 0 compose-gstring-for-graphic])))
+	       )))
     (map-char-table
      #'(lambda (key val)
 	 (if (memq val '(Mn Mc Me))

Although the following is less invasive:

diff --git a/lisp/composite.el b/lisp/composite.el
index fb8b76114f4..333428f008a 100644
--- a/lisp/composite.el
+++ b/lisp/composite.el
@@ -762,6 +762,11 @@ compose-gstring-for-dotted-circle
 	 (if (memq val '(Mn Mc Me))
 	     (set-char-table-range composition-function-table key elt)))
      unicode-category-table))
+  ;; for Emoji presentation selector
+  (set-char-table-range
+   composition-function-table
+   #xFE0F
+    `([,(purecopy "\\c.\ufe0f") 1 compose-gstring-for-graphic]))
   ;; for dotted-circle
   (aset composition-function-table #x25CC
 	`([,(purecopy ".\\c^") 0 compose-gstring-for-dotted-circle]))

Didnʼt we conclude that composition had some issues with multiple
entries for the same codepoint if there was a mix of forward- and
backward-looking regexps?

Robert
-- 






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-26  8:34   ` Robert Pluim
@ 2023-05-26  8:46     ` Eli Zaretskii
  2023-05-26 11:14       ` Robert Pluim
  0 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2023-05-26  8:46 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 63731, steven

> From: Robert Pluim <rpluim@gmail.com>
> Cc: Steven Allen <steven@stebalien.com>,  63731@debbugs.gnu.org
> Date: Fri, 26 May 2023 10:34:02 +0200
> 
> Ugh. The following fixes it for me:
> 
> diff --git a/lisp/composite.el b/lisp/composite.el
> index fb8b76114f4..af86d1436d3 100644
> --- a/lisp/composite.el
> +++ b/lisp/composite.el
> @@ -756,7 +756,7 @@ compose-gstring-for-dotted-circle
>  ;; Allow for bootstrapping without uni-*.el.
>  (when unicode-category-table
>    (let ((elt `([,(purecopy "\\c.\\c^+") 1 compose-gstring-for-graphic]
> -	       [nil 0 compose-gstring-for-graphic])))
> +	       )))

This is unacceptable, AFAIU.  We cannot drop support for (or change)
the correct display of mark characters, can we?

> Although the following is less invasive:
> 
> diff --git a/lisp/composite.el b/lisp/composite.el
> index fb8b76114f4..333428f008a 100644
> --- a/lisp/composite.el
> +++ b/lisp/composite.el
> @@ -762,6 +762,11 @@ compose-gstring-for-dotted-circle
>  	 (if (memq val '(Mn Mc Me))
>  	     (set-char-table-range composition-function-table key elt)))
>       unicode-category-table))
> +  ;; for Emoji presentation selector
> +  (set-char-table-range
> +   composition-function-table
> +   #xFE0F
> +    `([,(purecopy "\\c.\ufe0f") 1 compose-gstring-for-graphic]))
>    ;; for dotted-circle
>    (aset composition-function-table #x25CC
>  	`([,(purecopy ".\\c^") 0 compose-gstring-for-dotted-circle]))

Can you please explain why the current setup doesn't work in this
case, even though "C-u C-x =" says the composition was done?  And how
the above patch fixes that?

> Didnʼt we conclude that composition had some issues with multiple
> entries for the same codepoint if there was a mix of forward- and
> backward-looking regexps?

Not sure I understand what this alludes to.  What mix of
forward- and backward-looking regexps do you see?






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-26  8:46     ` Eli Zaretskii
@ 2023-05-26 11:14       ` Robert Pluim
  2023-05-26 12:06         ` Eli Zaretskii
  0 siblings, 1 reply; 61+ messages in thread
From: Robert Pluim @ 2023-05-26 11:14 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 63731, steven

>>>>> On Fri, 26 May 2023 11:46:05 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> From: Robert Pluim <rpluim@gmail.com>
    >> Cc: Steven Allen <steven@stebalien.com>,  63731@debbugs.gnu.org
    >> Date: Fri, 26 May 2023 10:34:02 +0200
    >> 
    >> Ugh. The following fixes it for me:
    >> 
    >> diff --git a/lisp/composite.el b/lisp/composite.el
    >> index fb8b76114f4..af86d1436d3 100644
    >> --- a/lisp/composite.el
    >> +++ b/lisp/composite.el
    >> @@ -756,7 +756,7 @@ compose-gstring-for-dotted-circle
    >> ;; Allow for bootstrapping without uni-*.el.
    >> (when unicode-category-table
    >> (let ((elt `([,(purecopy "\\c.\\c^+") 1 compose-gstring-for-graphic]
    >> -	       [nil 0 compose-gstring-for-graphic])))
    >> +	       )))

    Eli> This is unacceptable, AFAIU.  We cannot drop support for (or change)
    Eli> the correct display of mark characters, can we?

Right. Iʼll hold off pushing it 😃

    >> Although the following is less invasive:
    >> 
    >> diff --git a/lisp/composite.el b/lisp/composite.el
    >> index fb8b76114f4..333428f008a 100644
    >> --- a/lisp/composite.el
    >> +++ b/lisp/composite.el
    >> @@ -762,6 +762,11 @@ compose-gstring-for-dotted-circle
    >> (if (memq val '(Mn Mc Me))
    >> (set-char-table-range composition-function-table key elt)))
    >> unicode-category-table))
    >> +  ;; for Emoji presentation selector
    >> +  (set-char-table-range
    >> +   composition-function-table
    >> +   #xFE0F
    >> +    `([,(purecopy "\\c.\ufe0f") 1 compose-gstring-for-graphic]))
    >> ;; for dotted-circle
    >> (aset composition-function-table #x25CC
    >> `([,(purecopy ".\\c^") 0 compose-gstring-for-dotted-circle]))

    Eli> Can you please explain why the current setup doesn't work in this
    Eli> case, even though "C-u C-x =" says the composition was done?  And how
    Eli> the above patch fixes that?

Composition is done for 1f44d+fe0f, but I suspect that with the current
setup, composition is called again for FE0F, which results in the box
glyph. With the second patch we will only do backward-looking composition
for FE0F.
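
To make "backward-looking" concrete: each rule in composition-function-table has the form [PATTERN PREV-CHARS FUNC], and PREV-CHARS says how many characters before the triggering character the match should start. The sketch below models only that part, in Python for illustration. The names `rule_matches` and `BASE_THEN_VS16` are hypothetical, and the pattern is a rough stand-in for "\\c.\ufe0f" (the real "\\c." is a character-category test, which Python's `re` does not have).

```python
import re

# Rough stand-in for the Emacs regexp "\\c.\ufe0f": any character that is
# not itself VS-16, followed by VS-16.
BASE_THEN_VS16 = "[^\ufe0f]\ufe0f"


def rule_matches(text, trigger_index, pattern, prev_chars):
    """Try PATTERN starting PREV_CHARS characters before the trigger."""
    start = trigger_index - prev_chars
    if start < 0:
        return None  # not enough preceding text to look back
    m = re.compile(pattern).match(text, start)
    return m.group(0) if m else None


text = "\U0001F44D\ufe0f"  # thumbs up followed by VS-16
# The rule is keyed on FE0F (index 1) and looks back one character.
print(rule_matches(text, 1, BASE_THEN_VS16, 1) == text)
```

With PREV-CHARS set to 1 the base character and the selector are matched as one unit, which is the behaviour the second patch relies on. A lone FE0F with nothing before it does not match this rule at all.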

    >> Didnʼt we conclude that composition had some issues with multiple
    >> entries for the same codepoint if there was a mix of forward- and
    >> backward-looking regexps?

    Eli> Not sure I understand what this alludes to.  What mix of
    Eli> forward- and backward-looking regexps do you see?

Youʼre right, thereʼs no forward looking regexp, only a backwards one
and a no-regexp. But itʼs undeniable that:

 [nil 0 compose-gstring-for-graphic]

causes the issue. Iʼve never been clear on the semantics of that.

Robert
-- 






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-26 11:14       ` Robert Pluim
@ 2023-05-26 12:06         ` Eli Zaretskii
  2023-05-26 14:02           ` Robert Pluim
  0 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2023-05-26 12:06 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 63731, steven

> From: Robert Pluim <rpluim@gmail.com>
> Cc: steven@stebalien.com,  63731@debbugs.gnu.org
> Date: Fri, 26 May 2023 13:14:27 +0200
> 
>     >> Although the following is less invasive:
>     >> 
>     >> diff --git a/lisp/composite.el b/lisp/composite.el
>     >> index fb8b76114f4..333428f008a 100644
>     >> --- a/lisp/composite.el
>     >> +++ b/lisp/composite.el
>     >> @@ -762,6 +762,11 @@ compose-gstring-for-dotted-circle
>     >> (if (memq val '(Mn Mc Me))
>     >> (set-char-table-range composition-function-table key elt)))
>     >> unicode-category-table))
>     >> +  ;; for Emoji presentation selector
>     >> +  (set-char-table-range
>     >> +   composition-function-table
>     >> +   #xFE0F
>     >> +    `([,(purecopy "\\c.\ufe0f") 1 compose-gstring-for-graphic]))
>     >> ;; for dotted-circle
>     >> (aset composition-function-table #x25CC
>     >> `([,(purecopy ".\\c^") 0 compose-gstring-for-dotted-circle]))
> 
>     Eli> Can you please explain why the current setup doesn't work in this
>     Eli> case, even though "C-u C-x =" says the composition was done?  And how
>     Eli> the above patch fixes that?
> 
> Composition is done for 1f44d+fe0f, but I suspect that with the current
> setup, composition is called again for FE0F, which results in the box
> glyph. With the second patch we will only do backward-looking composition
> for FE0F.

OK, then I think we should install this on the emacs-29 branch.

> Youʼre right, thereʼs no forward looking regexp, only a backwards one
> and a no-regexp. But itʼs undeniable that:
> 
>  [nil 0 compose-gstring-for-graphic]
> 
> causes the issue. Iʼve never been clear on the semantics of that.

It has special support in compose-gstring-for-graphic, see there.  The
doc string also says a few words about that.  We use this, e.g., in
describe-char display, where we sometimes need to show a single
combining character with no base character to combine it with.  I
think this is only relevant for accents and other such combining
characters, not for VS-n.

What does this issue mean for the other VS-n characters, though?
Should we perhaps install something similar for them as well?

Thanks.






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-26 12:06         ` Eli Zaretskii
@ 2023-05-26 14:02           ` Robert Pluim
  2023-05-26 14:55             ` Eli Zaretskii
  0 siblings, 1 reply; 61+ messages in thread
From: Robert Pluim @ 2023-05-26 14:02 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 63731, steven

>>>>> On Fri, 26 May 2023 15:06:40 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> From: Robert Pluim <rpluim@gmail.com>
    >> 
    >> Composition is done for 1f44d+fe0f, but I suspect that with the current
    >> setup, composition is called again for FE0F, which results in the box
    >> glyph. With the second patch we will only do backward-looking composition
    >> for FE0F.

    Eli> OK, then I think we should install this on the emacs-29 branch.

    >> Youʼre right, thereʼs no forward looking regexp, only a backwards one
    >> and a no-regexp. But itʼs undeniable that:
    >> 
    >> [nil 0 compose-gstring-for-graphic]
    >> 
    >> causes the issue. Iʼve never been clear on the semantics of that.

    Eli> It has special support in compose-gstring-for-graphic, see there.  The
    Eli> doc string also says a few words about that.  We use this, e.g., in
    Eli> describe-char display, where we sometimes need to show a single
    Eli> combining character with no base character to combine it with.  I
    Eli> think this is only relevant for accents and other such combining
    Eli> characters, not for VS-n.

OK

    Eli> What does this issue mean for the other VS-n characters, though?
    Eli> Should we perhaps install something similar for them as well?

For VS-15 maybe? The following gets me text-presentation composition
with CHAR+FE0E and emoji-presentation with CHAR+FE0F

diff --git a/lisp/composite.el b/lisp/composite.el
index fb8b76114f4..ada35010146 100644
--- a/lisp/composite.el
+++ b/lisp/composite.el
@@ -762,6 +762,11 @@ compose-gstring-for-dotted-circle
 	 (if (memq val '(Mn Mc Me))
 	     (set-char-table-range composition-function-table key elt)))
      unicode-category-table))
+  ;; for Emoji presentation selector
+  (set-char-table-range
+   composition-function-table
+   '(#xFE0E . #xFE0F)
+    `([,(purecopy "\\c.[\ufe0f\ufe0e]") 1 compose-gstring-for-graphic]))
   ;; for dotted-circle
   (aset composition-function-table #x25CC
 	`([,(purecopy ".\\c^") 0 compose-gstring-for-dotted-circle]))
@@ -861,7 +866,7 @@ compose-gstring-for-variation-glyph
 ;; handled in font_range, we end up choosing the Emoji presentation
 ;; rather than the Text presentation.
 (let ((elt '([".." 1 compose-gstring-for-variation-glyph])))
-  (set-char-table-range composition-function-table '(#xFE00 . #xFE0E) elt)
+  (set-char-table-range composition-function-table '(#xFE00 . #xFE0D) elt)
   (set-char-table-range composition-function-table '(#xE0100 . #xE01EF) elt))
 
 (defun auto-compose-chars (func from to font-object string direction)

although perhaps we could have both `compose-gstring-for-graphic' and
`compose-gstring-for-variation-glyph' for FE0E.

Robert
-- 






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-26 14:02           ` Robert Pluim
@ 2023-05-26 14:55             ` Eli Zaretskii
  2023-05-26 15:25               ` Robert Pluim
  0 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2023-05-26 14:55 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 63731, steven

> From: Robert Pluim <rpluim@gmail.com>
> Cc: steven@stebalien.com,  63731@debbugs.gnu.org
> Date: Fri, 26 May 2023 16:02:40 +0200
> 
>     Eli> What does this issue mean for the other VS-n characters, though?
>     Eli> Should we perhaps install something similar for them as well?
> 
> For VS-15 maybe? The following gets me text-presentation composition
> with CHAR+FE0E and emoji-presentation with CHAR+FE0F

Actually, I forgot about compose-gstring-for-variation-glyph.  My
question was actually whether the general setting in

  (let ((elt `([,(purecopy "\\c.\\c^+") 1 compose-gstring-for-graphic]
	       [nil 0 compose-gstring-for-graphic])))
    (map-char-table
     #'(lambda (key val)
	 (if (memq val '(Mn Mc Me))
	     (set-char-table-range composition-function-table key elt)))
     unicode-category-table))

affects also the VS-n selectors.  But since the latter setting of

  (let ((elt '([".." 1 compose-gstring-for-variation-glyph])))
    (set-char-table-range composition-function-table '(#xFE00 . #xFE0E) elt)
    (set-char-table-range composition-function-table '(#xE0100 . #xE01EF) elt))

takes care of all the VS-n selectors except VS-16, and your patch now
will take care of VS-16, it sounds like we don't need to care about
other VS-n selectors?

Or are you saying that without including VS-15, CHAR+FE0E is not
displayed using its text representation?

Did you test the proposed change with the admin/emoji-*.txt files, to
make sure they all still display OK?






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-26  6:41 ` Eli Zaretskii
  2023-05-26  8:34   ` Robert Pluim
@ 2023-05-26 15:06   ` Steven Allen
  2023-05-26 15:29     ` Robert Pluim
  1 sibling, 1 reply; 61+ messages in thread
From: Steven Allen @ 2023-05-26 15:06 UTC (permalink / raw)
  To: Eli Zaretskii, Robert Pluim; +Cc: 63731


Eli Zaretskii <eliz@gnu.org> writes:
> AFAIU, this change will populate composition-function-table for many
> "normal" characters, including ASCII digits and symbol/punctuation
> characters from the 0x2xxx blocks.  E.g., after you build Emacs with
> this patch, what do the following evaluations yield:
>
>   M-: (aref composition-function-table ?0) RET
>   M-: (aref composition-function-table #x2122) RET
>
> If they yield non-nil values, it could mean dramatic slowdown of
> redisplay with these characters.

Both of these yield nil with this patch applied (and I haven't noticed
any performance regressions). But it looks like you and Robert have a
better patch so I'll leave you to it.
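
For anyone repeating this check, a minimal sketch follows.  The list of
codepoints is only an illustration: ?0 and #x2122 are the ones Eli asked
about, and FE0E, FE0F and 20E3 are included as an assumption because
they come up elsewhere in this thread.

```elisp
;; Print the composition-function-table entry for a few codepoints.
;; A nil entry means no composition rules are installed for that
;; character.
(dolist (ch (list ?0 #x2122 #xFE0E #xFE0F #x20E3))
  (message "U+%04X -> %S" ch (aref composition-function-table ch)))
```

Evaluate it in *scratch* and read the results in *Messages*.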

However, I'd like to draw your attention to the existing hard-coded
VS-16 table here:

https://git.savannah.gnu.org/cgit/emacs.git/tree/admin/unidata/emoji-zwj.awk?h=4b3de748b0b04407d2492500c77905de56de1180#n72

It feels like this should either be the full table (the one in the
patch) or it shouldn't exist at all. But again, I'm not the expert here.






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-26 14:55             ` Eli Zaretskii
@ 2023-05-26 15:25               ` Robert Pluim
  2023-05-26 15:52                 ` Eli Zaretskii
  0 siblings, 1 reply; 61+ messages in thread
From: Robert Pluim @ 2023-05-26 15:25 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 63731, steven

>>>>> On Fri, 26 May 2023 17:55:26 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> From: Robert Pluim <rpluim@gmail.com>
    >> Cc: steven@stebalien.com,  63731@debbugs.gnu.org
    >> Date: Fri, 26 May 2023 16:02:40 +0200
    >> 
    Eli> What does this issue mean for the other VS-n characters, though?
    Eli> Should we perhaps install something similar for them as well?
    >> 
    >> For VS-15 maybe? The following gets me text-presentation composition
    >> with CHAR+FE0E and emoji-presentation with CHAR+FE0F

    Eli> Actually, I forgot about compose-gstring-for-variation-glyph.  My
    Eli> question was actually whether the general setting in

    Eli>   (let ((elt `([,(purecopy "\\c.\\c^+") 1 compose-gstring-for-graphic]
    Eli> 	       [nil 0 compose-gstring-for-graphic])))
    Eli>     (map-char-table
    Eli>      #'(lambda (key val)
    Eli> 	 (if (memq val '(Mn Mc Me))
    Eli> 	     (set-char-table-range composition-function-table key elt)))
    Eli>      unicode-category-table))

    Eli> affects also the VS-n selectors.  But since the latter setting of

    Eli>   (let ((elt '([".." 1 compose-gstring-for-variation-glyph])))
    Eli>     (set-char-table-range composition-function-table '(#xFE00 . #xFE0E) elt)
    Eli>     (set-char-table-range composition-function-table '(#xE0100 . #xE01EF) elt))

    Eli> takes care of all the VS-n selectors except VS-16, and your patch now
    Eli> will take care of VS-16, it sounds like we don't need to care about
    Eli> other VS-n selectors?

    Eli> Or are you saying that without including VS-15, CHAR+FE0E is not
    Eli> displayed using its text representation?

Not quite. If I donʼt have compose-gstring-for-graphic for VS-15, no
composition occurs for CHAR+FE0E. With my change youʼll get
composition, but itʼs still not 100% correct: CHAR+FE0E when CHAR is a
member of the emoji script will use emoji presentation, not text, but
the extra empty box will not show, so itʼs still an improvement.

    Eli> Did you test the proposed change with the admin/emoji-*.txt files, to
    Eli> make sure they all still display OK?

Yes. Iʼve also got a change that makes Emoji_Keycap_Sequence work, but
I think we can leave that for master.

Robert
-- 






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-26 15:06   ` Steven Allen
@ 2023-05-26 15:29     ` Robert Pluim
  2023-05-26 16:03       ` Steven Allen
  0 siblings, 1 reply; 61+ messages in thread
From: Robert Pluim @ 2023-05-26 15:29 UTC (permalink / raw)
  To: Steven Allen; +Cc: Eli Zaretskii, 63731

>>>>> On Fri, 26 May 2023 08:06:11 -0700, Steven Allen <steven@stebalien.com> said:

    Steven> Eli Zaretskii <eliz@gnu.org> writes:
    >> AFAIU, this change will populate composition-function-table for many
    >> "normal" characters, including ASCII digits and symbol/punctuation
    >> characters from the 0x2xxx blocks.  E.g., after you build Emacs with
    >> this patch, what do the following evaluations yield:
    >> 
    >> M-: (aref composition-function-table ?0) RET
    >> M-: (aref composition-function-table #x2122) RET
    >> 
    >> If they yield non-nil values, it could mean dramatic slowdown of
    >> redisplay with these characters.

    Steven> Both of these yield nil with this patch applied (and I haven't noticed
    Steven> any performance regressions). But it looks like you and Robert have a
    Steven> better patch so I'll leave you to it.

Itʼs smaller, thatʼs for sure. And it will definitely be faster.

    Steven> However, I'd like to draw your attention to the existing hard-coded
    Steven> VS-16 table here:

    Steven> https://git.savannah.gnu.org/cgit/emacs.git/tree/admin/unidata/emoji-zwj.awk?h=4b3de748b0b04407d2492500c77905de56de1180#n72

    Steven> It feels like this should either be the full table (the one in the
    Steven> patch) or it shouldn't exist at all. But again, I'm not the expert here.

Welcome to the wonderful world of Unicode. The reason the table exists
is that there are codepoints that are *not* emoji, but theyʼre part of
emoji sequences, so we still need to treat them as emoji in some
situations. Why Unicode didnʼt just make them emoji I donʼt know.

Robert
-- 






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-26 15:25               ` Robert Pluim
@ 2023-05-26 15:52                 ` Eli Zaretskii
  2023-05-26 16:24                   ` Robert Pluim
  0 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2023-05-26 15:52 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 63731, steven

> From: Robert Pluim <rpluim@gmail.com>
> Cc: steven@stebalien.com,  63731@debbugs.gnu.org
> Date: Fri, 26 May 2023 17:25:24 +0200
> 
>     Eli> Or are you saying that without including VS-15, CHAR+FE0E is not
>     Eli> displayed using its text representation?
> 
> Not quite. If I donʼt have compose-gstring-for-graphic for VS-15, no
> composition occurs for CHAR+FE0E. With my change youʼll get
> composition, but itʼs still not 100% correct: CHAR+FE0E when CHAR is a
> member of the emoji script will use emoji presentation, not text, but
> the extra empty box will not show, so itʼs still an improvement.

OK.  And what about CHAR+FE0E when CHAR is not an Emoji?

Anyway, I think you should install the patch on emacs-29, and we
should then try to fix the text-representation bug with VS-15 on
master.  (I guess it requires a change to font.c or something?)

>     Eli> Did you test the proposed change with the admin/emoji-*.txt files, to
>     Eli> make sure they all still display OK?
> 
> Yes. Iʼve also got a change that makes Emoji_Keycap_Sequence work, but
> I think we can leave that for master.

Depends on the solution, I guess.  Isn't it just a change to the
VS-16's entry in composition-function-table?  Or maybe a change in the
#x20e3's entry?  (Did we discuss the Emoji_Keycap_Sequence case before?)






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-26 15:29     ` Robert Pluim
@ 2023-05-26 16:03       ` Steven Allen
  0 siblings, 0 replies; 61+ messages in thread
From: Steven Allen @ 2023-05-26 16:03 UTC (permalink / raw)
  To: Robert Pluim; +Cc: Eli Zaretskii, 63731


Robert Pluim <rpluim@gmail.com> writes:
> Welcome to the wonderful world of Unicode. The reason the table exists
> is that there are codepoints that are *not* emoji, but theyʼre part of
> emoji sequences, so we still need to treat them as emoji in some
> situations. Why Unicode didnʼt just make them emoji I donʼt know.

Got it... It sounds like the "correct" solution is to download the full
list (emoji-variation-sequences.txt) and filter for non-emoji
characters, but I guess that's overkill.

Thanks!






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-26 15:52                 ` Eli Zaretskii
@ 2023-05-26 16:24                   ` Robert Pluim
  2023-05-26 17:27                     ` Eli Zaretskii
  0 siblings, 1 reply; 61+ messages in thread
From: Robert Pluim @ 2023-05-26 16:24 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 63731, steven

>>>>> On Fri, 26 May 2023 18:52:22 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> From: Robert Pluim <rpluim@gmail.com>
    >> Cc: steven@stebalien.com,  63731@debbugs.gnu.org
    >> Date: Fri, 26 May 2023 17:25:24 +0200
    >> 
    Eli> Or are you saying that without including VS-15, CHAR+FE0E is not
    Eli> displayed using its text representation?
    >> 
    >> Not quite. If I donʼt have compose-gstring-for-graphic for VS-15, no
    >> composition occurs for CHAR+FE0E. With my change youʼll get
    >> composition, but itʼs still not 100% correct: CHAR+FE0E when CHAR is a
    >> member of the emoji script will use emoji presentation, not text, but
    >> the extra empty box will not show, so itʼs still an improvement.

    Eli> OK.  And what about CHAR+FE0E when CHAR is not an Emoji?

Then you get the (composed) text presentation (and the composed emoji
presentation when itʼs CHAR+FE0F).

    Eli> Anyway, I think you should install the patch on emacs-29, and we
    Eli> should then try to fix the text-representation bug with VS-15 on
    Eli> master.  (I guess it requires a change to font.c or something?)

It requires something that answers the question "what font would we
use for this codepoint if it was not an emoji?". Maybe we can have a
separate fontset that pretends that the emoji script is equivalent to
symbol? Or invent some kind of 'text-presentation-font' property to
put somewhere?

    Eli> Did you test the proposed change with the admin/emoji-*.txt files, to
    Eli> make sure they all still display OK?
    >> 
    >> Yes. Iʼve also got a change that makes Emoji_Keycap_Sequence work, but
    >> I think we can leave that for master.

    Eli> Depends on the solution, I guess.  Isn't it just a change to the
    Eli> VS-16's entry in composition-function-table?  Or maybe a change in the
    Eli> #x20e3's entry?  (Did we discuss the Emoji_Keycap_Sequence case before?)

Itʼs a change to the VS-16 entry. We did discuss it before, and
decided to put it aside because the solutions all involved adding
composition-function-table entries for 0-9 or similar. I donʼt
remember why we didnʼt consider adding to VS-16ʼs entry.

Iʼll do some more testing, and post a final version hopefully this
weekend sometime.

Robert
-- 






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-26 16:24                   ` Robert Pluim
@ 2023-05-26 17:27                     ` Eli Zaretskii
  2023-05-26 17:35                       ` Robert Pluim
                                         ` (2 more replies)
  0 siblings, 3 replies; 61+ messages in thread
From: Eli Zaretskii @ 2023-05-26 17:27 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 63731, steven

> From: Robert Pluim <rpluim@gmail.com>
> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
> Date: Fri, 26 May 2023 18:24:02 +0200
> 
> >>>>> On Fri, 26 May 2023 18:52:22 +0300, Eli Zaretskii <eliz@gnu.org> said:
> 
>     Eli> Anyway, I think you should install the patch on emacs-29, and we
>     Eli> should then try to fix the text-representation bug with VS-15 on
>     Eli> master.  (I guess it requires a change to font.c or something?)
> 
> It requires something that answers the question "what font would we
> use for this codepoint if it was not an emoji?". Maybe we can have a
> separate fontset that pretends that the emoji script is equivalent to
> symbol? Or invent some kind of 'text-presentation-font' property to
> put somewhere?

I'm not sure I understand why we don't select the right font by
default.  Selecting a non-Emoji font for non-Emoji codepoints should
not need any special tricks.

>     >> Yes. Iʼve also got a change that makes Emoji_Keycap_Sequence work, but
>     >> I think we can leave that for master.
> 
>     Eli> Depends on the solution, I guess.  Isn't it just a change to the
>     Eli> VS-16's entry in composition-function-table?  Or maybe a change in the
>     Eli> #x20e3's entry?  (Did we discuss the Emoji_Keycap_Sequence case before?)
> 
> Itʼs a change to the VS-16 entry. We did discuss it before, and
> decided to put it aside because the solutions all involved adding
> composition-function-table entries for 0-9 or similar. I donʼt
> remember why we didnʼt consider adding to VS-16ʼs entry.
> 
> Iʼll do some more testing, and post a final version hopefully this
> weekend sometime.

OK, thanks.






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-26 17:27                     ` Eli Zaretskii
@ 2023-05-26 17:35                       ` Robert Pluim
  2023-05-26 18:05                         ` Eli Zaretskii
  2023-05-26 17:43                       ` Eli Zaretskii
  2023-05-28 11:57                       ` Robert Pluim
  2 siblings, 1 reply; 61+ messages in thread
From: Robert Pluim @ 2023-05-26 17:35 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 63731, steven

>>>>> On Fri, 26 May 2023 20:27:26 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> From: Robert Pluim <rpluim@gmail.com>
    >> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
    >> Date: Fri, 26 May 2023 18:24:02 +0200
    >> 
    >> >>>>> On Fri, 26 May 2023 18:52:22 +0300, Eli Zaretskii <eliz@gnu.org> said:
    >> 
    Eli> Anyway, I think you should install the patch on emacs-29, and we
    Eli> should then try to fix the text-representation bug with VS-15 on
    Eli> master.  (I guess it requires a change to font.c or something?)
    >> 
    >> It requires something that answers the question "what font would we
    >> use for this codepoint if it was not an emoji?". Maybe we can have a
    >> separate fontset that pretends that the emoji script is equivalent to
    >> symbol? Or invent some kind of 'text-presentation-font' property to
    >> put somewhere?

    Eli> I'm not sure I understand why we don't select the right font by
    Eli> default.  Selecting a non-Emoji font for non-Emoji codepoints should
    Eli> not need any special tricks.

It doesnʼt, but in this case it *is* an emoji codepoint, so it displays
as emoji because of font.c, even when followed by VS-15.

Robert
-- 






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-26 17:27                     ` Eli Zaretskii
  2023-05-26 17:35                       ` Robert Pluim
@ 2023-05-26 17:43                       ` Eli Zaretskii
  2023-05-28 10:29                         ` Robert Pluim
  2023-05-28 11:57                       ` Robert Pluim
  2 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2023-05-26 17:43 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rpluim, 63731, steven

> Cc: 63731@debbugs.gnu.org, steven@stebalien.com
> Date: Fri, 26 May 2023 20:27:26 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> 
> > From: Robert Pluim <rpluim@gmail.com>
> > Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
> > Date: Fri, 26 May 2023 18:24:02 +0200
> > 
> > It requires something that answers the question "what font would we
> > use for this codepoint if it was not an emoji?". Maybe we can have a
> > separate fontset that pretends that the emoji script is equivalent to
> > symbol? Or invent some kind of 'text-presentation-font' property to
> > put somewhere?
> 
> I'm not sure I understand why we don't select the right font by
> default.  Selecting a non-Emoji font for non-Emoji codepoints should
> not need any special tricks.

Actually, I don't understand why there's an issue here with font
selection.  Are you saying that using Noto Color Emoji with
CHAR+0xFE0E, when CHAR is an Emoji character, doesn't produce the
textual representation of CHAR?  If so, isn't that a problem with the
font?  I thought all we needed to do was to hand the combination to an
Emoji-aware font, and the font would do the rest.  Now you seem to be
saying that we somehow need to select a non-Emoji font?  But if so,
who'd guarantee that a font that cannot display Emoji will know what
to do with the combination CHAR+0xFE0E?






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-26 17:35                       ` Robert Pluim
@ 2023-05-26 18:05                         ` Eli Zaretskii
  2023-05-28 11:43                           ` Robert Pluim
  0 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2023-05-26 18:05 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 63731, steven

> From: Robert Pluim <rpluim@gmail.com>
> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
> Date: Fri, 26 May 2023 19:35:56 +0200
> 
> in this case it *is* an emoji codepoint, so it displays
> as emoji because of font.c, even when followed by VS-15.

If we pass to an Emoji-capable font a sequence of a character followed
by VS-15, I'd expect the font to produce a glyph with the textual
representation of that character.






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-26 17:43                       ` Eli Zaretskii
@ 2023-05-28 10:29                         ` Robert Pluim
  2023-05-28 12:37                           ` Eli Zaretskii
  0 siblings, 1 reply; 61+ messages in thread
From: Robert Pluim @ 2023-05-28 10:29 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 63731, steven

>>>>> On Fri, 26 May 2023 20:43:37 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> Cc: 63731@debbugs.gnu.org, steven@stebalien.com
    >> Date: Fri, 26 May 2023 20:27:26 +0300
    >> From: Eli Zaretskii <eliz@gnu.org>
    >> 
    >> > From: Robert Pluim <rpluim@gmail.com>
    >> > Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
    >> > Date: Fri, 26 May 2023 18:24:02 +0200
    >> > 
    >> > It requires something that answers the question "what font would we
    >> > use for this codepoint if it was not an emoji?". Maybe we can have a
    >> > separate fontset that pretends that the emoji script is equivalent to
    >> > symbol? Or invent some kind of 'text-presentation-font' property to
    >> > put somewhere?
    >> 
    >> I'm not sure I understand why we don't select the right font by
    >> default.  Selecting a non-Emoji font for non-Emoji codepoints should
    >> not need any special tricks.

    Eli> Actually, I don't understand why there's an issue here with font
    Eli> selection.  Are you saying that using Noto Color Emoji with
    Eli> CHAR+0xFE0E, when CHAR is an Emoji character, doesn't produce the
    Eli> textual representation of CHAR?  If so, isn't that a problem with the
    Eli> font?  I thought all we needed to do was to hand the combination to an
    Eli> Emoji-aware font, and the font would do the rest.  Now you seem to be
    Eli> saying that we somehow need to select a non-Emoji font?  But if so,
    Eli> who'd guarantee that a font that cannot display Emoji will know what
    Eli> to do with the combination CHAR+0xFE0E?

Iʼm not sure: gedit displays the text representation, and libreoffice
displays the emoji presentation. And the Google color emoji website
only shows colour glyphs. So I think itʼs up to the application to
select the correct font.

Robert
-- 






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-26 18:05                         ` Eli Zaretskii
@ 2023-05-28 11:43                           ` Robert Pluim
  2023-05-28 12:44                             ` Eli Zaretskii
  0 siblings, 1 reply; 61+ messages in thread
From: Robert Pluim @ 2023-05-28 11:43 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 63731, steven

>>>>> On Fri, 26 May 2023 21:05:47 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> From: Robert Pluim <rpluim@gmail.com>
    >> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
    >> Date: Fri, 26 May 2023 19:35:56 +0200
    >> 
    >> in this case it *is* an emoji codepoint, so it displays
    >> as emoji because of font.c, even when followed by VS-15.

    Eli> If we pass to an Emoji-capable font a sequence of a character followed
    Eli> by VS-15, I'd expect the font to produce a glyph with the textual
    Eli> representation of that character.

But we donʼt do that: we ask the font "give me a glyph for this codepoint".

Robert
-- 






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-26 17:27                     ` Eli Zaretskii
  2023-05-26 17:35                       ` Robert Pluim
  2023-05-26 17:43                       ` Eli Zaretskii
@ 2023-05-28 11:57                       ` Robert Pluim
  2023-05-28 12:47                         ` Eli Zaretskii
  2 siblings, 1 reply; 61+ messages in thread
From: Robert Pluim @ 2023-05-28 11:57 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 63731, steven

>>>>> On Fri, 26 May 2023 20:27:26 +0300, Eli Zaretskii <eliz@gnu.org> said:
    >> 
    >> Itʼs a change to the VS-16 entry. We did discuss it before, and
    >> decided to put it aside because the solutions all involved adding
    >> composition-function-table entries for 0-9 or similar. I donʼt
    >> remember why we didnʼt consider adding to VS-16ʼs entry.
    >> 
    >> Iʼll do some more testing, and post a final version hopefully this
    >> weekend sometime.

    Eli> OK, thanks.

Eli, if the 20e3 changes are too much for emacs-29, I can put them in
master.

Iʼll put some notes in admin/notes/unicode as well.

diff --git c/admin/unidata/emoji-zwj.awk i/admin/unidata/emoji-zwj.awk
index 7d2ff6cb900..0b6f1267205 100644
--- c/admin/unidata/emoji-zwj.awk
+++ i/admin/unidata/emoji-zwj.awk
@@ -82,6 +82,7 @@ END {
      trigger_codepoints[11] = "1F574"
      trigger_codepoints[12] = "1F575"
      trigger_codepoints[13] = "1F590"
+     trigger_codepoints[14] = "20E3"
 
      printf "(setq auto-composition-emoji-eligible-codepoints\n"
      printf "'("
diff --git c/lisp/composite.el i/lisp/composite.el
index fb8b76114f4..acba4e73c17 100644
--- c/lisp/composite.el
+++ i/lisp/composite.el
@@ -762,6 +762,23 @@ compose-gstring-for-dotted-circle
 	 (if (memq val '(Mn Mc Me))
 	     (set-char-table-range composition-function-table key elt)))
      unicode-category-table))
+  ;; for Emoji presentation selector
+  ;; We don't want the generic nil 0 entry because it causes display
+  ;; of an extra box for FE0F.  (Bug#63731)
+  ;; This also covers the fully-qualified enclosing keycap case.
+  (set-char-table-range
+   composition-function-table
+   #xFE0E
+   `([,(purecopy "\\c.\ufe0e") 1 compose-gstring-for-graphic]))
+  (set-char-table-range
+   composition-function-table
+   #xFE0F
+   `([,(purecopy "\\c.\ufe0f\u20e3?") 1 compose-gstring-for-graphic]))
+  ;; for unqualified enclosing keycap
+  (set-char-table-range
+   composition-function-table
+   #x20E3
+   `([,(purecopy "[#*0-9]\u20e3") 1 compose-gstring-for-graphic]))
   ;; for dotted-circle
   (aset composition-function-table #x25CC
 	`([,(purecopy ".\\c^") 0 compose-gstring-for-dotted-circle]))
@@ -857,11 +874,10 @@ compose-gstring-for-variation-glyph
 ;; taken care of by font_range in font.c, which will check for an
 ;; emoji font for codepoints used in compositions even if they're not
 ;; emoji themselves, and thus choose the Emoji presentation for them
-;; when followed by VS-16.  VS-15 *is* handled here, because if it's
-;; handled in font_range, we end up choosing the Emoji presentation
-;; rather than the Text presentation.
+;; when followed by VS-16.  VS-15 is handled by the setup around
+;; unicode-category-table above.
 (let ((elt '([".." 1 compose-gstring-for-variation-glyph])))
-  (set-char-table-range composition-function-table '(#xFE00 . #xFE0E) elt)
+  (set-char-table-range composition-function-table '(#xFE00 . #xFE0D) elt)
   (set-char-table-range composition-function-table '(#xE0100 . #xE01EF) elt))
 
 (defun auto-compose-chars (func from to font-object string direction)



Robert
-- 






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-28 10:29                         ` Robert Pluim
@ 2023-05-28 12:37                           ` Eli Zaretskii
  0 siblings, 0 replies; 61+ messages in thread
From: Eli Zaretskii @ 2023-05-28 12:37 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 63731, steven

> From: Robert Pluim <rpluim@gmail.com>
> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
> Date: Sun, 28 May 2023 12:29:48 +0200
> 
> >>>>> On Fri, 26 May 2023 20:43:37 +0300, Eli Zaretskii <eliz@gnu.org> said:
> 
>     Eli> Actually, I don't understand why there's an issue here with font
>     Eli> selection.  Are you saying that using Noto Color Emoji with
>     Eli> CHAR+0xFE0E, when CHAR is an Emoji character, doesn't produce the
>     Eli> textual representation of CHAR?  If so, isn't that a problem with the
>     Eli> font?  I thought all we needed to do was to hand the combination to an
>     Eli> Emoji-aware font, and the font would do the rest.  Now you seem to be
>     Eli> saying that we somehow need to select a non-Emoji font?  But if so,
>     Eli> who'd guarantee that a font that cannot display Emoji will know what
>     Eli> to do with the combination CHAR+0xFE0E?
> 
> Iʼm not sure: gedit displays the text representation, and libreoffice
> displays the emoji presentation. And the Google color emoji website
> only shows colour glyphs. So I think itʼs up to the application to
> select the correct font.

But what is "the correct font", when the sequence of codepoints is
CHAR+0xFE0E?  How do we identify such a font?  Do you know of a font
that produces the correct glyph for this sequence, when HarfBuzz is
used as the shaping engine?






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-28 11:43                           ` Robert Pluim
@ 2023-05-28 12:44                             ` Eli Zaretskii
  0 siblings, 0 replies; 61+ messages in thread
From: Eli Zaretskii @ 2023-05-28 12:44 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 63731, steven

> From: Robert Pluim <rpluim@gmail.com>
> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
> Date: Sun, 28 May 2023 13:43:13 +0200
> 
> >>>>> On Fri, 26 May 2023 21:05:47 +0300, Eli Zaretskii <eliz@gnu.org> said:
> 
>     >> From: Robert Pluim <rpluim@gmail.com>
>     >> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
>     >> Date: Fri, 26 May 2023 19:35:56 +0200
>     >> 
>     >> in this case it *is* an emoji codepoint, so it displays
>     >> as emoji because of font.c, even when followed by VS-15.
> 
>     Eli> If we pass to an Emoji-capable font a sequence of a character followed
>     Eli> by VS-15, I'd expect the font to produce a glyph with the textual
>     Eli> representation of that character.
> 
> But we donʼt do that: we ask the font "give me a glyph for this codepoint".

Is that because of the composition-function-table's entry for VS-15?
Maybe we should augment that, then?






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-28 11:57                       ` Robert Pluim
@ 2023-05-28 12:47                         ` Eli Zaretskii
  2023-05-29 10:44                           ` Robert Pluim
  0 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2023-05-28 12:47 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 63731, steven

> From: Robert Pluim <rpluim@gmail.com>
> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
> Date: Sun, 28 May 2023 13:57:49 +0200
> 
> Eli, if the 20e3 changes are too much for emacs-29, I can put them in
> master.

Yeah, I think it should go to master for now.

Otherwise, LGTM, thanks.






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-28 12:47                         ` Eli Zaretskii
@ 2023-05-29 10:44                           ` Robert Pluim
  2023-05-29 13:58                             ` Eli Zaretskii
  0 siblings, 1 reply; 61+ messages in thread
From: Robert Pluim @ 2023-05-29 10:44 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 63731, steven

>>>>> On Sun, 28 May 2023 15:47:11 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> From: Robert Pluim <rpluim@gmail.com>
    >> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
    >> Date: Sun, 28 May 2023 13:57:49 +0200
    >> 
    >> Eli, if the 20e3 changes are too much for emacs-29, I can put them in
    >> master.

    Eli> Yeah, I think it should go to master for now.

I pushed the doc changes, but not the code changes, because I now
think theyʼre papering over a deeper bug (which weʼve noticed before,
but didnʼt fix then).

In all these cases, consider the sequence U+1F44D U+FE0F

- emacs-29:

    Displays as colour emoji, followed by an empty box

- emacs-29 with the following change in composite.el:

      (set-char-table-range
       composition-function-table
       #xFE0F
       `([,(purecopy "\\c.\ufe0f") 1 compose-gstring-for-graphic]))

    Displays as colour emoji. Much rejoicing. If I follow my own
    advice, and customize `glyphless-char-display-control' to show
    hex-boxes for variation selectors, you then see that in actual
    fact, we are still displaying the FE0F, but since it uses
    thin-space by default, it wasnʼt obvious. Much sadness.
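
    One way to make the selector visible, as a minimal sketch.  It
    assumes that `variation-selectors' is the group name accepted by
    `glyphless-char-display-control' in this Emacs version; check the
    variable's docstring if the call signals an error.

```elisp
;; Show variation selectors as hex boxes instead of thin spaces.
;; Going through `customize-set-variable' runs the option's :set
;; function rather than only changing the variable's value.
(let ((control (copy-alist glyphless-char-display-control)))
  (setf (alist-get 'variation-selectors control) 'hex-code)
  (customize-set-variable 'glyphless-char-display-control control))
```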

    C-u C-x =:

                  display: composed to form "👍️" (see below)

    Composed with the following character(s) "️" using this font:
      ftcrhb:-GOOG-Noto Color Emoji-regular-normal-normal-*-13-*-*-*-m-0-iso10646-1
    by these glyphs:
      [0 1 128077 569 16 0 17 13 4 nil]
    with these character(s):
      ️ (#xfe0f) VARIATION SELECTOR-16

Now I notice (via emoji-variation-sequences.txt) that this is only
happening for the following codepoints:

   U+1F408
   U+1F415
   U+1F426
   U+1F446
   U+1F447
   U+1F448
   U+1F449
   U+1F44D
   U+1F44E

And if I look in lisp/international/emoji-zwj.el, I find:

(#x1F44D .
,(eval-when-compile (regexp-opt
'(
"\N{U+1F44D}\N{U+1F3FB}"
"\N{U+1F44D}\N{U+1F3FC}"
"\N{U+1F44D}\N{U+1F3FD}"
"\N{U+1F44D}\N{U+1F3FE}"
"\N{U+1F44D}\N{U+1F3FF}"
))))

If I add

"\N{U+1F44D}\N{U+FE0F}"

to that, and undo the composite.el change, then everything is
fine. Hurrah! This means that the

`([,(purecopy "\\c.\\c^+") 1 compose-gstring-for-graphic]
	       [nil 0 compose-gstring-for-graphic])

is not doing the right thing for this case.

I can change the emoji-zwj.awk script to add CHAR+FE0F for all emoji,
unless someone knows how to fix composition to do the right thing
here.

(there are similar issues with CHAR+FE0E)

Robert
-- 





^ permalink raw reply	[flat|nested] 61+ messages in thread

* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-29 10:44                           ` Robert Pluim
@ 2023-05-29 13:58                             ` Eli Zaretskii
  2023-05-29 14:43                               ` Robert Pluim
  0 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2023-05-29 13:58 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 63731, steven

> From: Robert Pluim <rpluim@gmail.com>
> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
> Date: Mon, 29 May 2023 12:44:58 +0200
> 
> In all these cases, consider the sequence U+1F44D U+FE0F
> 
> - emacs-29:
> 
>     Displays as colour emoji, followed by an empty box
> 
> - emacs-29 with the following change in composite.el:
> 
>       (set-char-table-range
>        composition-function-table
>        #xFE0F
>        `([,(purecopy "\\c.\ufe0f") 1 compose-gstring-for-graphic]))
> 
>     Displays as colour emoji. Much rejoicing. If I follow my own
>     advice, and customize `glyphless-char-display-control' to show
>     hex-boxes for variation selectors, you then see that in actual
>     fact, we are still displaying the FE0F, but since it uses
>     thin-space by default, it wasnʼt obvious. Much sadness.
> 
>     C-u C-x =:
> 
>                   display: composed to form "👍️" (see below)

This is not what I see.  I didn't use the above set-char-table-range
expression literally, but instead started "emacs -Q", and then
evaluated in *scratch*:

      (set-char-table-range
       composition-function-table
       #xFE0F
       '(["\\c.\ufe0f" 1 compose-gstring-for-graphic]))

After that, the sequence U+1F44D U+FE0F displays as a single glyph,
and there's no thin space after it.  What am I missing?  Is this
somehow specific to ftcrhb font driver or something?

> Now I notice (via emoji-variation-sequences.txt), that this is only
> happening for the following codepoints.
> 
>    U+1F408
>    U+1F415
>    U+1F426
>    U+1F446
>    U+1F447
>    U+1F448
>    U+1F449
>    U+1F44D
>    U+1F44E
> 
> And if I look in lisp/international/emoji-zwj.el, I find:
> 
> (#x1F44D .
> ,(eval-when-compile (regexp-opt
> '(
> "\N{U+1F44D}\N{U+1F3FB}"
> "\N{U+1F44D}\N{U+1F3FC}"
> "\N{U+1F44D}\N{U+1F3FD}"
> "\N{U+1F44D}\N{U+1F3FE}"
> "\N{U+1F44D}\N{U+1F3FF}"
> ))))
> 
> If I add
> 
> "\N{U+1F44D}\N{U+FE0F}"
> 
> to that, and undo the composite.el change, then everything is
> fine. Hurrah! This means that the
> 
> `([,(purecopy "\\c.\\c^+") 1 compose-gstring-for-graphic]
> 	       [nil 0 compose-gstring-for-graphic])
> 
> is not doing the right thing for this case.

You are saying that the entry in composition-function-table for
U+1F44D (and other similar characters) is used in preference to the
entry for U+FE0F that follows it, even though there's no U+1F3FB
etc. after it to "steal" the composition?  Did you try stepping
through composite.c to see whether and why this is the case?

> I can change the emoji-zwj.awk script to add CHAR+FE0F for all emoji,
> unless someone knows how to fix composition to do the right thing
> here.

I think we need first to understand the issue at hand better.  There's
more here than meets the eye, I think.





^ permalink raw reply	[flat|nested] 61+ messages in thread

* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-29 13:58                             ` Eli Zaretskii
@ 2023-05-29 14:43                               ` Robert Pluim
  2023-05-29 14:55                                 ` Eli Zaretskii
  0 siblings, 1 reply; 61+ messages in thread
From: Robert Pluim @ 2023-05-29 14:43 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 63731, steven

>>>>> On Mon, 29 May 2023 16:58:43 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> display: composed to form "👍️" (see below)

    Eli> This is not what I see.  I didn't use the above set-char-table-range
    Eli> expression literally, but instead started "emacs -Q", and then
    Eli> evaluated in *scratch*:

    Eli>       (set-char-table-range
    Eli>        composition-function-table
    Eli>        #xFE0F
    Eli>        '(["\\c.\ufe0f" 1 compose-gstring-for-graphic]))

    Eli> After that, the sequence U+1F44D U+FE0F displays as a single glyph,
    Eli> and there's no thin space after it.  What am I missing?  Is this
    Eli> somehow specific to ftcrhb font driver or something?

Itʼs a single glyph, but that glyph contains a thin-space. I used this
to check: the second 'a' is slightly offset.

👍️a
👍a

This persists if I disable harfbuzz, and it behaves the same on macOS

    Eli> You are saying that the entry in composition-function-table for
    Eli> U+1F44D (and other similar characters) is used in preference to the
    Eli> entry for U+FE0F that follows it, even though there's no U+1F3FB
    Eli> etc. after it to "steal" the composition?  Did you try stepping
    Eli> through composite.c to see whether and why this is the case?

Right. It looks like the FE0F entry is ignored. Iʼve not ventured into
composite.c yet.

    >> I can change the emoji-zwj.awk script to add CHAR+FE0F for all emoji,
    >> unless someone knows how to fix composition to do the right thing
    >> here.

    Eli> I think we need first to understand the issue at hand better.  There's
    Eli> more here than meets the eye, I think.

Absolutely

Robert
-- 





^ permalink raw reply	[flat|nested] 61+ messages in thread

* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-29 14:43                               ` Robert Pluim
@ 2023-05-29 14:55                                 ` Eli Zaretskii
  2023-05-29 16:13                                   ` Robert Pluim
  0 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2023-05-29 14:55 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 63731, steven

> From: Robert Pluim <rpluim@gmail.com>
> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
> Date: Mon, 29 May 2023 16:43:00 +0200
> 
> >>>>> On Mon, 29 May 2023 16:58:43 +0300, Eli Zaretskii <eliz@gnu.org> said:
> 
>     >> display: composed to form "👍️" (see below)
> 
>     Eli> This is not what I see.  I didn't use the above set-char-table-range
>     Eli> expression literally, but instead started "emacs -Q", and then
>     Eli> evaluated in *scratch*:
> 
>     Eli>       (set-char-table-range
>     Eli>        composition-function-table
>     Eli>        #xFE0F
>     Eli>        '(["\\c.\ufe0f" 1 compose-gstring-for-graphic]))
> 
>     Eli> After that, the sequence U+1F44D U+FE0F displays as a single glyph,
>     Eli> and there's no thin space after it.  What am I missing?  Is this
>     Eli> somehow specific to ftcrhb font driver or something?
> 
> Itʼs a single glyph, but that glyph contains a thin-space. I used this
> to check, the second 'a' is slightly offset
> 
> 👍️a
> 👍a

That's because the first one shows two glyphs that are
"pseudo-composed": not by the font, but by our hand-made "composition"
in compose-gstring-for-graphic.  Try this instead:

      (set-char-table-range
       composition-function-table
       #xFE0F
       '(["\\c.\ufe0f" 1 font-shape-gstring]))

so that we only see a composition if the font indeed agrees to
compose.  What do you see?





^ permalink raw reply	[flat|nested] 61+ messages in thread

* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-29 14:55                                 ` Eli Zaretskii
@ 2023-05-29 16:13                                   ` Robert Pluim
  2023-05-29 17:18                                     ` Eli Zaretskii
  0 siblings, 1 reply; 61+ messages in thread
From: Robert Pluim @ 2023-05-29 16:13 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 63731, steven

>>>>> On Mon, 29 May 2023 17:55:49 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> From: Robert Pluim <rpluim@gmail.com>
    >> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
    >> Date: Mon, 29 May 2023 16:43:00 +0200
    >> 
    >> >>>>> On Mon, 29 May 2023 16:58:43 +0300, Eli Zaretskii <eliz@gnu.org> said:
    >> 
    >> >> display: composed to form "👍️" (see below)
    >> 
    Eli> This is not what I see.  I didn't use the above set-char-table-range
    Eli> expression literally, but instead started "emacs -Q", and then
    Eli> evaluated in *scratch*:
    >> 
    Eli> (set-char-table-range
    Eli> composition-function-table
    Eli> #xFE0F
    Eli> '(["\\c.\ufe0f" 1 compose-gstring-for-graphic]))
    >> 
    Eli> After that, the sequence U+1F44D U+FE0F displays as a single glyph,
    Eli> and there's no thin space after it.  What am I missing?  Is this
    Eli> somehow specific to ftcrhb font driver or something?
    >> 
    >> Itʼs a single glyph, but that glyph contains a thin-space. I used this
    >> to check, the second 'a' is slightly offset
    >> 
    >> 👍️a
    >> 👍a

    Eli> That's because the first one shows two glyphs that are
    Eli> "pseudo-composed": not by the font, but by our hand-made "composition"
    Eli> in compose-gstring-for-graphic.  Try this instead:

    Eli>       (set-char-table-range
    Eli>        composition-function-table
    Eli>        #xFE0F
    Eli>        '(["\\c.\ufe0f" 1 font-shape-gstring]))

    Eli> so that we only see a composition if the font indeed agrees to
    Eli> compose.  What do you see?

It still displays a single glyph with a thin-space. If I customize
`glyphless-char-display-control' to display hex codes for VS, then it
displays a hex box.

So I guess that means weʼre not composing?

Robert
-- 





^ permalink raw reply	[flat|nested] 61+ messages in thread

* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-29 16:13                                   ` Robert Pluim
@ 2023-05-29 17:18                                     ` Eli Zaretskii
  2023-05-30  7:25                                       ` Robert Pluim
  0 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2023-05-29 17:18 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 63731, steven

> From: Robert Pluim <rpluim@gmail.com>
> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
> Date: Mon, 29 May 2023 18:13:14 +0200
> 
> >>>>> On Mon, 29 May 2023 17:55:49 +0300, Eli Zaretskii <eliz@gnu.org> said:
> 
>     Eli> That's because the first one shows two glyphs that are
>     Eli> "pseudo-composed": not by the font, but by our hand-made "composition"
>     Eli> in compose-gstring-for-graphic.  Try this instead:
> 
>     Eli>       (set-char-table-range
>     Eli>        composition-function-table
>     Eli>        #xFE0F
>     Eli>        '(["\\c.\ufe0f" 1 font-shape-gstring]))
> 
>     Eli> so that we only see a composition if the font indeed agrees to
>     Eli> compose.  What do you see?
> 
> It still displays a single glyph with a thin-space. If I customize
> `glyphless-char-display-control' to display hex codes for VS, then it
> displays a hex box.
> 
> So I guess that means weʼre not composing?

What does "C-u C-x =" say in this case?





^ permalink raw reply	[flat|nested] 61+ messages in thread

* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-29 17:18                                     ` Eli Zaretskii
@ 2023-05-30  7:25                                       ` Robert Pluim
  2023-05-30 12:10                                         ` Eli Zaretskii
  0 siblings, 1 reply; 61+ messages in thread
From: Robert Pluim @ 2023-05-30  7:25 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 63731, steven

>>>>> On Mon, 29 May 2023 20:18:41 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> From: Robert Pluim <rpluim@gmail.com>
    >> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
    >> Date: Mon, 29 May 2023 18:13:14 +0200
    >> 
    >> >>>>> On Mon, 29 May 2023 17:55:49 +0300, Eli Zaretskii <eliz@gnu.org> said:
    >> 
    Eli> That's because the first one shows two glyphs that are
    Eli> "pseudo-composed": not by the font, but by our hand-made "composition"
    Eli> in compose-gstring-for-graphic.  Try this instead:
    >> 
    Eli> (set-char-table-range
    Eli> composition-function-table
    Eli> #xFE0F
    Eli> '(["\\c.\ufe0f" 1 font-shape-gstring]))
    >> 
    Eli> so that we only see a composition if the font indeed agrees to
    Eli> compose.  What do you see?
    >> 
    >> It still displays a single glyph with a thin-space. If I customize
    >> `glyphless-char-display-control' to display hex codes for VS, then it
    >> displays a hex box.
    >> 
    >> So I guess that means weʼre not composing?

    Eli> What does "C-u C-x =" say in this case?

It claims itʼs composed:

             position: 146 of 251 (58%), column: 0
            character: 👍 (displayed as 👍) (codepoint 128077, #o372115, #x1f44d)
              charset: unicode (Unicode (ISO10646))
code point in charset: 0x1F44D
               script: emoji
               syntax: w 	which means: word
             category: .:Base
             to input: type "C-x 8 RET 1f44d" or "C-x 8 RET THUMBS UP SIGN"
          buffer code: #xF0 #x9F #x91 #x8D
            file code: #xF0 #x9F #x91 #x8D (encoded by coding system utf-8-unix)
              display: composed to form "👍️" (see below)

Composed with the following character(s) "️" using this font:
  ftcrhb:-GOOG-Noto Color Emoji-regular-normal-normal-*-13-*-*-*-m-0-iso10646-1
by these glyphs:
  [0 1 128077 569 16 0 17 13 4 nil]
with these character(s):
  ️ (#xfe0f) VARIATION SELECTOR-16

Robert
-- 





^ permalink raw reply	[flat|nested] 61+ messages in thread

* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-30  7:25                                       ` Robert Pluim
@ 2023-05-30 12:10                                         ` Eli Zaretskii
  2023-05-30 13:30                                           ` Robert Pluim
  0 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2023-05-30 12:10 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 63731, steven

> From: Robert Pluim <rpluim@gmail.com>
> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
> Date: Tue, 30 May 2023 09:25:52 +0200
> 
> >>>>> On Mon, 29 May 2023 20:18:41 +0300, Eli Zaretskii <eliz@gnu.org> said:
> 
>     Eli> (set-char-table-range
>     Eli> composition-function-table
>     Eli> #xFE0F
>     Eli> '(["\\c.\ufe0f" 1 font-shape-gstring]))
>     >> 
>     Eli> so that we only see a composition if the font indeed agrees to
>     Eli> compose.  What do you see?
>     >> 
>     >> It still displays a single glyph with a thin-space. If I customize
>     >> `glyphless-char-display-control' to display hex codes for VS, then it
>     >> displays a hex box.
>     >> 
>     >> So I guess that means weʼre not composing?
> 
>     Eli> What does "C-u C-x =" say in this case?
> 
> It claims itʼs composed:
> 
>              position: 146 of 251 (58%), column: 0
>             character: 👍 (displayed as 👍) (codepoint 128077, #o372115, #x1f44d)
>               charset: unicode (Unicode (ISO10646))
> code point in charset: 0x1F44D
>                script: emoji
>                syntax: w 	which means: word
>              category: .:Base
>              to input: type "C-x 8 RET 1f44d" or "C-x 8 RET THUMBS UP SIGN"
>           buffer code: #xF0 #x9F #x91 #x8D
>             file code: #xF0 #x9F #x91 #x8D (encoded by coding system utf-8-unix)
>               display: composed to form "👍️" (see below)
> 
> Composed with the following character(s) "️" using this font:
>   ftcrhb:-GOOG-Noto Color Emoji-regular-normal-normal-*-13-*-*-*-m-0-iso10646-1
> by these glyphs:
>   [0 1 128077 569 16 0 17 13 4 nil]
> with these character(s):
>   ️ (#xfe0f) VARIATION SELECTOR-16

Which means it _is_ composed.  Moreover, with Noto Color Emoji we get
a single glyph.  On my system, I have Noto Emoji, from which I get two
glyphs:

  [0 1 128077 422 17 1 15 12 2 nil]
  [0 1 65039 3 17 0 1 0 1 [0 0 0]]

(in which case I can understand why the second one is displayed as a
hex box if I customize glyphless-char-display-control).

So, given that this is the case, why is this wrong, again?  If the
font and the shaper produce two glyphs, or one glyph that looks like
two, why should we think it's an Emacs problem?

(I verified that Emacs 28 shows the same, so this is not a recent
regression.)
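
The glyph vectors quoted above can be read with the lglyph accessors
from composite.el.  A small sketch follows; `my/describe-lglyph' is a
hypothetical helper name, and the vectors are copied from this thread,
not produced by the code.

```elisp
;; Sketch: decode glyph vectors of the form
;; [FROM-IDX TO-IDX C CODE WIDTH LBEARING RBEARING ASCENT DESCENT ADJUSTMENT]
;; using the accessors that composite.el defines.
(defun my/describe-lglyph (g)
  "Return a plist describing glyph vector G (hypothetical helper)."
  (list :from (lglyph-from g)          ; first char index covered
        :to (lglyph-to g)              ; last char index covered
        :char (format "U+%04X" (lglyph-char g))
        :glyph-id (lglyph-code g)      ; glyph code in the font
        :width (lglyph-width g)))

(mapcar #'my/describe-lglyph
        '([0 1 128077 569 16 0 17 13 4 nil]     ; Noto Color Emoji
          [0 1 128077 422 17 1 15 12 2 nil]     ; Noto Emoji, first glyph
          [0 1 65039 3 17 0 1 0 1 [0 0 0]]))    ; Noto Emoji, second glyph
```

The character field is what distinguishes the two fonts here: with Noto
Color Emoji there is only a glyph for U+1F44D, while with Noto Emoji a
second glyph for U+FE0F (65039) is present.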





^ permalink raw reply	[flat|nested] 61+ messages in thread

* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-30 12:10                                         ` Eli Zaretskii
@ 2023-05-30 13:30                                           ` Robert Pluim
  2023-05-30 16:32                                             ` Eli Zaretskii
  0 siblings, 1 reply; 61+ messages in thread
From: Robert Pluim @ 2023-05-30 13:30 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 63731, steven

[-- Attachment #1: Type: text/plain, Size: 1216 bytes --]

>>>>> On Tue, 30 May 2023 15:10:45 +0300, Eli Zaretskii <eliz@gnu.org> said:

    Eli> Which means it _is_ composed.  Moreover, with Noto Color Emoji we get
    Eli> a single glyph.  On my system, I have Noto Emoji, from which I get two
    Eli> glyphs:

    Eli>   [0 1 128077 422 17 1 15 12 2 nil]
    Eli>   [0 1 65039 3 17 0 1 0 1 [0 0 0]]

    Eli> (in which case I can understand why the second one is displayed as a
    Eli> hex box if I customize glyphless-char-display-control).

But I also get a hex box if I customize
glyphless-char-display-control, even though 'C-u C-x =' claims thereʼs
only one glyph.

    Eli> So, given that this is the case, why is this wrong, again?  If the
    Eli> font and the shaper produce two glyphs, or one glyph that looks like
    Eli> two, why should we think it's an Emacs problem?

Because Emacs behaves differently depending on whether we have a
composition rule for FE0F that looks backwards or one for 1F44D that
looks forwards. The sequence in both cases is

U+1F44D U+FE0F U+7C U+61
U+1F44D U+7C U+61

(set-char-table-range
 composition-function-table
 #xFE0F
 '(["\\c.\ufe0f" 1 font-shape-gstring]))

produces the following:


[-- Attachment #2: backward-composition.png --]
[-- Type: image/png, Size: 7288 bytes --]

[-- Attachment #3: Type: text/plain, Size: 411 bytes --]


There is a (very) thin space that shouldnʼt be there between the 1f44d
and the '|' on the line that has the FE0F (and since it follows the
value of glyphless-char-display-control, I donʼt think
it comes from the shaping engine).

but

(set-char-table-range
 composition-function-table
 #x1F44D 
'(["\U0001f44d\ufe0f" 0 font-shape-gstring]))

gives me this, where the two '|' align perfectly.


[-- Attachment #4: forward-composition.png --]
[-- Type: image/png, Size: 7224 bytes --]

[-- Attachment #5: Type: text/plain, Size: 140 bytes --]


(as an experiment, I hacked 'produce_glyphless_glyph' to skip
displaying variation selectors, and the problem disappears).
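
The two comparison lines used in the screenshots can be generated with
something like the following sketch.  The function and buffer names are
made up for illustration; only the codepoint sequences come from this
message.

```elisp
;; Sketch: build the two test lines
;;   U+1F44D U+FE0F U+7C U+61
;;   U+1F44D U+7C U+61
;; so the alignment of the two '|' characters can be compared.
(defun my/vs16-test-lines ()
  "Return the two comparison lines as one string (hypothetical helper)."
  (concat (string #x1F44D #xFE0F ?| ?a) "\n"
          (string #x1F44D ?| ?a) "\n"))

;; Put the lines in a scratch buffer; visit it with C-x b *vs16-test*.
(with-current-buffer (get-buffer-create "*vs16-test*")
  (erase-buffer)
  (insert (my/vs16-test-lines)))
```

Evaluating one of the set-char-table-range forms above and then
redisplaying this buffer should be enough to compare the two cases.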

thanks

Robert

^ permalink raw reply	[flat|nested] 61+ messages in thread

* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-30 13:30                                           ` Robert Pluim
@ 2023-05-30 16:32                                             ` Eli Zaretskii
  2023-05-31 16:11                                               ` Robert Pluim
  0 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2023-05-30 16:32 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 63731, steven

> From: Robert Pluim <rpluim@gmail.com>
> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
> Date: Tue, 30 May 2023 15:30:58 +0200
> 
> >>>>> On Tue, 30 May 2023 15:10:45 +0300, Eli Zaretskii <eliz@gnu.org> said:
> 
>     Eli> Which means it _is_ composed.  Moreover, with Noto Color Emoji we get
>     Eli> a single glyph.  On my system, I have Noto Emoji, from which I get two
>     Eli> glyphs:
> 
>     Eli>   [0 1 128077 422 17 1 15 12 2 nil]
>     Eli>   [0 1 65039 3 17 0 1 0 1 [0 0 0]]
> 
>     Eli> (in which case I can understand why the second one is displayed as a
>     Eli> hex box if I customize glyphless-char-display-control).
> 
> But I also get a hex box if I customize
> glyphless-char-display-control, even though 'C-u C-x =' claims thereʼs
> only one glyph.
> 
>     Eli> So, given that this is the case, why is this wrong, again?  If the
>     Eli> font and the shaper produce two glyphs, or one glyph that looks like
>     Eli> two, why should we think it's an Emacs problem?
> 
> Because Emacs behaves differently depending on whether we have a
> composition rule for FE0F that looks backwards or one for 1F44D that
> looks forwards. The sequence in both cases is
> 
> U+1F44D U+FE0F U+7C U+61
> U+1F44D U+7C U+61
> 
> (set-char-table-range
>  composition-function-table
>  #xFE0F
>  '(["\\c.\ufe0f" 1 font-shape-gstring]))
> 
> produces the following:
> 
> There is a (very) thin space that shouldnʼt be there between the 1f44d
> and the '|' on the line that has the FE0F (and since it follows the
> value of glyphless-char-display-control, I donʼt think
> it comes from the shaping engine).

OK, here's the scoop: there's no composition there.  "C-u C-x =" says
there is, but that's a lie: when I look in GDB at the glyphs actually
shown there, there are no composition glyphs, only the glyph for U+1F44D
followed by a glyph for U+FE0F.

> but
> 
> (set-char-table-range
>  composition-function-table
>  #x1F44D 
> '(["\U0001f44d\ufe0f" 0 font-shape-gstring]))
> 
> gives me this, where the two '|' align perfectly.

Here, there _is_ a composition.

So there are two issues here: (a) why there's no composition in the
first case, and (b) why does "C-u C-x =" say there is when there
isn't.





^ permalink raw reply	[flat|nested] 61+ messages in thread

* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-30 16:32                                             ` Eli Zaretskii
@ 2023-05-31 16:11                                               ` Robert Pluim
  2023-05-31 16:18                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 61+ messages in thread
From: Robert Pluim @ 2023-05-31 16:11 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 63731, steven

>>>>> On Tue, 30 May 2023 19:32:23 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> (set-char-table-range
    >> composition-function-table
    >> #x1F44D 
    >> '(["\U0001f44d\ufe0f" 0 font-shape-gstring]))
    >> 
    >> gives me this, where the two '|' align perfectly.

    Eli> Here, there _is_ a composition.

    Eli> So there are two issues here: (a) why there's no composition in the
    Eli> first case, and (b) why does "C-u C-x =" say there is when there
    Eli> isn't.

OK. I can poke around in gdb if you give me some idea of what I should
be looking at.

Robert
-- 





^ permalink raw reply	[flat|nested] 61+ messages in thread

* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-31 16:11                                               ` Robert Pluim
@ 2023-05-31 16:18                                                 ` Eli Zaretskii
  2023-06-01 12:43                                                   ` Eli Zaretskii
  0 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2023-05-31 16:18 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 63731, steven

> From: Robert Pluim <rpluim@gmail.com>
> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
> Date: Wed, 31 May 2023 18:11:36 +0200
> 
> >>>>> On Tue, 30 May 2023 19:32:23 +0300, Eli Zaretskii <eliz@gnu.org> said:
> 
>     >> (set-char-table-range
>     >> composition-function-table
>     >> #x1F44D 
>     >> '(["\U0001f44d\ufe0f" 0 font-shape-gstring]))
>     >> 
>     >> gives me this, where the two '|' align perfectly.
> 
>     Eli> Here, there _is_ a composition.
> 
>     Eli> So there are two issues here: (a) why there's no composition in the
>     Eli> first case, and (b) why does "C-u C-x =" say there is when there
>     Eli> isn't.
> 
> OK. I can poke around in gdb if you give me some idea of what I should
> be looking at.

I don't really know.  I plan to just step through the code in
composite.c tomorrow, unless you beat me to it.  Once we understand
issue (a), I think we will also understand issue (b).





^ permalink raw reply	[flat|nested] 61+ messages in thread

* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-05-31 16:18                                                 ` Eli Zaretskii
@ 2023-06-01 12:43                                                   ` Eli Zaretskii
  2023-06-01 13:30                                                     ` Robert Pluim
  0 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2023-06-01 12:43 UTC (permalink / raw)
  To: rpluim; +Cc: 63731, steven

> Cc: 63731@debbugs.gnu.org, steven@stebalien.com
> Date: Wed, 31 May 2023 19:18:22 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> 
> > From: Robert Pluim <rpluim@gmail.com>
> > Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
> > Date: Wed, 31 May 2023 18:11:36 +0200
> > 
> >     Eli> So there are two issues here: (a) why there's no composition in the
> >     Eli> first case, and (b) why does "C-u C-x =" say there is when there
> >     Eli> isn't.
> > 
> > OK. I can poke around in gdb if you give me some idea of what I should
> > be looking at.
> 
> I don't really know.  I plan to just step through the code in
> composite.c tomorrow, unless you beat me to it.  Once we understand
> issue (a), I think we will also understand issue (b).

OK, the issue is quite clear even without stepping with a debugger.

Bottom line: we cannot support a situation where the same character
can be composed by more than one slot in composition-function-table.
If there is more than a single slot for the same character, one of
them will be tried, and the rest will be ignored (not even tried).
In particular, if a character CH has a "forward" composition rule that
starts with itself, and also has a "backward" rule (one with non-zero
look-back parameter) triggered by a different character (which should
follow CH), the latter rule will never be tried.

This is what happens in this case: the character #x1F44D has several
rules that start with itself in emoji-zwj.el:

  (#x1F44D .
  ,(eval-when-compile (regexp-opt
   '(
   "\N{U+1F44D}\N{U+1F3FB}"
   "\N{U+1F44D}\N{U+1F3FC}"
   "\N{U+1F44D}\N{U+1F3FD}"
   "\N{U+1F44D}\N{U+1F3FE}"
   "\N{U+1F44D}\N{U+1F3FF}"
   ))))

and it also has a "backward" rule:

  (set-char-table-range
   composition-function-table
   #xFE0F '(["\\c.\ufe0f" 1 font-shape-gstring]))

The latter is triggered by #xFE0F and has a 1-character look-back,
which will match #x1F44D, since its category is '.' (it's a "base
character").  This latter rule is never tried.  Why?  Because the
former rules, anchored at #x1F44D, are tried first (Emacs redisplay
examines characters in the order of their buffer positions), and fail
to match.  When those rules fail to match, due to how the
composition-related functions called by the display engine are
factored, we never again consider compositions triggered by a later
character which "cover" also #x1F44D: once that position was examined
and the attempted composition failed, we move to the next character.
IOW, we assume that this first set of composition rules we find for a
given character are the only ones that could possibly be relevant for
that character.

Which means that to have #xFE0F compose correctly with Emoji
codepoints, we should include #xFE0F in the sequences in emoji-zwj.el.

The reason why "C-u C-x =" lies to us saying there's a composition
where really there isn't is because descr-text.el uses the
find-composition primitive, whose implementation is parallel and
separate from that of the display-engine routines, and is structured
differently.  So find-composition does succeed in detecting the second
rule, the one triggered by #xFE0F, which the display engine ignores.
I will think whether this can be fixed, to avoid such false positives,
but if we accept that there can be only one set of composition rules
for a character, then we basically invoked undefined behavior here,
and we got what we deserved.
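
As a rough way to see which characters are affected, one can check
whether a character has its own slot in composition-function-table.
This is only a sketch of the reasoning above, not a description of what
composite.c does; the helper name is hypothetical, and the codepoints
are the ones listed earlier in this thread.

```elisp
;; Sketch: a character with its own slot in the table has "forward"
;; rules anchored at itself; per the explanation above, a look-back
;; rule triggered by a following U+FE0F is then never tried for it.
(defun my/char-has-own-composition-slot-p (ch &optional table)
  "Return t if CH has an entry in TABLE (hypothetical helper).
TABLE defaults to `composition-function-table'."
  (and (char-table-range (or table composition-function-table) ch) t))

;; Which of the codepoints reported earlier have their own slot?
(let (shadowed)
  (dolist (ch '(#x1F408 #x1F415 #x1F426 #x1F446 #x1F447
                #x1F448 #x1F449 #x1F44D #x1F44E)
              (nreverse shadowed))
    (when (my/char-has-own-composition-slot-p ch)
      (push (format "U+%X" ch) shadowed))))
```

The result depends on the Emacs version and on what emoji-zwj.el
contains, so no particular output is claimed here.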

Thanks.





^ permalink raw reply	[flat|nested] 61+ messages in thread

* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-06-01 12:43                                                   ` Eli Zaretskii
@ 2023-06-01 13:30                                                     ` Robert Pluim
  2023-06-01 16:10                                                       ` Eli Zaretskii
  0 siblings, 1 reply; 61+ messages in thread
From: Robert Pluim @ 2023-06-01 13:30 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 63731, steven

>>>>> On Thu, 01 Jun 2023 15:43:26 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> Cc: 63731@debbugs.gnu.org, steven@stebalien.com
    >> Date: Wed, 31 May 2023 19:18:22 +0300
    >> From: Eli Zaretskii <eliz@gnu.org>
    >> 
    >> > From: Robert Pluim <rpluim@gmail.com>
    >> > Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
    >> > Date: Wed, 31 May 2023 18:11:36 +0200
    >> > 
    >> >     Eli> So there are two issues here: (a) why there's no composition in the
    >> >     Eli> first case, and (b) why does "C-u C-x =" say there is when there
    >> >     Eli> isn't.
    >> > 
    >> > OK. I can poke around in gdb if you give me some idea of what I should
    >> > be looking at.
    >> 
    >> I don't really know.  I plan to just step through the code in
    >> composite.c tomorrow, unless you beat me to it.  Once we understand
    >> issue (a), I think we will also understand issue (b).

    Eli> OK, the issue is quite clear even without stepping with a debugger.

    Eli> Bottom line: we cannot support a situation where the same character
    Eli> can be composed by more than one slot in composition-function-table.
    Eli> If there is more than a single slot for the same character, one of
    Eli> them will be tried, and the rest will be ignored (not even tried).
    Eli> In particular, if a character CH has a "forward" composition rule that
    Eli> starts with itself, and also has a "backward" rule (one with non-zero
    Eli> look-back parameter) triggered by a different character (which should
    Eli> follow CH), the latter rule will never be tried.

OK, that makes sense. Where would be a good place to document this?

    Eli> This is what happens in this case: the character #x1F44D has several
    Eli> rules that start with itself in emoji-zwj.el:

    Eli>   (#x1F44D .
    Eli>   ,(eval-when-compile (regexp-opt
    Eli>    '(
    Eli>    "\N{U+1F44D}\N{U+1F3FB}"
    Eli>    "\N{U+1F44D}\N{U+1F3FC}"
    Eli>    "\N{U+1F44D}\N{U+1F3FD}"
    Eli>    "\N{U+1F44D}\N{U+1F3FE}"
    Eli>    "\N{U+1F44D}\N{U+1F3FF}"
    Eli>    ))))

    Eli> and it also has a "backward" rule:

    Eli>   (set-char-table-range
    Eli>    composition-function-table
    Eli>    #xFE0F '(["\\c.\ufe0f" 1 font-shape-gstring]))

    Eli> The latter is triggered by #xFE0F and has a 1-character look-back,
    Eli> which will match #x1F44D, since its category is '.' (it's a "base
    Eli> character").  This latter rule is never tried.  Why?  Because the
    Eli> former rules, anchored at #x1F44D, are tried first (Emacs redisplay
    Eli> examines characters in the order of their buffer positions), and fail
    Eli> to match.  When those rules fail to match, due to how the
    Eli> composition-related functions called by the display engine are
    Eli> factored, we never again consider compositions triggered by a later
    Eli> character which "cover" also #x1F44D: once that position was examined
    Eli> and the attempted composition failed, we move to the next character.
    Eli> IOW, we assume that this first set of composition rules we find for a
    Eli> given character are the only ones that could possibly be relevant for
    Eli> that character.

    Eli> Which means that to have #xFE0F compose correctly with Emoji
    Eli> codepoints, we should include #xFE0F in the sequences in emoji-zwj.el.

Thatʼs easy enough:

diff --git a/admin/unidata/emoji-zwj.awk b/admin/unidata/emoji-zwj.awk
index 7d2ff6cb900..d1195ebbad8 100644
--- a/admin/unidata/emoji-zwj.awk
+++ b/admin/unidata/emoji-zwj.awk
@@ -106,7 +106,8 @@ END {
 
      for (elt in ch)
     {
-        printf("(#x%s .\n,(eval-when-compile (regexp-opt\n'(\n%s\n))))\n", elt, vec[elt])
+        entries = sprintf("%s\n\"\\N{U+%s}\\N{U+FE0F}\"", vec[elt], elt)
+        printf("(#x%s .\n,(eval-when-compile (regexp-opt\n'(\n%s\n))))\n", elt, entries)
     }
      print "))"
      print "  (set-char-table-range composition-function-table"

That makes all the VS-16 sequences in
admin/unidata/emoji-variation-sequences.txt display with the emoji
font for me.
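
For readers who don't speak awk, here is a rough C sketch of the one
extra entry that the added sprintf line appends for each code point.
The function name toy_vs16_entry is hypothetical and exists only for
this illustration; the real generation is done by emoji-zwj.awk.

```c
#include <stdio.h>

/* Hypothetical helper, for illustration only: format the additional
   regexp-opt entry for the code point whose hex digits are in HEX,
   e.g. "1F44D".  Returns what snprintf returns, i.e. the number of
   characters the complete entry needs (excluding the trailing NUL).  */
int
toy_vs16_entry (char *buf, size_t size, const char *hex)
{
  /* Same format string as in the awk patch: a quoted Lisp string
     holding the base character followed by VS-16 (U+FE0F).  */
  return snprintf (buf, size, "\"\\N{U+%s}\\N{U+FE0F}\"", hex);
}
```

Called with "1F44D", this writes "\N{U+1F44D}\N{U+FE0F}" (including
the double quotes), which then sits next to the skin-tone sequences
in the slot anchored at #x1F44D.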

    Eli> The reason why "C-u C-x =" lies to us saying there's a composition
    Eli> where really there isn't is because descr-text.el uses the
    Eli> find-composition primitive, whose implementation is parallel and
    Eli> separate from that of the display-engine routines, and is structured
    Eli> differently.  So find-composition does succeed in detecting the second
    Eli> rule, the one triggered by #xFE0F, which the display engine ignores.
    Eli> I will think whether this can be fixed, to avoid such false positives,
    Eli> but if we accept that there can be only one set of composition rules
    Eli> for a character, then we basically invoked undefined behavior here,
    Eli> and we got what we deserved.

If find-composition DTRT, could we not use it in the display engine?

Robert
-- 






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-06-01 13:30                                                     ` Robert Pluim
@ 2023-06-01 16:10                                                       ` Eli Zaretskii
  2023-06-01 16:34                                                         ` Robert Pluim
  0 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2023-06-01 16:10 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 63731, steven

> From: Robert Pluim <rpluim@gmail.com>
> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
> Date: Thu, 01 Jun 2023 15:30:18 +0200
> 
>     Eli> OK, the issue is quite clear even without stepping with a debugger.
> 
>     Eli> Bottom line: we cannot support a situation where the same character
>     Eli> can be composed by more than one slot in composition-function-table.
>     Eli> If there is more than a single slot for the same character, one of
>     Eli> them will be tried, and the rest will be ignored (not even tried).
>     Eli> In particular, if a character CH has a "forward" composition rule that
>     Eli> starts with itself, and also has a "backward" rule (one with non-zero
>     Eli> look-back parameter) triggered by a different character (which should
>     Eli> follow CH), the latter rule will never be tried.
> 
> OK, that makes sense. Where would be a good place to document this?

In the doc string of composition-function-table, I think.  We already
document there the caveat of arranging rules in descending order of
look-back, which is part of the same "misfeature".

>     Eli> Which means that to have #xFE0F compose correctly with Emoji
>     Eli> codepoints, we should include #xFE0F in the sequences in emoji-zwj.el.
> 
> Thatʼs easy enough:
> 
> diff --git a/admin/unidata/emoji-zwj.awk b/admin/unidata/emoji-zwj.awk
> index 7d2ff6cb900..d1195ebbad8 100644
> --- a/admin/unidata/emoji-zwj.awk
> +++ b/admin/unidata/emoji-zwj.awk
> @@ -106,7 +106,8 @@ END {
>  
>       for (elt in ch)
>      {
> -        printf("(#x%s .\n,(eval-when-compile (regexp-opt\n'(\n%s\n))))\n", elt, vec[elt])
> +        entries = sprintf("%s\n\"\\N{U+%s}\\N{U+FE0F}\"", vec[elt], elt)
> +        printf("(#x%s .\n,(eval-when-compile (regexp-opt\n'(\n%s\n))))\n", elt, entries)
>      }
>       print "))"
>       print "  (set-char-table-range composition-function-table"
> 
> That makes all the VS-16 sequences in
> admin/unidata/emoji-variation-sequences.txt display with the emoji
> font for me.

Ready to install this on the emacs-29 branch?

>     Eli> The reason why "C-u C-x =" lies to us saying there's a composition
>     Eli> where really there isn't is because descr-text.el uses the
>     Eli> find-composition primitive, whose implementation is parallel and
>     Eli> separate from that of the display-engine routines, and is structured
>     Eli> differently.  So find-composition does succeed in detecting the second
>     Eli> rule, the one triggered by #xFE0F, which the display engine ignores.
>     Eli> I will think whether this can be fixed, to avoid such false positives,
>     Eli> but if we accept that there can be only one set of composition rules
>     Eli> for a character, then we basically invoked undefined behavior here,
>     Eli> and we got what we deserved.
> 
> If find-composition DTRT, could we not use it in the display engine?

Not easily, because the display code calls subroutines of
find-composition in a certain order, and that's what causes the
behavior I described.

And even if we could make this happen, I'm not sure we should:
basically, having multiple matching slots would mean users and callers
will never be sure which one "wins".
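
To make the "only one slot is tried" behavior concrete, here is a toy
model in C.  It is NOT the code in composite.c: the function names and
the include_vs16 flag are made up for this sketch, the only character
given a slot of its own is #x1F44D, and the category check of the
look-back rule is simplified away.

```c
#include <stdbool.h>

enum { TOY_VS16 = 0xFE0F };

/* Toy assumption: only U+1F44D has a slot anchored at itself.  */
static bool
toy_has_own_slot (int ch)
{
  return ch == 0x1F44D;
}

/* The "forward" rules of that slot: the skin-tone sequences, plus
   CH+VS-16 when INCLUDE_VS16 is true (the emoji-zwj.awk change).  */
static bool
toy_forward_rule_matches (int ch, int next, bool include_vs16)
{
  if (ch != 0x1F44D)
    return false;
  if (next >= 0x1F3FB && next <= 0x1F3FF)
    return true;
  return include_vs16 && next == TOY_VS16;
}

/* Would the pair CH NEXT be composed in this toy model?  */
bool
toy_pair_is_composed (int ch, int next, bool include_vs16)
{
  if (toy_has_own_slot (ch))
    /* The slot anchored at CH is the only one tried; if it fails,
       the look-back rule triggered by NEXT is never considered.  */
    return toy_forward_rule_matches (ch, next, include_vs16);

  /* CH has no slot of its own, so the rule triggered by VS-16 with a
     1-character look-back can still cover it.  */
  return next == TOY_VS16;
}
```

In this model #x1F44D #xFE0F is composed only when include_vs16 is
true, while a character without its own slot, such as #x203C, is still
picked up by the look-back rule.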






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-06-01 16:10                                                       ` Eli Zaretskii
@ 2023-06-01 16:34                                                         ` Robert Pluim
  2023-06-02  8:15                                                           ` Robert Pluim
  0 siblings, 1 reply; 61+ messages in thread
From: Robert Pluim @ 2023-06-01 16:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 63731, steven

>>>>> On Thu, 01 Jun 2023 19:10:16 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> From: Robert Pluim <rpluim@gmail.com>
    >> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
    >> Date: Thu, 01 Jun 2023 15:30:18 +0200
    >> 
    Eli> OK, the issue is quite clear even without stepping with a debugger.
    >> 
    Eli> Bottom line: we cannot support a situation where the same character
    Eli> can be composed by more than one slot in composition-function-table.
    Eli> If there is more than a single slot for the same character, one of
    Eli> them will be tried, and the rest will be ignored (not even tried).
    Eli> In particular, if a character CH has a "forward" composition rule that
    Eli> starts with itself, and also has a "backward" rule (one with non-zero
    Eli> look-back parameter) triggered by a different character (which should
    Eli> follow CH), the latter rule will never be tried.
    >> 
    >> OK, that makes sense. Where would be a good place to document this?

    Eli> In the doc string of composition-function-table, I think.  We already
    Eli> document there the caveat of arranging rules in descending order of
    Eli> look-back, which is part of the same "misfeature".

OK. Iʼll see if I can come up with something (or Iʼll just steal what
you wrote above :-)).

    >> That makes all the VS-16 sequences in
    >> admin/unidata/emoji-variation-sequences.txt display with the emoji
    >> font for me.

    Eli> Ready to install this on the emacs-29 branch?

Not today. My brain is fuzzy, and it needs more testing (the patch,
not my brain).

    >> If find-composition DTRT, could we not use it in the display engine?

    Eli> Not easily, because the display code calls subroutines of
    Eli> find-composition in a certain order, and that's what causes the
    Eli> behavior I described.

    Eli> And even if we could make this happen, I'm not sure we should:
    Eli> basically, having multiple matching slots would mean users and callers
    Eli> will never be sure which one "wins".

Yes, at least the semantics are clear (now that we know what they
are).

Robert
-- 






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-06-01 16:34                                                         ` Robert Pluim
@ 2023-06-02  8:15                                                           ` Robert Pluim
  2023-06-02 12:06                                                             ` Eli Zaretskii
  0 siblings, 1 reply; 61+ messages in thread
From: Robert Pluim @ 2023-06-02  8:15 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 63731, steven

>>>>> On Thu, 01 Jun 2023 18:34:53 +0200, Robert Pluim <rpluim@gmail.com> said:

    Eli> Ready to install this on the emacs-29 branch?

    Robert> Not today. My brain is fuzzy, and it needs more testing (the patch,
    Robert> not my brain).

So the minimal change to get CHAR+VS-15 and CHAR+VS-16 to compose in
all our emoji test files is below. I noticed that we donʼt compose all
the sequences in emoji-test.txt correctly, but Iʼll fix that on master
by stealing^Wdrawing inspiration from Larsʼ work.

Proper VS-15 support is harder, I need to think about that some more.

diff --git c/admin/unidata/emoji-zwj.awk i/admin/unidata/emoji-zwj.awk
index 7d2ff6cb900..f13f796bcac 100644
--- c/admin/unidata/emoji-zwj.awk
+++ i/admin/unidata/emoji-zwj.awk
@@ -106,7 +106,8 @@ END {
 
      for (elt in ch)
     {
-        printf("(#x%s .\n,(eval-when-compile (regexp-opt\n'(\n%s\n))))\n", elt, vec[elt])
+        entries = sprintf("%s\n\"\\N{U+%s}\\N{U+FE0E}\"\n\"\\N{U+%s}\\N{U+FE0F}\"", vec[elt], elt, elt)
+        printf("(#x%s .\n,(eval-when-compile (regexp-opt\n'(\n%s\n))))\n", elt, entries)
     }
      print "))"
      print "  (set-char-table-range composition-function-table"
diff --git c/lisp/composite.el i/lisp/composite.el
index fb8b76114f4..9710c3c371b 100644
--- c/lisp/composite.el
+++ i/lisp/composite.el
@@ -861,7 +861,7 @@ compose-gstring-for-variation-glyph
 ;; handled in font_range, we end up choosing the Emoji presentation
 ;; rather than the Text presentation.
 (let ((elt '([".." 1 compose-gstring-for-variation-glyph])))
-  (set-char-table-range composition-function-table '(#xFE00 . #xFE0E) elt)
+  (set-char-table-range composition-function-table '(#xFE00 . #xFE0D) elt)
   (set-char-table-range composition-function-table '(#xE0100 . #xE01EF) elt))
 
 (defun auto-compose-chars (func from to font-object string direction)


Robert
-- 






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-06-02  8:15                                                           ` Robert Pluim
@ 2023-06-02 12:06                                                             ` Eli Zaretskii
  2023-06-02 12:25                                                               ` Robert Pluim
  0 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2023-06-02 12:06 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 63731, steven

> From: Robert Pluim <rpluim@gmail.com>
> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
> Date: Fri, 02 Jun 2023 10:15:08 +0200
> 
> >>>>> On Thu, 01 Jun 2023 18:34:53 +0200, Robert Pluim <rpluim@gmail.com> said:
> 
>     Eli> Ready to install this on the emacs-29 branch?
> 
>     Robert> Not today. My brain is fuzzy, and it needs more testing (the patch,
>     Robert> not my brain).
> 
> So the minimal change to get CHAR+VS-15 and CHAR+VS-16 to compose in
> all our emoji test files is below. I noticed that we donʼt compose all
> the sequences in emoji-test.txt correctly, but Iʼll fix that on master
> by stealing^Wdrawing inspiration from Larsʼ work.

Thanks, please install this on the emacs-29 branch.

> Proper VS-15 support is harder, I need to think about that some more.

Can you describe here the current problems with VS-15?






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-06-02 12:06                                                             ` Eli Zaretskii
@ 2023-06-02 12:25                                                               ` Robert Pluim
  2023-06-02 12:58                                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 61+ messages in thread
From: Robert Pluim @ 2023-06-02 12:25 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 63731, steven

tags 63731 fixed
close 63731 29.1
quit

>>>>> On Fri, 02 Jun 2023 15:06:32 +0300, Eli Zaretskii <eliz@gnu.org> said:

    Eli> Thanks, please install this on the emacs-29 branch.

Closing.
Committed as 2f94f6de9d6

    >> Proper VS-15 support is harder, I need to think about that some more.

    Eli> Can you describe here the current problems with VS-15?

CHAR+VS-15 and CHAR+VS-16 correctly choose text and emoji
representation, but CHAR+VS-15 results in the text representation only
if CHAR is not an emoji. If it is an emoji, the font selected for it
will always be the emoji font.

Iʼve tried forcing font_range to use the font for the 'symbol' script
for EMOJI+VS-15, instead, but that resulted in composition
failing. Maybe there are some more dragons lurking in the composition
rules.

Robert
-- 






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-06-02 12:25                                                               ` Robert Pluim
@ 2023-06-02 12:58                                                                 ` Eli Zaretskii
  2023-06-02 13:58                                                                   ` Robert Pluim
  0 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2023-06-02 12:58 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 63731, steven

> From: Robert Pluim <rpluim@gmail.com>
> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
> Date: Fri, 02 Jun 2023 14:25:28 +0200
> 
>     Eli> Thanks, please install this on the emacs-29 branch.
> 
> Closing.
> Committed as 2f94f6de9d6

Thanks.

>     >> Proper VS-15 support is harder, I need to think about that some more.
> 
>     Eli> Can you describe here the current problems with VS-15?
> 
> CHAR+VS-15 and CHAR+VS-16 correctly choose text and emoji
> representation, but CHAR+VS-15 results in the text representation only
> if CHAR is not an emoji. If it is an emoji, the font selected for it
> will always be the emoji font.

And an Emoji font, when presented with a CHAR+VS-15 sequence, doesn't
produce a textual-representation glyph for CHAR?  I'd expect it to.

If Emoji fonts don't produce textual-representation glyphs in this
case, I wonder how this can work at all.  Because if we select some
non-Emoji font, it will probably not know about VS-15, so we will be
left with VS-15.  Are we supposed to handle that ourselves, instead of
relying on the font and the shaping engine?

> Iʼve tried forcing font_range to use the font for the 'symbol' script
> for EMOJI+VS-15, instead, but that resulted in composition
> failing.

That's what I'd expect: non-Emoji fonts don't know about VS-15.

What does HarfBuzz's hb-view do with such sequences, when using Noto
Color Emoji font?






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-06-02 12:58                                                                 ` Eli Zaretskii
@ 2023-06-02 13:58                                                                   ` Robert Pluim
  2023-06-03  5:36                                                                     ` Eli Zaretskii
  0 siblings, 1 reply; 61+ messages in thread
From: Robert Pluim @ 2023-06-02 13:58 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 63731, steven

>>>>> On Fri, 02 Jun 2023 15:58:05 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> CHAR+VS-15 and CHAR+VS-16 correctly choose text and emoji
    >> representation, but CHAR+VS-15 results in the text representation only
    >> if CHAR is not an emoji. If it is an emoji, the font selected for it
    >> will always be the emoji font.

    Eli> And an Emoji font, when presented with a CHAR+VS-15 sequence, doesn't
    Eli> produce a textual-representation glyph for CHAR?  I'd expect it to.

No.

    Eli> If Emoji fonts don't produce textual-representation glyphs in this
    Eli> case, I wonder how this can work at all.  Because if we select some
    Eli> non-Emoji font, it will probably not know about VS-15, so we will be
    Eli> left with VS-15.  Are we supposed to handle that ourselves, instead of
    Eli> relying on the font and the shaping engine?

    >> Iʼve tried forcing font_range to use the font for the 'symbol' script
    >> for EMOJI+VS-15, instead, but that resulted in composition
    >> failing.

Itʼs finding what appears to be the default system font, not whatʼs
specified in the fontset for 'symbol', so thatʼs one reason why
composition fails. Even with 'use-default-font-for-symbols' nil.

    Eli> That's what I'd expect: non-Emoji fonts don't know about VS-15.

Right

    Eli> What does HarfBuzz's hb-view do with such sequences, when using Noto
    Eli> Color Emoji font?

Sequence       Font             Result
23e9 fe0e      system           black box
23e9 fe0e      Symbola          correct text representation
23e9 fe0e      NotoEmoji        correct text representation
23e9 fe0e      NotoColorEmoji   blank

And on emacs-29, Symbola and NotoEmoji compose that sequence
correctly. Now I just need to persuade emacs-30 to use one of them.

Robert
-- 






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-06-02 13:58                                                                   ` Robert Pluim
@ 2023-06-03  5:36                                                                     ` Eli Zaretskii
  2023-06-05 13:08                                                                       ` Robert Pluim
  0 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2023-06-03  5:36 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 63731, steven

> From: Robert Pluim <rpluim@gmail.com>
> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
> Date: Fri, 02 Jun 2023 15:58:37 +0200
> 
>     Eli> What does HarfBuzz's hb-view do with such sequences, when using Noto
>     Eli> Color Emoji font?
> 
> Sequence       Font             Result
> 23e9 fe0e      system           black box
> 23e9 fe0e      Symbola          correct text representation
> 23e9 fe0e      NotoEmoji        correct text representation
> 23e9 fe0e      NotoColorEmoji   blank
> 
> And on emacs-29, Symbola and NotoEmoji compose that sequence
> correctly. Now I just need to persuade emacs-30 to use one of them.

So you are saying that, in our default fontset, we should specify that
#xFE0E should be displayed by Noto Emoji (with Symbola as fallback),
and then make sure that font_range uses the same font for the likes of
#x23E9?  IOW, specify a different font for VS-15 even though its script
is 'emoji'?






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-06-03  5:36                                                                     ` Eli Zaretskii
@ 2023-06-05 13:08                                                                       ` Robert Pluim
  2023-06-05 13:12                                                                         ` Eli Zaretskii
  0 siblings, 1 reply; 61+ messages in thread
From: Robert Pluim @ 2023-06-05 13:08 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 63731, steven

>>>>> On Sat, 03 Jun 2023 08:36:59 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> From: Robert Pluim <rpluim@gmail.com>
    >> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
    >> Date: Fri, 02 Jun 2023 15:58:37 +0200
    >> 
    Eli> What does HarfBuzz's hb-view do with such sequences, when using Noto
    Eli> Color Emoji font?
    >> 
    >> Sequence       Font             Result
    >> 23e9 fe0e      system           black box
    >> 23e9 fe0e      Symbola          correct text representation
    >> 23e9 fe0e      NotoEmoji        correct text representation
    >> 23e9 fe0e      NotoColorEmoji   blank
    >> 
    >> And on emacs-29, Symbola and NotoEmoji compose that sequence
    >> correctly. Now I just need to persuade emacs-30 to use one of them.

    Eli> So you are saying that, in our default fontset, we should specify that
    Eli> #xFE0E should be displayed by Noto Emoji (with Symbola as fallback),
    Eli> and then make sure that font_range uses the same font for the likes of
    Eli> #x23E9?  IOW, specify a different font for VS-15 even though its script
    Eli> is 'emoji'?

Yes, that works (and we can remove VS-15 and VS-16 from the emoji
script, so that theyʼll then be displayed via
`glyphless-char-display-control' when theyʼre on their own).

Thanks for the suggestion Eli, I was looking at it from the wrong
direction.

Robert
-- 






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-06-05 13:08                                                                       ` Robert Pluim
@ 2023-06-05 13:12                                                                         ` Eli Zaretskii
  2023-06-05 13:31                                                                           ` Eli Zaretskii
  2023-06-05 13:36                                                                           ` Robert Pluim
  0 siblings, 2 replies; 61+ messages in thread
From: Eli Zaretskii @ 2023-06-05 13:12 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 63731, steven

> From: Robert Pluim <rpluim@gmail.com>
> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
> Date: Mon, 05 Jun 2023 15:08:08 +0200
> 
> >>>>> On Sat, 03 Jun 2023 08:36:59 +0300, Eli Zaretskii <eliz@gnu.org> said:
> 
>     >> Sequence       Font             Result
>     >> 23e9 fe0e      system           black box
>     >> 23e9 fe0e      Symbola          correct text representation
>     >> 23e9 fe0e      NotoEmoji        correct text representation
>     >> 23e9 fe0e      NotoColorEmoji   blank
>     >> 
>     >> And on emacs-29, Symbola and NotoEmoji compose that sequence
>     >> correctly. Now I just need to persuade emacs-30 to use one of them.
> 
>     Eli> So you are saying that, in our default fontset, we should specify that
>     Eli> #xFE0E should be displayed by Noto Emoji (with Symbola as fallback),
>     Eli> and then make sure that font_range uses the same font for the likes of
>     Eli> #x23E9?  IOW, specify a different font for VS-15 even though its script
>     Eli> is 'emoji'?
> 
> Yes, that works (and we can remove VS-15 and VS-16 from the emoji
> script, so that theyʼll then be displayed via
> `glyphless-char-display-control' when theyʼre on their own).

What about the rest of VS-nn? do they need to stay in 'emoji' script,
and if so, why?

> Thanks for the suggestion Eli, I was looking at it from the wrong
> direction.

You are the one who did most of the footwork, so kudos to you.

This is simple enough to install on emacs-29, I think?






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-06-05 13:12                                                                         ` Eli Zaretskii
@ 2023-06-05 13:31                                                                           ` Eli Zaretskii
  2023-06-05 14:06                                                                             ` Robert Pluim
  2023-06-05 13:36                                                                           ` Robert Pluim
  1 sibling, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2023-06-05 13:31 UTC (permalink / raw)
  To: rpluim; +Cc: 63731, steven

> Cc: 63731@debbugs.gnu.org, steven@stebalien.com
> Date: Mon, 05 Jun 2023 16:12:20 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> 
> >     Eli> So you are saying that, in our default fontset, we should specify that
> >     Eli> #xFE0E should be displayed by Noto Emoji (with Symbola as fallback),
> >     Eli> and then make sure that font_range uses the same font for the likes of
> >     Eli> #x23E9?  IOW, specify a different font for VS-15 even though its script
> >     Eli> is 'emoji'?
> > 
> > Yes, that works (and we can remove VS-15 and VS-16 from the emoji
> > script, so that theyʼll then be displayed via
> > `glyphless-char-display-control' when theyʼre on their own).
> 
> What about the rest of VS-nn? do they need to stay in 'emoji' script,
> and if so, why?

And one more question: if we remove VS-16 from the emoji script, what
will happen to the sequences like U+23E9 U+FE0F?  Isn't it true that
we use a color Emoji font for those because VS-16 is in emoji script?






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-06-05 13:12                                                                         ` Eli Zaretskii
  2023-06-05 13:31                                                                           ` Eli Zaretskii
@ 2023-06-05 13:36                                                                           ` Robert Pluim
  2023-06-05 13:47                                                                             ` Eli Zaretskii
  1 sibling, 1 reply; 61+ messages in thread
From: Robert Pluim @ 2023-06-05 13:36 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 63731, steven

>>>>> On Mon, 05 Jun 2023 16:12:20 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> From: Robert Pluim <rpluim@gmail.com>
    >> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
    >> Date: Mon, 05 Jun 2023 15:08:08 +0200
    >> 
    >> >>>>> On Sat, 03 Jun 2023 08:36:59 +0300, Eli Zaretskii <eliz@gnu.org> said:
    >> 
    >> >> Sequence       Font             Result
    >> >> 23e9 fe0e      system           black box
    >> >> 23e9 fe0e      Symbola          correct text representation
    >> >> 23e9 fe0e      NotoEmoji        correct text representation
    >> >> 23e9 fe0e      NotoColorEmoji   blank
    >> >> 
    >> >> And on emacs-29, Symbola and NotoEmoji compose that sequence
    >> >> correctly. Now I just need to persuade emacs-30 to use one of them.
    >> 
    Eli> So you are saying that, in our default fontset, we should specify that
    Eli> #xFE0E should be displayed by Noto Emoji (with Symbola as fallback),
    Eli> and then make sure that font_range uses the same font for the likes of
    Eli> #x23E9?  IOW, specify a different font for VS-15 even though its script
    Eli> is 'emoji'?
    >> 
    >> Yes, that works (and we can remove VS-15 and VS-16 from the emoji
    >> script, so that theyʼll then be displayed via
    >> `glyphless-char-display-control' when theyʼre on their own).

    Eli> What about the rest of VS-nn? do they need to stay in 'emoji' script,
    Eli> and if so, why?

They were never in the 'emoji' script anyway.

    >> Thanks for the suggestion Eli, I was looking at it from the wrong
    >> direction.

    Eli> You are the one who did most of the footwork, so kudos to you.

    Eli> This is simple enough to install on emacs-29, I think?

The main change is in font.c, and looks like this. I think itʼs too
big for emacs-29 (breaking composition is very easy, itʼs entirely
possible Iʼve missed a few cases :-) )

diff --git a/src/font.c b/src/font.c
index e586277a5d3..30b088c818e 100644
--- a/src/font.c
+++ b/src/font.c
@@ -3633,10 +3633,14 @@ font_at (int c, ptrdiff_t pos, struct face *face, struct window *w,
 /* Check if CH is a codepoint for which we should attempt to use the
    emoji font, even if the codepoint itself has Emoji_Presentation =
    No.  Vauto_composition_emoji_eligible_codepoints is filled in for
-   us by admin/unidata/emoji-zwj.awk.  */
+   us by admin/unidata/emoji-zwj.awk.  We also check if there's a
+   VS-15 or VS-16 following CH, and select text/emoji presentation
+   respectively if so.  */
 static bool
-codepoint_is_emoji_eligible (int ch)
+codepoint_is_font_change_eligible (int ch, int next_c)
 {
+  if (next_c == 0xFE0E || next_c == 0xFE0F)
+    return true;
   if (EQ (CHAR_TABLE_REF (Vchar_script_table, ch), Qemoji))
     return true;
 
@@ -3690,21 +3694,43 @@ font_range (ptrdiff_t pos, ptrdiff_t pos_byte, ptrdiff_t *limit,
 	}
       face = FACE_FROM_ID (f, face_id);
     }
-
-  /* If the composition was triggered by an emoji, use a character
-     from 'script-representative-chars', rather than the first
-     character in the string, to determine the font to use.  */
-  if (codepoint_is_emoji_eligible (ch))
+  int next_c = 0;
+  {
+    ptrdiff_t p = pos;
+    ptrdiff_t p_b = pos_byte;
+    int c;
+    c = (NILP (string)
+	 ? fetch_char_advance_no_check (&p, &p_b)
+	 : fetch_string_char_advance_no_check (string, &p, &p_b));
+    if (p < *limit)
+      {
+	c = (NILP (string)
+	     ? fetch_char_advance_no_check (&p, &p_b)
+	     : fetch_string_char_advance_no_check (string, &p, &p_b));
+	next_c = c;
+      }
+  }
+  if (codepoint_is_font_change_eligible (ch, next_c))
     {
-      Lisp_Object val = assq_no_quit (Qemoji, Vscript_representative_chars);
-      if (CONSP (val))
+      if (next_c == 0xFE0E)
 	{
-	  val = XCDR (val);
+	  font_object = font_for_char (face, 0xFE0E, pos, string);
+	}
+      else
+	{
+	  /* If the composition was triggered by an emoji, use a character
+	     from 'script-representative-chars', rather than the first
+	     character in the string, to determine the font to use.  */
+	  Lisp_Object val = assq_no_quit (Qemoji, Vscript_representative_chars);
 	  if (CONSP (val))
-	    val = XCAR (val);
-	  else if (VECTORP (val))
-	    val = AREF (val, 0);
-	  font_object = font_for_char (face, XFIXNAT (val), pos, string);
+	    {
+	      val = XCDR (val);
+	      if (CONSP (val))
+		val = XCAR (val);
+	      else if (VECTORP (val))
+		val = AREF (val, 0);
+	      font_object = font_for_char (face, XFIXNAT (val), pos, string);
+	    }
 	}
     }
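
Stripped of the Emacs internals, the decision the patch above adds to
font_range can be sketched as below.  This is only an outline under
simplifying assumptions: the enum and the function toy_choose_font are
hypothetical names, and the real code picks actual font objects via
font_for_char rather than returning a tag.

```c
#include <stdbool.h>

/* Hypothetical tags standing in for the font that would be chosen.  */
enum toy_font_choice
{
  TOY_FONT_UNCHANGED,	/* keep whatever font was already selected */
  TOY_FONT_FOR_VS15,	/* font the fontset specifies for U+FE0E */
  TOY_FONT_EMOJI	/* font of the 'emoji' representative char */
};

/* NEXT_C is the character following CH, or 0 if there is none.
   CH_IS_EMOJI_ELIGIBLE stands for the pre-existing checks on CH
   (emoji script, or listed as emoji-eligible).  */
enum toy_font_choice
toy_choose_font (int next_c, bool ch_is_emoji_eligible)
{
  /* A following VS-15 asks for text presentation, whatever CH is.  */
  if (next_c == 0xFE0E)
    return TOY_FONT_FOR_VS15;

  /* A following VS-16, or an emoji-eligible CH, selects the emoji
     font as before.  */
  if (next_c == 0xFE0F || ch_is_emoji_eligible)
    return TOY_FONT_EMOJI;

  return TOY_FONT_UNCHANGED;
}
```

The point of the ordering is that VS-15 is checked first, so an emoji
code point such as #x23E9 followed by #xFE0E no longer falls through
to the emoji font.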
 

Robert
-- 






* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-06-05 13:36                                                                           ` Robert Pluim
@ 2023-06-05 13:47                                                                             ` Eli Zaretskii
  2023-06-05 14:27                                                                               ` Robert Pluim
  0 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2023-06-05 13:47 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 63731, steven

> From: Robert Pluim <rpluim@gmail.com>
> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
> Date: Mon, 05 Jun 2023 15:36:52 +0200
> 
> >>>>> On Mon, 05 Jun 2023 16:12:20 +0300, Eli Zaretskii <eliz@gnu.org> said:
> 
>     Eli> This is simple enough to install on emacs-29, I think?
> 
> The main change is in font.c, and looks like this. I think itʼs too
> big for emacs-29 (breaking composition is very easy, itʼs entirely
> possible Iʼve missed a few cases :-) )

Hmm... I thought just changing the fontset in fontset.el would be
enough.

OK, so I guess master it is, then.

* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-06-05 13:31                                                                           ` Eli Zaretskii
@ 2023-06-05 14:06                                                                             ` Robert Pluim
  0 siblings, 0 replies; 61+ messages in thread
From: Robert Pluim @ 2023-06-05 14:06 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 63731, steven

>>>>> On Mon, 05 Jun 2023 16:31:58 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> Cc: 63731@debbugs.gnu.org, steven@stebalien.com
    >> Date: Mon, 05 Jun 2023 16:12:20 +0300
    >> From: Eli Zaretskii <eliz@gnu.org>
    >> 
    >> >     Eli> So you are saying that, in our default fontset, we should specify that
    >> >     Eli> #xFE0E should be displayed by Noto Emoji (with Symbola as fallback),
    >> >     Eli> and then make sure that font_range uses the same font for the likes of
    >> >     Eli> #x23E9?  IOW, specify a different font for VS-15 even though its script
    >> >     Eli> is 'emoji'?
    >> > 
    >> > Yes, that works (and we can remove VS-15 and VS-16 from the emoji
    >> > script, so that theyʼll then be displayed via
    >> > `glyphless-char-display-control' when theyʼre on their own).
    >> 
    >> What about the rest of VS-nn? do they need to stay in 'emoji' script,
    >> and if so, why?

    Eli> And one more question: if we remove VS-16 from the emoji script, what
    Eli> will happen to the sequences like U+23E9 U+FE0F?  Isn't it true that
    Eli> we use a color Emoji font for those because VS-16 is in emoji script?

Not anymore. Now we have a forward composition rule for U+23E9
U+FE0F that triggers because U+23E9 is in the emoji script, which is
why U+23E9 U+FE0E also uses the emoji font (currently).

For non-emoji codepoints like U+203C, adding U+FE0F uses the emoji
font because U+FE0F is in the emoji script (and thereʼs no composition
rule for U+203C, so the backwards looking one for U+FE0F is used).
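
To illustrate the distinction described above, here is a minimal C
sketch. It is not Emacs code: has_forward_rule and composition_trigger
are hypothetical names, and the rule table is reduced to the two
codepoints mentioned in this message. It only models which character
ends up triggering the composition.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for "this codepoint has a forward composition
   rule".  Only U+23E9 from the message is modelled here.  */
static bool
has_forward_rule (uint32_t c)
{
  return c == 0x23E9;
}

/* Return the character that triggers composition of BASE followed by
   NEXT, or 0 if nothing triggers.  A forward rule on BASE wins; the
   backward-looking rule on U+FE0F is used only when BASE has none.  */
uint32_t
composition_trigger (uint32_t base, uint32_t next)
{
  if (has_forward_rule (base))
    return base;		/* forward rule: BASE is the trigger */
  if (next == 0xFE0F)
    return next;		/* backward rule: VS-16 is the trigger */
  return 0;
}
```

Under these assumptions, U+23E9 followed by either U+FE0F or U+FE0E is
triggered by U+23E9 itself, while U+203C U+FE0F is triggered by U+FE0F.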

Robert
-- 

* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-06-05 13:47                                                                             ` Eli Zaretskii
@ 2023-06-05 14:27                                                                               ` Robert Pluim
  2023-06-05 15:35                                                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 61+ messages in thread
From: Robert Pluim @ 2023-06-05 14:27 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 63731, steven

>>>>> On Mon, 05 Jun 2023 16:47:22 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> From: Robert Pluim <rpluim@gmail.com>
    >> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
    >> Date: Mon, 05 Jun 2023 15:36:52 +0200
    >> 
    >> >>>>> On Mon, 05 Jun 2023 16:12:20 +0300, Eli Zaretskii <eliz@gnu.org> said:
    >> 
    Eli> This is simple enough to install on emacs-29, I think?
    >> 
    >> The main change is in font.c, and looks like this. I think itʼs too
    >> big for emacs-29 (breaking composition is very easy, itʼs entirely
    >> possible Iʼve missed a few cases :-) )

    Eli> Hmm... I thought just changing the fontset in fontset.el would be
    Eli> enough.

Itʼs almost enough to do that, and to check if the triggering
character is U+FE0E, but then we fall foul of the composition rule
forward/backward issue again.

If we could have forward and backwards looking rules working together,
then font_range would get passed U+FE0F or U+FE0E as the triggering
character, it could choose the font, and there would be no need to
peek at the next character.
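
The "peek at the next character" approach can be sketched in isolation
as follows. This is a simplified model of the branch in the font.c
diff, not the real function: font_probe_char is a hypothetical name,
and EMOJI_REPRESENTATIVE is a placeholder for whatever
'script-representative-chars' holds for the emoji script.

```c
#include <stdint.h>

/* Placeholder value; the real one comes from
   'script-representative-chars' and is not given in this thread.  */
#define EMOJI_REPRESENTATIVE 0x1F600

/* Return the codepoint whose font should be used for an
   emoji-triggered composition, given the character NEXT_C that
   follows the triggering one.  */
uint32_t
font_probe_char (uint32_t next_c)
{
  if (next_c == 0xFE0E)
    return 0xFE0E;		/* text presentation: font for VS-15 */
  return EMOJI_REPRESENTATIVE;	/* otherwise: the emoji font */
}
```

With rules that pass U+FE0E or U+FE0F as the triggering character,
the NEXT_C argument would be unnecessary.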

Robert
-- 

* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-06-05 14:27                                                                               ` Robert Pluim
@ 2023-06-05 15:35                                                                                 ` Eli Zaretskii
  2023-06-05 15:57                                                                                   ` Robert Pluim
  0 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2023-06-05 15:35 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 63731, steven

> From: Robert Pluim <rpluim@gmail.com>
> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
> Date: Mon, 05 Jun 2023 16:27:28 +0200
> 
>     Eli> Hmm... I thought just changing the fontset in fontset.el would be
>     Eli> enough.
> 
> Itʼs almost enough to do that, and to check if the triggering
> character is U+FE0E, but then we fall foul of the composition rule
> forward/backward issue again.

Which forward rules would conflict with a backward rule triggered by
U+FE0E?

* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-06-05 15:35                                                                                 ` Eli Zaretskii
@ 2023-06-05 15:57                                                                                   ` Robert Pluim
  2023-06-05 16:20                                                                                     ` Robert Pluim
  2023-06-05 16:39                                                                                     ` Eli Zaretskii
  0 siblings, 2 replies; 61+ messages in thread
From: Robert Pluim @ 2023-06-05 15:57 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 63731, steven

>>>>> On Mon, 05 Jun 2023 18:35:37 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> From: Robert Pluim <rpluim@gmail.com>
    >> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
    >> Date: Mon, 05 Jun 2023 16:27:28 +0200
    >> 
    Eli> Hmm... I thought just changing the fontset in fontset.el would be
    Eli> enough.
    >> 
    >> Itʼs almost enough to do that, and to check if the triggering
    >> character is U+FE0E, but then we fall foul of the composition rule
    >> forward/backward issue again.

    Eli> Which forward rules would conflict with a backward rule triggered by
    Eli> U+FE0E?

All the ones for the non-emoji codepoints that still need to be
composed as emoji sometimes, eg U+261D:

"\N{U+261D}"
"\N{U+261D}\N{U+1F3FB}"
"\N{U+261D}\N{U+1F3FC}"
"\N{U+261D}\N{U+1F3FD}"
"\N{U+261D}\N{U+1F3FE}"
"\N{U+261D}\N{U+1F3FF}"

to which we add:

"\N{U+261D}\N{U+FE0E}"
"\N{U+261D}\N{U+FE0F}"

(and not adding those doesnʼt help).
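
The list above follows a regular pattern, which the C sketch below
reproduces for an arbitrary base codepoint. The struct and function
names are hypothetical, and the real rules are generated elsewhere
from the Unicode data files, so this only shows the shape of the set.

```c
#include <stddef.h>
#include <stdint.h>

/* A sequence of one or two codepoints.  */
struct sequence
{
  uint32_t cp[2];
  size_t len;
};

/* Fill OUT (capacity of at least 8) with BASE alone, BASE followed by
   each skin-tone modifier U+1F3FB..U+1F3FF, and BASE followed by
   VS-15 and VS-16.  Return the number of sequences written.  */
size_t
sequences_for_base (uint32_t base, struct sequence *out)
{
  size_t n = 0;

  out[n++] = (struct sequence) { { base, 0 }, 1 };
  for (uint32_t m = 0x1F3FB; m <= 0x1F3FF; m++)
    out[n++] = (struct sequence) { { base, m }, 2 };
  out[n++] = (struct sequence) { { base, 0xFE0E }, 2 };
  out[n++] = (struct sequence) { { base, 0xFE0F }, 2 };
  return n;
}
```

Calling sequences_for_base (0x261D, buf) yields the eight sequences
listed in this message, all starting with U+261D, which is why they
all live in the forward rule for that codepoint.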

Robert
-- 

* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-06-05 15:57                                                                                   ` Robert Pluim
@ 2023-06-05 16:20                                                                                     ` Robert Pluim
  2023-06-05 16:41                                                                                       ` Eli Zaretskii
  2023-06-05 16:39                                                                                     ` Eli Zaretskii
  1 sibling, 1 reply; 61+ messages in thread
From: Robert Pluim @ 2023-06-05 16:20 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 63731, steven

>>>>> On Mon, 05 Jun 2023 17:57:04 +0200, Robert Pluim <rpluim@gmail.com> said:

>>>>> On Mon, 05 Jun 2023 18:35:37 +0300, Eli Zaretskii <eliz@gnu.org> said:
    >>> From: Robert Pluim <rpluim@gmail.com>
    >>> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
    >>> Date: Mon, 05 Jun 2023 16:27:28 +0200
    >>> 
    Eli> Hmm... I thought just changing the fontset in fontset.el would be
    Eli> enough.
    >>> 
    >>> Itʼs almost enough to do that, and to check if the triggering
    >>> character is U+FE0E, but then we fall foul of the composition rule
    >>> forward/backward issue again.

    Eli> Which forward rules would conflict with a backward rule triggered by
    Eli> U+FE0E?

    Robert> All the ones for the non-emoji codepoints that still need to be
    Robert> composed as emoji sometimes, eg U+261D:

Oh, and all the <foo>+skin tone ones. And probably more.

Robert
-- 

* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-06-05 15:57                                                                                   ` Robert Pluim
  2023-06-05 16:20                                                                                     ` Robert Pluim
@ 2023-06-05 16:39                                                                                     ` Eli Zaretskii
  2023-06-06  7:28                                                                                       ` Robert Pluim
  1 sibling, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2023-06-05 16:39 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 63731, steven

> From: Robert Pluim <rpluim@gmail.com>
> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
> Date: Mon, 05 Jun 2023 17:57:04 +0200
> 
> >>>>> On Mon, 05 Jun 2023 18:35:37 +0300, Eli Zaretskii <eliz@gnu.org> said:
> 
>     Eli> Which forward rules would conflict with a backward rule triggered by
>     Eli> U+FE0E?
> 
> All the ones for the non-emoji codepoints that still need to be
> composed as emoji sometimes, eg U+261D:
> 
> "\N{U+261D}"
> "\N{U+261D}\N{U+1F3FB}"
> "\N{U+261D}\N{U+1F3FC}"
> "\N{U+261D}\N{U+1F3FD}"
> "\N{U+261D}\N{U+1F3FE}"
> "\N{U+261D}\N{U+1F3FF}"

Couldn't we put these in the slots of #x1F3FB..#x1F3FF instead, as
backward rules?  As long as we don't have a forward rule starting with
#x261D, we could have backward rules for it triggered by #x1F3Fx and
#xFE0x, right?

* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-06-05 16:20                                                                                     ` Robert Pluim
@ 2023-06-05 16:41                                                                                       ` Eli Zaretskii
  2023-06-06  7:24                                                                                         ` Robert Pluim
  0 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2023-06-05 16:41 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 63731, steven

> From: Robert Pluim <rpluim@gmail.com>
> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
> Date: Mon, 05 Jun 2023 18:20:08 +0200
> 
>     Eli> Which forward rules would conflict with a backward rule triggered by
>     Eli> U+FE0E?
> 
>     Robert> All the ones for the non-emoji codepoints that still need to be
>     Robert> composed as emoji sometimes, eg U+261D:
> 
> Oh, and all the <foo>+skin tone ones. And probably more.

What do you mean by <foo>+skin?  Can you give a few examples?

* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-06-05 16:41                                                                                       ` Eli Zaretskii
@ 2023-06-06  7:24                                                                                         ` Robert Pluim
  0 siblings, 0 replies; 61+ messages in thread
From: Robert Pluim @ 2023-06-06  7:24 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 63731, steven

>>>>> On Mon, 05 Jun 2023 19:41:55 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> From: Robert Pluim <rpluim@gmail.com>
    >> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
    >> Date: Mon, 05 Jun 2023 18:20:08 +0200
    >> 
    Eli> Which forward rules would conflict with a backward rule triggered by
    Eli> U+FE0E?
    >> 
    Robert> All the ones for the non-emoji codepoints that still need to be
    Robert> composed as emoji sometimes, eg U+261D:
    >> 
    >> Oh, and all the <foo>+skin tone ones. And probably more.

    Eli> What do you mean by <foo>+skin?  Can you give a few examples?

Anything using 1F3FB..1F3FF, such as 1F44B 1F3FB or 1F3C4 1F3FB

Robert
-- 

* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-06-05 16:39                                                                                     ` Eli Zaretskii
@ 2023-06-06  7:28                                                                                       ` Robert Pluim
  2023-06-06 11:53                                                                                         ` Eli Zaretskii
  0 siblings, 1 reply; 61+ messages in thread
From: Robert Pluim @ 2023-06-06  7:28 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 63731, steven

>>>>> On Mon, 05 Jun 2023 19:39:37 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> From: Robert Pluim <rpluim@gmail.com>
    >> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
    >> Date: Mon, 05 Jun 2023 17:57:04 +0200
    >> 
    >> >>>>> On Mon, 05 Jun 2023 18:35:37 +0300, Eli Zaretskii <eliz@gnu.org> said:
    >> 
    Eli> Which forward rules would conflict with a backward rule triggered by
    Eli> U+FE0E?
    >> 
    >> All the ones for the non-emoji codepoints that still need to be
    >> composed as emoji sometimes, eg U+261D:
    >> 
    >> "\N{U+261D}"
    >> "\N{U+261D}\N{U+1F3FB}"
    >> "\N{U+261D}\N{U+1F3FC}"
    >> "\N{U+261D}\N{U+1F3FD}"
    >> "\N{U+261D}\N{U+1F3FE}"
    >> "\N{U+261D}\N{U+1F3FF}"

    Eli> Couldn't we put these in the slots of #x1F3FB..#x1F3FF instead, as
    Eli> backward rules?  As long as we don't have a forward rule starting with
    Eli> #x261D, we could have backward rules for it triggered by #x1F3Fx and
    Eli> #xFE0x, right?

Yes, we could invert the whole composition rules setup, and make them
all work backwards, but then it will almost certainly all break again
with the next release of Unicode. Adding a special case for FE0E in
font_range is going to be more robust.
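
For comparison, the inverted setup would key every rule on the
trailing codepoint. A minimal C sketch of the trigger classification
it would need, using only the ranges named in this thread (the
function names are hypothetical, not Emacs code):

```c
#include <stdbool.h>
#include <stdint.h>

/* Skin-tone modifiers, U+1F3FB..U+1F3FF.  */
bool
is_skin_tone_modifier (uint32_t c)
{
  return c >= 0x1F3FB && c <= 0x1F3FF;
}

/* VS-15 (text presentation) and VS-16 (emoji presentation).  */
bool
is_emoji_variation_selector (uint32_t c)
{
  return c == 0xFE0E || c == 0xFE0F;
}

/* True if C would trigger a backward-looking rule in the inverted
   setup, i.e. it follows a base such as U+261D.  */
bool
is_backward_trigger (uint32_t c)
{
  return is_skin_tone_modifier (c) || is_emoji_variation_selector (c);
}
```

Any future Unicode sequence whose trailing codepoint falls outside
these ranges would need a new trigger, which is the breakage risk
raised above.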

Robert
-- 

* bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
  2023-06-06  7:28                                                                                       ` Robert Pluim
@ 2023-06-06 11:53                                                                                         ` Eli Zaretskii
  0 siblings, 0 replies; 61+ messages in thread
From: Eli Zaretskii @ 2023-06-06 11:53 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 63731, steven

> From: Robert Pluim <rpluim@gmail.com>
> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
> Date: Tue, 06 Jun 2023 09:28:04 +0200
> 
> >>>>> On Mon, 05 Jun 2023 19:39:37 +0300, Eli Zaretskii <eliz@gnu.org> said:
> 
>     >> From: Robert Pluim <rpluim@gmail.com>
>     >> Cc: 63731@debbugs.gnu.org,  steven@stebalien.com
>     >> Date: Mon, 05 Jun 2023 17:57:04 +0200
>     >> 
>     >> >>>>> On Mon, 05 Jun 2023 18:35:37 +0300, Eli Zaretskii <eliz@gnu.org> said:
>     >> 
>     Eli> Which forward rules would conflict with a backward rule triggered by
>     Eli> U+FE0E?
>     >> 
>     >> All the ones for the non-emoji codepoints that still need to be
>     >> composed as emoji sometimes, eg U+261D:
>     >> 
>     >> "\N{U+261D}"
>     >> "\N{U+261D}\N{U+1F3FB}"
>     >> "\N{U+261D}\N{U+1F3FC}"
>     >> "\N{U+261D}\N{U+1F3FD}"
>     >> "\N{U+261D}\N{U+1F3FE}"
>     >> "\N{U+261D}\N{U+1F3FF}"
> 
>     Eli> Couldn't we put these in the slots of #x1F3FB..#x1F3FF instead, as
>     Eli> backward rules?  As long as we don't have a forward rule starting with
>     Eli> #x261D, we could have backward rules for it triggered by #x1F3Fx and
>     Eli> #xFE0x, right?
> 
> Yes, we could invert the whole composition rules setup, and make them
> all work backwards, but then it will almost certainly all break again
> with the next release of Unicode. Adding a special case for FE0E in
> font_range is going to be more robust.

I don't think it could break, since such sequences are all likely to
be triggered by special codepoints that follow the U+2xxx characters.
Our win would be a much simpler setup.

But okay, let's try to do it this way.

end of thread, other threads:[~2023-06-06 11:53 UTC | newest]

Thread overview: 61+ messages
2023-05-26  3:18 bug#63731: [PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate Steven Allen
2023-05-26  6:41 ` Eli Zaretskii
2023-05-26  8:34   ` Robert Pluim
2023-05-26  8:46     ` Eli Zaretskii
2023-05-26 11:14       ` Robert Pluim
2023-05-26 12:06         ` Eli Zaretskii
2023-05-26 14:02           ` Robert Pluim
2023-05-26 14:55             ` Eli Zaretskii
2023-05-26 15:25               ` Robert Pluim
2023-05-26 15:52                 ` Eli Zaretskii
2023-05-26 16:24                   ` Robert Pluim
2023-05-26 17:27                     ` Eli Zaretskii
2023-05-26 17:35                       ` Robert Pluim
2023-05-26 18:05                         ` Eli Zaretskii
2023-05-28 11:43                           ` Robert Pluim
2023-05-28 12:44                             ` Eli Zaretskii
2023-05-26 17:43                       ` Eli Zaretskii
2023-05-28 10:29                         ` Robert Pluim
2023-05-28 12:37                           ` Eli Zaretskii
2023-05-28 11:57                       ` Robert Pluim
2023-05-28 12:47                         ` Eli Zaretskii
2023-05-29 10:44                           ` Robert Pluim
2023-05-29 13:58                             ` Eli Zaretskii
2023-05-29 14:43                               ` Robert Pluim
2023-05-29 14:55                                 ` Eli Zaretskii
2023-05-29 16:13                                   ` Robert Pluim
2023-05-29 17:18                                     ` Eli Zaretskii
2023-05-30  7:25                                       ` Robert Pluim
2023-05-30 12:10                                         ` Eli Zaretskii
2023-05-30 13:30                                           ` Robert Pluim
2023-05-30 16:32                                             ` Eli Zaretskii
2023-05-31 16:11                                               ` Robert Pluim
2023-05-31 16:18                                                 ` Eli Zaretskii
2023-06-01 12:43                                                   ` Eli Zaretskii
2023-06-01 13:30                                                     ` Robert Pluim
2023-06-01 16:10                                                       ` Eli Zaretskii
2023-06-01 16:34                                                         ` Robert Pluim
2023-06-02  8:15                                                           ` Robert Pluim
2023-06-02 12:06                                                             ` Eli Zaretskii
2023-06-02 12:25                                                               ` Robert Pluim
2023-06-02 12:58                                                                 ` Eli Zaretskii
2023-06-02 13:58                                                                   ` Robert Pluim
2023-06-03  5:36                                                                     ` Eli Zaretskii
2023-06-05 13:08                                                                       ` Robert Pluim
2023-06-05 13:12                                                                         ` Eli Zaretskii
2023-06-05 13:31                                                                           ` Eli Zaretskii
2023-06-05 14:06                                                                             ` Robert Pluim
2023-06-05 13:36                                                                           ` Robert Pluim
2023-06-05 13:47                                                                             ` Eli Zaretskii
2023-06-05 14:27                                                                               ` Robert Pluim
2023-06-05 15:35                                                                                 ` Eli Zaretskii
2023-06-05 15:57                                                                                   ` Robert Pluim
2023-06-05 16:20                                                                                     ` Robert Pluim
2023-06-05 16:41                                                                                       ` Eli Zaretskii
2023-06-06  7:24                                                                                         ` Robert Pluim
2023-06-05 16:39                                                                                     ` Eli Zaretskii
2023-06-06  7:28                                                                                       ` Robert Pluim
2023-06-06 11:53                                                                                         ` Eli Zaretskii
2023-05-26 15:06   ` Steven Allen
2023-05-26 15:29     ` Robert Pluim
2023-05-26 16:03       ` Steven Allen

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git
