From mboxrd@z Thu Jan 1 00:00:00 1970
From: Robert Pluim
Newsgroups: gmane.emacs.devel
Subject: Re: Better emoji support
Date: Sun, 19 Sep 2021 21:13:06 +0200
Message-ID: <87a6k8l6pp.fsf@gmail.com>
References: <834kd2cypw.fsf@gnu.org> <87zguuttbm.fsf@gmail.com>
 <8335smcxx6.fsf@gnu.org> <87v95itsc4.fsf@gmail.com>
 <831r86cxdy.fsf@gnu.org> <83a6kgejp0.fsf@gnu.org>
 <87wnnkpjj9.fsf@gmail.com>
 <3E0155F6-D681-4443-A1D9-472D1836168D@traduction-libre.org>
 <87bl4rnyoe.fsf@gmail.com> <87tuigmyez.fsf@mail.linkov.net>
 <87bl4ozds2.fsf@gmail.com> <87mto8l95x.fsf@gmail.com>
In-Reply-To: <87mto8l95x.fsf@gmail.com> (Robert Pluim's message of
 "Sun, 19 Sep 2021 20:20:10 +0200")
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
To: Kévin Le Gouguec
Cc: Jean-Christophe Helary, Eli Zaretskii, Juri Linkov, emacs-devel@gnu.org
List-Id: "Emacs development discussions."
Xref: news.gmane.io gmane.emacs.devel:275079

--=-=-=
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

>>>>> On Sun, 19 Sep 2021 20:20:10 +0200, Robert Pluim said:

>>>>> On Sun, 19 Sep 2021 19:16:45 +0200, Kévin Le Gouguec said:

    Kévin> Juri Linkov writes:
    >>> Thanks, this is a nice change.  I have a problem only with one character.
    >>> Displaying NORTH EAST ARROW ↗ with emoji font is inappropriate to me.
    >>>
    >>> If adding this character to the emoji script was not a mistake,
    >>> is it possible to customize this to display such characters
    >>> as a symbol like before?

    Kévin> admin/unidata/emoji-data.txt contains a whole section where "omitted
    Kévin> code points have Emoji_Presentation=No", and AFAICT NORTH EAST ARROW
    Kévin> is one of those omitted code points.

    Robert> True. But it has the 'Emoji' property.

    Kévin> IIUC, the Unicode standard means for those omitted characters to display
    Kévin> as "text" rather than as "emoji", unless they are followed by a
    Kévin> variation selector[1]; maybe some adjustments in fontset.el are in
    Kévin> order?

Eli also pointed out that this should depend on the presence of the
variation selector. Iʼll add that to the work Iʼm doing on emoji ZWNJ
sequences anyway (although given that the selector and the codepoints
will likely be using different fonts, Iʼm not sure we'll be able to
achieve this)

    Robert> As "text presentation" rather than "emoji presentation", not as
    Robert> "text". If we want to follow the Emoji_Presentation property, thatʼs
    Robert> easy enough to arrange.

As per the attached. If itʼs all OK Iʼll push tomorrow.

Robert
-- 

--=-=-=
Content-Type: text/x-diff
Content-Disposition: attachment;
 filename=0001-Base-emoji-script-membership-on-Emoji_Presentation.patch

>From 3f7f622346f1c14bd5aa7e373e5f634a71b318ad Mon Sep 17 00:00:00 2001
From: Robert Pluim
Date: Sun, 19 Sep 2021 21:07:36 +0200
Subject: [PATCH] Base emoji script membership on Emoji_Presentation
To: emacs-devel@gnu.org

The Emoji property describes which codepoints can be displayed as
emoji, but Emoji_Presentation governs which are displayed as emoji by
default.

* admin/notes/unicode: Adjust check-emoji-coverage to look in the
Emoji_Presentation sections of emoji-data.txt
* admin/unidata/blocks.awk: Assign emoji script using the
Emoji_Presentation section.
---
 admin/notes/unicode      | 2 +-
 admin/unidata/blocks.awk | 7 +------
 2 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/admin/notes/unicode b/admin/notes/unicode
index 9dc6f3bdca..0b2ce52794 100644
--- a/admin/notes/unicode
+++ b/admin/notes/unicode
@@ -100,7 +100,7 @@ FONT-NAME-REGEXP is checked using `string-match'."
       (save-excursion
         (goto-char (point-min))
         (let (res char name ifont)
-          (while (re-search-forward "; Emoji [^(]+(\\(.\\)[).\uFE0F]" nil t)
+          (while (re-search-forward "; Emoji_Presentation [^(]+(\\(.\\)[).]" nil t)
             (setq char (aref (match-string 1) 0))
             (setq ifont (car (internal-char-font nil char)))
             (when ifont
diff --git a/admin/unidata/blocks.awk b/admin/unidata/blocks.awk
index 6e52b52f67..29022bf7dd 100755
--- a/admin/unidata/blocks.awk
+++ b/admin/unidata/blocks.awk
@@ -202,12 +202,7 @@ FILENAME ~ "Blocks.txt" && /^[0-9A-F]/ {
     }
 }
 
-# The space after 'Emoji' is significant in the next two rules.
-# This purposely and deliberately excludes codepoints <= 00FF
-FILENAME ~ "emoji-data.txt" && /^00[0-9A-F][0-9A-F].*; Emoji / {
-    next
-}
-FILENAME ~ "emoji-data.txt" && /^[0-9A-F].*; Emoji / {
+FILENAME ~ "emoji-data.txt" && /^[0-9A-F].*; Emoji_Presentation / {
     sep = index($1, "..")
     len = length($1)
     if (sep > 0) {
-- 
2.33.0.363.g4c719308ce

--=-=-=--
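
The effect of switching the awk rule from the Emoji property to Emoji_Presentation can be illustrated with a small standalone sketch. It is written in Python rather than awk so that it runs without the Emacs tree, and it is not the code from the patch. The sample lines only imitate the layout of emoji-data.txt and are not a verbatim excerpt, and the names `SAMPLE`, `LINE_RE`, `ranges_with_property` and `in_ranges` are hypothetical helpers introduced for this example.

```python
import re

# Illustrative lines in the layout of emoji-data.txt; NOT a verbatim excerpt.
SAMPLE = """\
2194..2199    ; Emoji                # left-right arrow..down-left arrow
231A..231B    ; Emoji_Presentation   # watch..hourglass done
1F600         ; Emoji_Presentation   # grinning face
"""

# A hex code point or range, then "; <property> ".  The trailing space plays
# the same role as in the awk patterns: "Emoji " must not match the longer
# property name "Emoji_Presentation ".
LINE_RE = re.compile(r"^([0-9A-F]+)(?:\.\.([0-9A-F]+))?\s*; (\w+) ")


def ranges_with_property(text, prop):
    """Return (first, last) code point pairs whose property equals PROP."""
    found = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m and m.group(3) == prop:
            first = int(m.group(1), 16)
            # A single code point is treated as a one-element range.
            last = int(m.group(2) or m.group(1), 16)
            found.append((first, last))
    return found


def in_ranges(codepoint, ranges):
    """Return True if CODEPOINT falls inside any (first, last) pair."""
    return any(first <= codepoint <= last for first, last in ranges)


emoji = ranges_with_property(SAMPLE, "Emoji")
presentation = ranges_with_property(SAMPLE, "Emoji_Presentation")

# U+2197 NORTH EAST ARROW: in the Emoji ranges, not in Emoji_Presentation.
print(in_ranges(0x2197, emoji), in_ranges(0x2197, presentation))
```

With this sample data, NORTH EAST ARROW is selected by the old "Emoji" match but not by the new "Emoji_Presentation" match, which is the behaviour Juri asked for. Whether the real emoji-data.txt lists it exactly this way should be checked against the file in admin/unidata.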