From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: Better emoji support
Date: Sun, 19 Sep 2021 21:29:44 +0300
Message-ID: <83tuig2zc7.fsf@gnu.org>
To: Robert Pluim
Cc: lists@traduction-libre.org, emacs-devel@gnu.org, juri@linkov.net
In-Reply-To: <87r1dkl9m9.fsf@gmail.com> (message from Robert Pluim on Sun, 19 Sep 2021 20:10:22 +0200)

> From: Robert Pluim
> Cc: Juri Linkov, lists@traduction-libre.org,
>  emacs-devel@gnu.org
> Date: Sun, 19 Sep 2021 20:10:22 +0200
>
> Eli> Hmm... Robert, I see quite a few characters that now belong to the
> Eli> emoji script, which shouldn't be there, AFAIU.  The above is one of
> Eli> them (AFAIK, the Arrows block doesn't belong to Emoji).  But there are
> Eli> more stark cases, for example:
>
> The whole block might not, but some of the codepoints do:
>
> 2194..2199    ; Emoji                # E0.6   [6] (↔️..↙️)    left-right arrow..down-left arrow

Only if followed by a variation selector VS-16, right?

> Eli> (aref char-script-table ?#) => emoji
> Eli> (aref char-script-table ?0) => emoji
>
> I donʼt see that here (and itʼs definitely not the
> intention). Blocks.awk skips any ASCII codepoints (and those both
> evaluate to "latin" here). Could you double-check your
> lisp/international/charscript.el?

I see them there:

    (#x0023 #x0023 emoji) ; Autogenerated emoji
    (#x002A #x002A emoji) ; Autogenerated emoji
    (#x0030 #x0039 emoji) ; Autogenerated emoji
    (#x00A9 #x00A9 emoji) ; Autogenerated emoji
    (#x00AE #x00AE emoji) ; Autogenerated emoji

Which corresponds to these lines in emoji-data.txt:

    0023          ; Emoji                # E0.0   [1] (#️)       hash sign
    002A          ; Emoji                # E0.0   [1] (*️)       asterisk
    0030..0039    ; Emoji                # E0.0  [10] (0️..9️)    digit zero..digit nine
    00A9          ; Emoji                # E0.6   [1] (©️)       copyright
    00AE          ; Emoji                # E0.6   [1] (®️)       registered

> Eli> It seems like these characters ended up in the emoji script because
> Eli> they should render as emoji when followed by variation selectors?  But
> Eli> in that case, the place to do this is in composition-function-table,
> Eli> if we can, and if we cannot, let's for now decide we don't support
> Eli> these sequences, because the cure sounds worse than the disease with
> Eli> our current infrastructure.
>
> Eli> Am I missing something?
>
> Are you now saying that we only want to add to the emoji script those
> characters with Emoji_Presentation=Yes?

Yes, I think so.  Are there any downsides to that?
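
For illustration only, here is a minimal sketch of what the proposed restriction would mean for the emoji-data.txt lines quoted in this thread. It is written in Python and is not the blocks.awk code that generates charscript.el; the helper name ranges_with_property and the SAMPLE string are hypothetical, and the sample holds only the lines quoted above, not the full data file.

```python
import re

# The emoji-data.txt lines quoted in this thread (not the full file).
SAMPLE = """\
0023          ; Emoji                # E0.0   [1] (#)       hash sign
002A          ; Emoji                # E0.0   [1] (*)       asterisk
0030..0039    ; Emoji                # E0.0  [10] (0..9)    digit zero..digit nine
00A9          ; Emoji                # E0.6   [1] (©)       copyright
00AE          ; Emoji                # E0.6   [1] (®)       registered
2194..2199    ; Emoji                # E0.6   [6] (↔..↙)    left-right arrow..down-left arrow
"""

# A data line is "START[..END] ; Property # comment"; anything else is skipped.
LINE_RE = re.compile(r"^([0-9A-Fa-f]+)(?:\.\.([0-9A-Fa-f]+))?\s*;\s*(\w+)")


def ranges_with_property(text, prop):
    """Return (start, end) code point ranges whose property field equals prop."""
    result = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # blank line or comment-only line
        start = int(m.group(1), 16)
        end = int(m.group(2), 16) if m.group(2) else start
        if m.group(3) == prop:
            result.append((start, end))
    return result


# Ranges selected when every "Emoji" line is taken, as in the entries
# seen in charscript.el above.
emoji = ranges_with_property(SAMPLE, "Emoji")

# Ranges selected when only "Emoji_Presentation" lines are taken.
presentation = ranges_with_property(SAMPLE, "Emoji_Presentation")

for start, end in emoji:
    print(f"Emoji: U+{start:04X}..U+{end:04X}")
print("Emoji_Presentation ranges in the sample:", presentation)
```

With this sample, selecting on Emoji picks up the ASCII digits, '#', '*', the copyright and registered signs and the arrows, while selecting on Emoji_Presentation picks up none of them, since every quoted line carries only the Emoji property.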