From: Mike FABIAN
To: Eli Zaretskii
Cc: rpluim@gmail.com, 39799@debbugs.gnu.org
Subject: bug#39799: 28.0.50; Most emoji sequences don’t render correctly
Date: Sat, 29 Feb 2020 12:14:28 +0100
Organization: Red Hat

Eli Zaretskii wrote:

>> From: Mike FABIAN
>> Cc: rpluim@gmail.com, 39799@debbugs.gnu.org
>> Date: Sat, 29 Feb 2020 08:59:49 +0100
>>
>> Eli Zaretskii wrote:
>>
>> > If Gedit selects a font by looking at more than one codepoint (and I'm
>> > not sure this is how it works in Gedit), then Emacs doesn't work that
>> > way.
>>
>> Yes, Gedit does this somehow with pango.  It tries to avoid switching
>> fonts in places where it would look bad.
>> For example, if you have a default font supporting only ASCII and
>> then there is a word containing some non-ASCII character like “grün”
>> it chooses a font containing the “ü” for the whole word to avoid the
>> “ü” looking out of place.
>
> Well, "somehow" is not enough to see whether we have any additional
> work to do in Emacs, because Emacs also tries to achieve that same
> goal.  There are many different ways to achieve it, though; for
> example, Emacs will AFAIK by default not even use a font that could
> support ASCII, but not Latin-1 blocks as the default face's font.
>
> What you say about Gedit makes sense in general, but questions
> immediately pop up: how does Gedit define a "word" (Emacs, as you
> know, has a very flexible definition that can be controlled from
> Lisp), how does it "know" that a word like "grün" belongs to the same
> script (otherwise displaying a character from another script using a
> different font, as in, say, "grאn" might make sense), etc.

Yes, “word” is already too simplified.

> IOW, what we need is a detailed description of what Pango does here,
> and how does Gedit affect that by configuring its default fonts.  Only
> then we can reason about the differences between that and what Emacs
> does.

Yes, you are right, and I think this is very difficult.

I don’t know the details, but Pango seems to “cut” text into “runs”
where each “run” is rendered with a single font.  And it tries to cut
the text into “runs” in a way that the overall result looks as nice as
possible.  This is really difficult and doesn’t always work well;
sometimes the results are ugly, although overall it seems to do a good
job.

>> > In any case, are these sequences displayed as composed characters?
>> > Does "C-u C-x =" tell that the base character U+24C2 was composed with
>> > the following variation selector?  According to the setup in
>> > japanese.el, they should compose, if the font used for U+24C2 also
>> > supports the variation selectors.
>>
>> Yes, it does tell that it was composed with the following character:
>
> And the resulting display is what you expect?  If not, then I think
> you need to find a font which supports Emoji presentation of
> characters such as Ⓜ, and make Emacs use it for those sequences.

Yes, in the case of Ⓜ️ U+24C2 U+FE0F the result in Emacs is perfect when
using “Noto Color Emoji” or “Joypixels”.  It is displayed in colour and
behaves as a single character in the buffer; the variation selector is
not displayed as a box.  This is perfect.

But when using Symbola for the same sequence one sees U+FE0F as an ugly
box.

And when displaying the text representation sequence Ⓜ︎ U+24C2 U+FE0E
one always sees U+FE0E as a box no matter whether using “Symbola”,
“Noto Color Emoji” or “Joypixels”.

I am not sure whether this is wrong.  Maybe it is OK to require
a font which can handle this?  I am really not sure...

But what about # U+0023 NUMBER SIGN?

This does have an emoji representation, i.e. U+0023 U+FE0F displays in
color as an emoji in pango-view and gedit.

How could this ever work in Emacs?  If you have to decide on a single
font to render U+0023 in Emacs, you would need to set a “capable” emoji
font for an ASCII character like #.  One probably does not want to do
that.  Then # in text representation would look different in style from
the other ASCII characters because it would come as the text
representation glyph from some emoji font, which would probably not go
well together with other ASCII characters coming from some font like,
for example, “DejaVu Sans Mono”.
So one probably wants to set something like “DejaVu Sans Mono” for # as
well, otherwise normal text won’t look nice.  But how can one display
U+0023 U+FE0F as an emoji then?

This seems very messy, I don’t know how this can be solved.

> If you think this Emacs requirement for a capable font is incorrect, I
> suggest to post a question about this to the HarfBuzz mailing list,
> harfbuzz@lists.freedesktop.org, maybe HarfBuzz has capabilities in
> this regard that we somehow don't yet utilize.

Yes, I’ll try that, maybe that helps to understand it better.

-- 
Mike FABIAN
Lack of sleep is the enemy of good work.
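
To make the U+0023 question above more concrete, here is a minimal
sketch in Python of choosing a font per cluster instead of per
codepoint.  It is only an illustration under stated assumptions: it is
not what Pango, HarfBuzz, or Emacs actually do.  The function name
split_into_runs and the rule "base character plus U+FE0F goes to the
emoji font, everything else stays in the text font" are hypothetical;
the font names are just the ones mentioned in this thread.

```python
# Hypothetical sketch only: NOT the algorithm used by Pango or Emacs.
VS15 = "\uFE0E"  # VARIATION SELECTOR-15, requests text presentation
VS16 = "\uFE0F"  # VARIATION SELECTOR-16, requests emoji presentation


def split_into_runs(text, text_font="DejaVu Sans Mono",
                    emoji_font="Noto Color Emoji"):
    """Return a list of (font, substring) runs.

    A character followed by VS16 is kept together with its selector and
    assigned to emoji_font.  Everything else, including a character
    followed by VS15, is assigned to text_font.  Adjacent clusters that
    end up with the same font are merged into one run.
    """
    runs = []
    i = 0
    while i < len(text):
        cluster = text[i]
        font = text_font
        # Attach a directly following variation selector to the base.
        if i + 1 < len(text) and text[i + 1] in (VS15, VS16):
            cluster += text[i + 1]
            if text[i + 1] == VS16:
                font = emoji_font
        i += len(cluster)
        # Merge with the previous run when the font does not change.
        if runs and runs[-1][0] == font:
            runs[-1] = (font, runs[-1][1] + cluster)
        else:
            runs.append((font, cluster))
    return runs
```

With such a rule, a bare # would stay in the same run and font as the
surrounding ASCII text, and only the sequence U+0023 U+FE0F would be
handed to the emoji font.  Whether the chosen text font can then render
U+FE0E without showing a box is a separate problem that this sketch
does not address.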