From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#31315: wrong font encoding for fallback font
Date: Wed, 02 May 2018 18:22:50 +0300
Message-ID: <83wowmozed.fsf@gnu.org>
References: <83a7tksp6b.fsf@gnu.org> <20180501.083644.1367158406383319333.wl@gnu.org> <83muxjqu2e.fsf@gnu.org> <20180501.213014.1436609899151985328.wl@gnu.org>
Reply-To: Eli Zaretskii
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Cc: 31315@debbugs.gnu.org
To: Werner LEMBERG
In-reply-to: <20180501.213014.1436609899151985328.wl@gnu.org> (message from Werner LEMBERG on Tue, 01 May 2018 21:30:14 +0200 (CEST))
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Xref: news.gmane.org gmane.emacs.bugs:145933

> Date: Tue, 01 May 2018 21:30:14 +0200 (CEST)
> Cc: handa@gnu.org, 31315@debbugs.gnu.org
> From: Werner LEMBERG
>
> > I think we have a terminology problem here, most probably my fault.
> > What exactly do you mean when you say "font backend" in this
> > context?  And what is "the client" in this case?
>
> OK, sorry.  I mean the X11 font backend.  Here's my global picture.
>
>          gb18030          unicode
>   Emacs -----------> xft ------------> DroidSansFallback.ttf
>
> For me, Emacs is a client of the xft font interface.  In our
> particular case, xft provides `DroidSansFallback.ttf' to Emacs as a
> font encoded in GB18030 – Emacs obviously has requested a font in this
> encoding.  Behind the scenes, however, xft communicates with the
> `DroidSansFallback.ttf' font using Unicode (the font has no other
> cmap).

If by "xft" you mean the part of the X libraries that supports the
APIs used by xfont.c, then I think we are on the same page now.

> > If you received a GB18030 encoded email, it is expected that Emacs
> > will try to find a font that explicitly supports GB18030.
> >
> > This is a feature that AFAIU is very important to CJK users: they
> > expect Emacs to select a font that declares support for the
> > character's charset as set by the decoding machinery.
>
> While this is correct for other CJK encodings like GB, JIS, KSC, or
> Big5, it is *not* true for GB18030.  This is *only* an encoding and
> *not* a charset!  It is simply another representation of Unicode,
> comparable to UTF-8 or UCS4.
> There doesn't exist a single font
> natively encoded in GB18030!  This encoding only exists to be
> code-wise backward compatible with GB 2312.

Maybe so, but GB18030 is a Chinese encoding, and as such it behaves in
Emacs like all the other Chinese encodings.  Emacs employs that logic
for every charset it has defined, including Latin-2, for example: if
text was decoded from an encoding which supports a particular charset,
Emacs puts the corresponding 'charset' text property on the decoded
text, and the machinery which selects the appropriate font tries first
to find a font which supports that charset.  The idea is that users in
a particular culture have certain distinct preferences wrt fonts, and
that an encoding that supports a certain charset or culture provides a
hint about those preferences.  This idea is central to how Emacs
selects fonts.

> To a certain extent it is valid to assume that a user of GB18030
> expects Chinese glyph representation forms for characters in the CJK
> range.  However, since full Unicode is supported, this assumption is
> rather weak.

Weak or not, Emacs tries to heed it.

> >> I don't have any fontsets customized in my `.emacs' file.
> >
> > Well, it sounds like you should.  Emacs chooses fonts using
> > techniques that prefer speed to accuracy, and if that gives
> > suboptimal results, the way to improve them is to guide Emacs by
> > tailoring your fontset to the fonts you have installed and to the
> > visual appearance you happen to like.
>
> For the purpose of reporting this bug I thought it would be best to
> not use further deviations of `emacs -Q'...

My comment was not in the context of the bug report (where your
assumption is absolutely correct); it was rather a response to your
broader complaint regarding an ugly font that creeps into the display
of text which was encoded in GB18030.  You can tell Emacs to use other
fonts for that charset by customizing your fontset.

> >> Both.
> >> If I open a new file Unicode encoded file, Emacs continues
> >> to use GB18030.2000 as the charset registry/encoding for displaying
> >> fallback characters, failing to convert Unicode to GB18030 before
> >> accessing the characters from the font backend.
> >
> > The former part is not a bug at all.
>
> I agree.  I only wanted to tell you what I observe.

Well, you called that a "problem".  I understand that we now agree the
first part is not a problem in itself.
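
For reference, the claim quoted earlier in this message, that GB18030
is a representation of all of Unicode while staying code-wise
compatible with GB 2312, can be checked outside Emacs.  The following
is only a sketch using the `gb18030' and `gb2312' codecs that ship
with Python; it says nothing about how Emacs or xft handle fonts, and
the helper name `roundtrips' and the sample characters are
illustrative assumptions, not anything from this bug report.

```python
def roundtrips(text: str, codec: str) -> bool:
    """Return True if TEXT survives an encode/decode cycle in CODEC."""
    try:
        return text.encode(codec).decode(codec) == text
    except UnicodeEncodeError:
        # The codec has no representation for some character in TEXT.
        return False


# Illustrative samples: a common Han character, a Hebrew letter,
# and a character outside the Basic Multilingual Plane.
samples = ["中", "א", "😀"]

for ch in samples:
    print(
        f"U+{ord(ch):04X}",
        "gb18030:", roundtrips(ch, "gb18030"),
        "gb2312:", roundtrips(ch, "gb2312"),
    )

# Characters that GB 2312 covers keep the same bytes in GB18030.
print("中".encode("gb2312") == "中".encode("gb18030"))
```

Under these assumptions, every sample round-trips through GB18030,
while GB 2312 rejects the characters outside its repertoire.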