From mboxrd@z Thu Jan 1 00:00:00 1970
From: Werner LEMBERG
Newsgroups: gmane.emacs.bugs
Subject: bug#31315: wrong font encoding for fallback font
Date: Tue, 01 May 2018 08:36:44 +0200 (CEST)
Message-ID: <20180501.083644.1367158406383319333.wl@gnu.org>
References: <20180430.092106.1639809980149388597.wl@gnu.org> <83a7tksp6b.fsf@gnu.org>
In-Reply-To: <83a7tksp6b.fsf@gnu.org>
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
To: eliz@gnu.org
Cc: 31315@debbugs.gnu.org
X-Mailer: Mew version 6.7 on Emacs 27.0.50 / Mule 6.0 (HANACHIRUSATO)
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"

> And I think you might be mistaken in your interpretation of what
> "gb18030.2000" in the font name means: I think it's the font registry,
> not its encoding.

Yes, but the font registry implies the encoding used to access the
font.

> How sure are you that the encoding of this font is indeed
> gb18030.2000?

Quite sure.  To be more precise: The real encoding of the font is
irrelevant (the Droid Sans Fallback font is a standard TrueType font
that has only a Unicode cmap); what matters is how the font backend
provides the font to the client.  Calling `xlsfonts' I see that X11
offers access as follows.

  -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-cns11643-1
  -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-cns11643-2
  -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-cns11643-3
  -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-gb18030.2000-0
  -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-gb2312.1980-0
  -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-iso10646-1
  -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-jisx0201.1976-0
  -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-jisx0208.1983-0
  -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-jisx0208.1990-0

>> The problem now is that the encoding of the fallback font is not
>> respected.  In the image, the highlighted character is U+83EF, but
>> Emacs incorrectly displays U+51BF instead.
>>
>> The GB 18030 bytes to represent U+51BF are \x83\xEF; this clearly
>> shows that Emacs lacks an iconv call (or an equivalent to that);
>> instead, it seems to simply feed the Unicode value to the font
>> backend.
>
> Tz-tz-tz, how can you even suggest something like that about Emacs ;-)
>
> If you look in xfont_encode_char, you will see that it does encode
> the character before handing it to the font-drawing function.  But I
> see that font-encoding-alist has this to say about gb18030:
>
>   ("gb18030" unicode)
>
> Does replacing that with something like this:
>
>   ("gb18030" (gb18030 . unicode))
>
> solve the problem?

Yes, it seems so.

> What we put in font-encoding-alist now was a deliberate change in
> Jan 2008, in response to a bug report; see
>
>   http://lists.gnu.org/archive/html/emacs-devel/2008-01/msg00754.html
>
> If fonts like this one need to have characters encoded by gb18030,
> then I think we need to change what the value says.

As can be seen above, the font itself doesn't need GB18030.  It's the
font backend that provides this encoding, and Emacs accesses it.
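
The mismatch described above can be reproduced outside Emacs.  The
following is only a minimal sketch, not Emacs code: it assumes that a
font offered as gb18030.2000-0 is indexed by the two-byte GB 18030
code, and it uses Python's standard "gb18030" codec to stand in for
the font backend.  The helper names are hypothetical.

```python
def index_sent(ch: str) -> bytes:
    """Index used when the raw Unicode code point is passed through
    unchanged (the behaviour described in the report)."""
    return ord(ch).to_bytes(2, "big")


def index_expected(ch: str) -> bytes:
    """Index a GB 18030 encoded font expects for the same character."""
    return ch.encode("gb18030")


def char_shown(index: bytes) -> str:
    """Character that a GB 18030 indexed font holds at this index."""
    return index.decode("gb18030")


wanted = "\u83ef"            # the highlighted character from the report
sent = index_sent(wanted)    # bytes 0x83 0xEF, taken from the code point
shown = char_shown(sent)     # what sits at that index in GB 18030

print(f"wanted U+{ord(wanted):04X}, index sent {sent.hex()}, "
      f"character at that index U+{ord(shown):04X}")
print(f"index the font expects: {index_expected(wanted).hex()}")
```

If the report is right, the first line printed names U+51BF as the
character found at the index that was sent, while the second line
shows the different index that should have been used for U+83EF.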

> But this area in Emacs is under-documented, so I'm not sure I've
> got it right, in particular what is the effect of ENCODING and
> REPERTORY in this context.  For most font back-ends, ENCODING is
> ignored, because the back-end is capable to encode the character we
> hand to it.  But the xfont back-end indeed uses Emacs's encoding
> functions to do that externally to the corresponding X APIs.  Which
> might explain why this problem, if indeed we fail to specify the
> correct encoding for this charset, was never reported till now:
> xfont is rarely if ever used.

Emacs doesn't fail to specify the correct encoding.  The problem is
that it feeds the font backend with characters in the wrong encoding
(namely Unicode instead of GB 18030).

>> It's a completely different question why on my system Emacs uses a
>> font encoded in GB 18030 as a fallback font.  It's probably related
>> to the fact that I use `mew' as my e-mail program, manually
>> extended to cover GB 18030.  Unfortunately, I wasn't able yet to
>> trigger the issue with `emacs -Q' (which by default uses iso10646
>> for the fallback font).
>
> Well, we cannot try helping you to unlock this unless you tell how
> you "manually extended" Emacs.

Oh, I haven't extended Emacs, sorry for the bad wording.  I've simply
added a line to mew's elisp code to make it recognize GB18030 in
e-mails.

> In general, the way to request that Emacs uses fonts you like with
> certain characters or charsets is by customizing your fontsets.  I
> cannot say more without hearing the details.

I don't have any fontsets customized in my `.emacs' file.

>> On the other hand, as soon as the problem happens, it happens with
>> any buffer containing CJK characters not displayable with the
>> current font, so it seems a genuine Emacs core bug.
>
> What "problem" do you allude to here?  The first (seemingly
> incorrect encoding) or the second (fallback to this particular
> font)?

Both.
If I open a new Unicode-encoded file, Emacs continues to use
GB18030.2000 as the charset registry/encoding for displaying fallback
characters, failing to convert Unicode to GB18030 before accessing
the characters from the font backend.


    Werner