From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#31315: wrong font encoding for fallback font
Date: Tue, 01 May 2018 18:22:49 +0300
Message-ID: <83muxjqu2e.fsf@gnu.org>
References: <20180430.092106.1639809980149388597.wl@gnu.org> <83a7tksp6b.fsf@gnu.org> <20180501.083644.1367158406383319333.wl@gnu.org>
In-reply-to: <20180501.083644.1367158406383319333.wl@gnu.org> (message from Werner LEMBERG on Tue, 01 May 2018 08:36:44 +0200 (CEST))
To: Werner LEMBERG
Cc: 31315@debbugs.gnu.org
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"

> Date: Tue, 01 May 2018 08:36:44 +0200 (CEST)
> Cc: handa@gnu.org, 31315@debbugs.gnu.org
> From: Werner LEMBERG
>
> > And I think you might be mistaken in your interpretation of what
> > "gb18030.2000" in the font name means: I think it's the font registry,
> > not its encoding.
>
> Yes, but the font registry implies the used encoding to access the
> font.

Having said that, you seem to contradict yourself right away:

> The real encoding of the font is irrelevant (the Droid Sans Fallback
> font is a standard TrueType font that has only a Unicode cmap);

So I still think we may be miscommunicating.

> what matters is how the font backend provides the font to the
> client. Calling `xlsfonts' I see that X11 offers access as follows.
>
>   -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-cns11643-1
>   -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-cns11643-2
>   -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-cns11643-3
>   -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-gb18030.2000-0
>   -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-gb2312.1980-0
>   -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-iso10646-1
>   -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-jisx0201.1976-0
>   -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-jisx0208.1983-0
>   -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-jisx0208.1990-0

I think we have a terminology problem here, most probably my fault.
What exactly do you mean when you say "font backend" in this context?
And what is "the client" in this case?
I'm afraid using xlsfonts doesn't help me understand what I am
missing, because I have only a vague idea of what that command does,
beyond the basic fact that it lists fonts.

> > What we put in font-encoding-alist now was a deliberate change in
> > Jan 2008, in response to a bug report; see
> >
> >   http://lists.gnu.org/archive/html/emacs-devel/2008-01/msg00754.html
> >
> > If fonts like this one need to have characters encoded by gb18030,
> > then I think we need to change what the value says.
>
> As can be seen above, the font itself doesn't need GB18030. It's the
> font backend that provides this encoding, and Emacs accesses it.

In my terminology, "font backend" is in Emacs (xfont.c, xftfont.c,
etc.), and the encoding happens in the backend, guided by
font-encoding-alist, among other things. And your OP vs the
experiment with changing font-encoding-alist clearly shows that
encoding characters correctly for the xfont backend _is_ required to
display the correct glyphs with fonts handled by that backend.

> > But this area in Emacs is under-documented, so I'm not sure I've
> > got it right, in particular what is the effect of ENCODING and
> > REPERTORY in this context. For most font back-ends, ENCODING is
> > ignored, because the back-end is capable to encode the character we
> > hand to it. But the xfont back-end indeed uses Emacs's encoding
> > functions to do that externally to the corresponding X APIs. Which
> > might explain why this problem, if indeed we fail to specify the
> > correct encoding for this charset, was never reported till now:
> > xfont is rarely if ever used.
>
> Emacs doesn't fail to specify the correct encoding. The problem is
> that it feeds the font backend with characters in the wrong encoding
> (namely Unicode instead of GB 18030).

"Fails to specify the correct encoding" is the reason why it uses the
wrong encoding for the characters in the font backend xfont.c. I
believe this is again a terminology problem.
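
To make the mismatch concrete, here is a minimal sketch in Python
(not Emacs code). It assumes Python's "gb18030" codec as a stand-in
for the encoding step that the xfont backend performs, and it uses
one arbitrarily chosen CJK character. It only shows that the code a
font offered as iso10646-1 expects differs from the code the same
character has in GB18030, so handing the Unicode code point to a font
opened with the gb18030.2000-0 registry would select the wrong glyph.

```python
# Illustration only: Python's codecs stand in for Emacs's charset
# encoding; nothing here reflects the actual xfont.c implementation.
ch = "中"  # U+4E2D, an arbitrary CJK character chosen for the example

# Code expected by a font accessed through the iso10646-1 registry.
unicode_code = ord(ch)

# Bytes (and the equivalent 16-bit code) expected by a font accessed
# through the gb18030.2000-0 registry.
gb_bytes = ch.encode("gb18030")
gb_code = int.from_bytes(gb_bytes, "big")

print(f"Unicode code point: {unicode_code:#06x}")
print(f"GB18030 code:       {gb_code:#06x}")
```

The two printed values differ, which is the situation described above:
the character has to be encoded for the registry under which the font
was opened before it is used as a glyph code.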
> >> It's a completely different question why on my system Emacs uses a
> >> font encoded in GB 18030 as a fallback font. It's probably related
> >> to the fact that I use `mew' as my e-mail program, manually
> >> extended to cover GB 18030. Unfortunately, I wasn't able yet to
> >> trigger the issue with `emacs -Q' (which by default uses iso10646
> >> for the fallback font).
> >
> > Well, we cannot try helping you to unlock this unless you tell how
> > you "manually extended" Emacs.
>
> Oh, I haven't extended Emacs, sorry for the bad wording. I've simply
> added a line to mew's elisp code to make it recognize GB18030 in
> e-mails.

If you received a GB18030 encoded email, it is expected that Emacs
will try to find a font that explicitly supports GB18030. This is a
feature that AFAIU is very important to CJK users: they expect Emacs
to select a font that declares support for the character's charset as
set by the decoding machinery.

> > In general, the way to request that Emacs uses fonts you like with
> > certain characters or charsets is by customizing your fontsets. I
> > cannot say more without hearing the details.
>
> I don't have any fontsets customized in my `.emacs' file.

Well, it sounds like you should. Emacs chooses fonts using techniques
that prefer speed to accuracy, and if that gives suboptimal results,
the way to improve them is to guide Emacs by tailoring your fontset to
the fonts you have installed and to the visual appearance you happen
to like.

> >> On the other hand, as soon as the problem happens, it happens with
> >> any buffer containing CJK characters not displayable with the
> >> current font, so it seems a genuine Emacs core bug.
> >
> > What "problem" do you allude to here? The first (seemingly
> > incorrect encoding) or the second (fallback to this particular
> > font)?
>
> Both. If I open a new file Unicode encoded file, Emacs continues to
> use GB18030.2000 as the charset registry/encoding for displaying
> fallback characters, failing to convert Unicode to GB18030 before
> accessing the characters from the font backend.

The former part is not a bug at all. When Emacs needs to display a
character that is not supported by the frame's default font, it first
tries all the fonts it already has loaded, before it searches the rest
of the fonts on your system. So once the GB18030.2000 font is loaded,
Emacs will use it for any character not supported by other loaded
fonts. Or did I miss something?
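
The lookup order described above can be sketched as a toy model in
Python. This is not how Emacs implements it: the function pick_font,
the dictionaries, and the font names are all hypothetical and exist
only to show why a font that was loaded earlier keeps being chosen
for later buffers.

```python
# Toy model of "try loaded fonts first, then the rest of the system".
# All names and data structures here are hypothetical illustrations.
def pick_font(ch, loaded_fonts, installed_fonts):
    """Return the name of the first font covering ch, preferring loaded fonts."""
    for font in loaded_fonts:
        if ch in font["covers"]:
            return font["name"]
    for font in installed_fonts:
        if ch in font["covers"]:
            loaded_fonts.append(font)  # once used, the font counts as loaded
            return font["name"]
    return None


default_font = {"name": "default (no CJK)", "covers": set("abc")}
gb_font = {"name": "droid sans fallback gb18030.2000-0", "covers": set("中文")}
uni_font = {"name": "droid sans fallback iso10646-1", "covers": set("中文")}

# The GB18030 font was loaded earlier, e.g. while showing a GB18030 e-mail.
loaded = [default_font, gb_font]
installed = [uni_font]

# A later Unicode buffer needs the same character: the loaded font wins.
choice = pick_font("中", loaded, installed)
print(choice)
```

In this model the iso10646-1 variant is never consulted, because the
already loaded font covers the character.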