unofficial mirror of bug-gnu-emacs@gnu.org 
From: Werner LEMBERG <wl@gnu.org>
To: eliz@gnu.org
Cc: 31315@debbugs.gnu.org
Subject: bug#31315: wrong font encoding for fallback font
Date: Tue, 01 May 2018 08:36:44 +0200 (CEST)
Message-ID: <20180501.083644.1367158406383319333.wl@gnu.org>
In-Reply-To: <83a7tksp6b.fsf@gnu.org>


> And I think you might be mistaken in your interpretation of what
> "gb18030.2000" in the font name means: I think it's the font registry,
> not its encoding.

Yes, but the font registry implies the used encoding to access the
font.

> How sure are you that the encoding of this font is indeed
> gb18030.2000?

Quite sure.  To be more precise: The real encoding of the font is
irrelevant (the Droid Sans Fallback font is a standard TrueType font
that has only a Unicode cmap); what matters is how the font backend
provides the font to the client.  Calling `xlsfonts' I see that X11
offers access as follows.

  -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-cns11643-1
  -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-cns11643-2
  -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-cns11643-3
  -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-gb18030.2000-0
  -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-gb2312.1980-0
  -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-iso10646-1
  -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-jisx0201.1976-0
  -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-jisx0208.1983-0
  -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-jisx0208.1990-0
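
The last two fields of each name above are the charset registry and
the charset encoding.  A minimal Python sketch of extracting them is
shown here; the function name `xlfd_charset' is hypothetical, and the
sketch assumes a fully specified 14-field XLFD name without wildcards.

```python
def xlfd_charset(name):
    """Return (registry, encoding) from a fully specified XLFD name.

    Assumes the usual 14 dash-separated fields with a leading dash and
    no wildcards; anything else is rejected.
    """
    fields = name.split("-")
    # A leading "-" yields an empty first element, then the 14 fields.
    if len(fields) != 15 or fields[0] != "":
        raise ValueError("not a fully specified XLFD name: %r" % name)
    return fields[13], fields[14]


names = [
    "-misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-gb18030.2000-0",
    "-misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-iso10646-1",
]
for n in names:
    registry, encoding = xlfd_charset(n)
    print(registry, encoding)
```

The same TrueType file thus shows up under several registry/encoding
pairs, and the client has to index glyphs according to the pair it
selected.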

>> The problem now is that the encoding of the fallback font is not
>> respected.  In the image, the highlighted character is U+83EF, but
>> Emacs incorrectly displays U+51BF instead.
>>
>> The GB 18030 bytes to represent U+51BF are \x83\xEF; this clearly
>> shows that Emacs lacks an iconv call (or an equivalent to that);
>> instead, it seems to simply feed the Unicode value to the font
>> backend.
>
> Tz-tz-tz, how can you even suggest something like that about Emacs ;-)
>
> If you look in xfont_encode_char, you will see that it does encode
> the character before handing it to the font-drawing function.  But I
> see that font-encoding-alist has this to say about gb18030:
>
>  ("gb18030" unicode)
>
> Does replacing that with something like this:
>
>  ("gb18030" (gb18030 . unicode))
>
> solve the problem?

Yes, it seems so.

> What we put in font-encoding-alist now was a deliberate change in
> Jan 2008, in response to a bug report; see
>
>   http://lists.gnu.org/archive/html/emacs-devel/2008-01/msg00754.html
>
> If fonts like this one need to have characters encoded by gb18030,
> then I think we need to change what the value says.

As can be seen above, the font itself doesn't need GB18030.  It's the
font backend that offers the font under this encoding, and Emacs
accesses the font through it.

> But this area in Emacs is under-documented, so I'm not sure I've
> got it right, in particular what is the effect of ENCODING and
> REPERTORY in this context.  For most font back-ends, ENCODING is
> ignored, because the back-end is capable of encoding the character we
> hand to it.  But the xfont back-end indeed uses Emacs's encoding
> functions to do that externally to the corresponding X APIs.  Which
> might explain why this problem, if indeed we fail to specify the
> correct encoding for this charset, was never reported till now:
> xfont is rarely if ever used.

Emacs doesn't fail to specify the correct encoding.  The problem is
that it feeds the font backend with characters in the wrong encoding
(namely Unicode instead of GB 18030).
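
To illustrate the difference, here is a small Python sketch.  It is not
Emacs code; it relies only on Python's built-in `gb18030' codec.  It
contrasts using the Unicode code point of U+83EF directly as a two-byte
font index with converting the character to GB 18030 first.  That Emacs
does the former is the hypothesis of this report, inferred from the
symptoms, not something taken from the Emacs sources.

```python
ch = "\u83ef"  # the character that should be displayed

# Hypothesized behaviour: the Unicode code point itself is used as the
# two-byte index into a font that is offered as gb18030.2000-0.
unconverted = ord(ch).to_bytes(2, "big")

# Expected behaviour: convert Unicode to GB 18030 first (what an
# iconv-like step would do), then use those bytes as the index.
converted = ch.encode("gb18030")

# The glyph a GB 18030-indexed font returns for the unconverted bytes.
shown_instead = unconverted.decode("gb18030")

print("unconverted index:", unconverted.hex())
print("converted index:  ", converted.hex())
print("glyph shown for the unconverted index: U+%04X" % ord(shown_instead))
```

If the report is right, the last line names the character that shows
up in place of U+83EF.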

>> It's a completely different question why on my system Emacs uses a
>> font encoded in GB 18030 as a fallback font.  It's probably related
>> to the fact that I use `mew' as my e-mail program, manually
>> extended to cover GB 18030.  Unfortunately, I wasn't able yet to
>> trigger the issue with `emacs -Q' (which by default uses iso10646
>> for the fallback font).
>
> Well, we cannot try helping you to unlock this unless you tell how
> you "manually extended" Emacs.

Oh, I haven't extended Emacs, sorry for the bad wording.  I've simply
added a line to mew's elisp code to make it recognize GB18030 in
e-mails.

> In general, the way to request that Emacs uses fonts you like with
> certain characters or charsets is by customizing your fontsets.  I
> cannot say more without hearing the details.

I don't have any fontsets customized in my `.emacs' file.

>> On the other hand, as soon as the problem happens, it happens with
>> any buffer containing CJK characters not displayable with the
>> current font, so it seems a genuine Emacs core bug.
>
> What "problem" do you allude to here?  The first (seemingly
> incorrect encoding) or the second (fallback to this particular
> font)?

Both.  If I open a new Unicode-encoded file, Emacs continues to use
GB18030.2000 as the charset registry/encoding for displaying fallback
characters, but fails to convert Unicode to GB18030 before requesting
the characters from the font backend.


    Werner




