unofficial mirror of bug-gnu-emacs@gnu.org 
From: Werner LEMBERG <wl@gnu.org>
To: eliz@gnu.org
Cc: 31315@debbugs.gnu.org
Subject: bug#31315: wrong font encoding for fallback font
Date: Tue, 01 May 2018 21:30:14 +0200 (CEST)
Message-ID: <20180501.213014.1436609899151985328.wl@gnu.org>
In-Reply-To: <83muxjqu2e.fsf@gnu.org>

>> what matters is how the font backend provides the font to the
>> client.  Calling `xlsfonts' I see that X11 offers access as
>> follows.
>>
>>   -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-cns11643-1
>>   -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-cns11643-2
>>   -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-cns11643-3
>>   -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-gb18030.2000-0
>>   -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-gb2312.1980-0
>>   -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-iso10646-1
>>   -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-jisx0201.1976-0
>>   -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-jisx0208.1983-0
>>   -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-jisx0208.1990-0
>
> I think we have a terminology problem here, most probably my fault.
> What exactly do you mean when you say "font backend" in this
> context?  And what is "the client" in this case?

OK, sorry.  I mean the X11 font backend.  Here's my global picture.

          gb18030               unicode
 Emacs  ----------->   xft   ------------>  DroidSansFallback.ttf

For me, Emacs is a client of the xft font interface.  In our
particular case, xft provides `DroidSansFallback.ttf' to Emacs as a
font encoded in GB18030 – Emacs obviously has requested a font in this
encoding.  Behind the scenes, however, xft communicates with the
`DroidSansFallback.ttf' font using Unicode (the font has no other
cmap).
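
As an illustration of the picture above, here is a minimal Python sketch
of the conversion step that has to happen somewhere between a GB18030
code and the font's Unicode cmap.  It is not how Emacs or xft implement
this; the function name `gb18030_code_to_unicode' is hypothetical, and
the sketch relies only on Python's built-in `gb18030' codec.

```python
def gb18030_code_to_unicode(code: int) -> int:
    """Map a GB18030 code, given as an integer such as 0xD6D0,
    to the Unicode code point a Unicode-only cmap expects.

    Hypothetical helper for illustration only.
    """
    # A GB18030 code is one, two, or four bytes long.
    n_bytes = max(1, (code.bit_length() + 7) // 8)
    raw = code.to_bytes(n_bytes, "big")
    text = raw.decode("gb18030")  # raises UnicodeDecodeError if invalid
    if len(text) != 1:
        raise ValueError(f"{code:#x} is not a single GB18030 character")
    return ord(text)


# The two-byte GB code 0xD6D0 is U+4E2D in Unicode.
print(hex(gb18030_code_to_unicode(0xD6D0)))  # 0x4e2d
```

If this step is skipped and the GB18030 code is used directly as an
index into a Unicode cmap, a different glyph (or none) is selected.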

> If you received a GB18030 encoded email, it is expected that Emacs
> will try to find a font that explicitly supports GB18030.
>
> This is a feature that AFAIU is very important to CJK users: they
> expect Emacs to select a font that declares support for the
> character's charset as set by the decoding machinery.

While this is correct for other CJK encodings like GB, JIS, KSC, or
Big5, it is *not* true for GB18030.  This is *only* an encoding and
*not* a charset!  It is simply another representation of Unicode,
comparable to UTF-8 or UCS4.  There doesn't exist a single font
natively encoded in GB18030!  This encoding only exists to be
code-wise backward compatible with GB 2312.
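
A small Python sketch can make both points concrete: GB18030 round-trips
characters far outside the Chinese repertoire, and for characters that
GB 2312 covers it produces the same bytes.  The helper names
`roundtrips' and `encodable' are made up for this example; only Python's
standard `gb18030' and `gb2312' codecs are assumed.

```python
def roundtrips(ch: str, codec: str = "gb18030") -> bool:
    """Return True if CH survives encode/decode with CODEC unchanged."""
    return ch.encode(codec).decode(codec) == ch


def encodable(ch: str, codec: str) -> bool:
    """Return True if CODEC has a representation for CH."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Latin, Chinese, Japanese kana, Korean hangul, and a non-BMP character.
samples = ["A", "\u4e2d", "\u3042", "\ud55c", "\U0001F600"]
for ch in samples:
    print(f"U+{ord(ch):04X}",
          "gb18030:", roundtrips(ch),
          "gb2312:", encodable(ch, "gb2312"))

# Backward compatibility: same bytes where GB 2312 defines the character.
assert "\u4e2d".encode("gb2312") == "\u4e2d".encode("gb18030")
```

Hangul and the non-BMP character have no GB 2312 representation at all,
yet GB18030 encodes them, which is what one expects from a Unicode
transformation format rather than from a charset.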

To a certain extent it is valid to assume that a user of GB18030
expects Chinese glyph representation forms for characters in the CJK
range.  However, since full Unicode is supported, this assumption is
rather weak.

The X11 interface is actually too old to handle GB18030 correctly.
For example, on my GNU/Linux box xft offers the following:

  -adobe-noto sans cjk jp thin-light-r-normal--0-0-0-0-p-0-gb18030.2000-0

As the `jp' in the name indicates, this font contains Japanese glyph
representation forms.  Since `Noto Sans CJK' provides all CJK glyphs
in the BMP, xft happily tags it with GB18030...
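
For reference, the registry and encoding are simply the last two of the
fourteen hyphen-separated XLFD fields.  A rough Python sketch for
pulling them out of names like the one above follows; `parse_xlfd' and
the `Xlfd' tuple are hypothetical names, and the sketch assumes that no
field itself contains a hyphen.

```python
from typing import NamedTuple


class Xlfd(NamedTuple):
    foundry: str
    family: str
    registry: str
    encoding: str


def parse_xlfd(name: str) -> Xlfd:
    """Split a full XLFD name into a few of its fourteen fields.

    Assumes no field contains a literal hyphen.
    """
    fields = name.split("-")
    # A leading hyphen yields an empty first element, then 14 fields.
    if len(fields) != 15 or fields[0] != "":
        raise ValueError(f"not a full XLFD name: {name!r}")
    return Xlfd(foundry=fields[1], family=fields[2],
                registry=fields[13], encoding=fields[14])


font = parse_xlfd(
    "-adobe-noto sans cjk jp thin-light-r-normal--0-0-0-0-p-0-gb18030.2000-0")
print(font.family, "|", font.registry, "|", font.encoding)
```

The family field still says `jp', while the registry claims GB18030; the
name alone shows the mismatch.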

>> > In general, the way to request that Emacs uses fonts you like
>> > with certain characters or charsets is by customizing your
>> > fontsets.  I cannot say more without hearing the details.
>>
>> I don't have any fontsets customized in my `.emacs' file.
>
> Well, it sounds like you should.  Emacs chooses fonts using
> techniques that prefer speed to accuracy, and if that gives
> suboptimal results, the way to improve them is to guide Emacs by
> tailoring your fontset to the fonts you have installed and to the
> visual appearance you happen to like.

For the purpose of reporting this bug I thought it would be best not
to deviate any further from `emacs -Q'...

>> Both.  If I open a new Unicode-encoded file, Emacs continues
>> to use GB18030.2000 as the charset registry/encoding for displaying
>> fallback characters, failing to convert Unicode to GB18030 before
>> accessing the characters from the font backend.
>
> The former part is not a bug at all.

I agree.  I only wanted to tell you what I observe.


    Werner
