From: Werner LEMBERG <wl@gnu.org>
To: eliz@gnu.org
Cc: 31315@debbugs.gnu.org
Subject: bug#31315: wrong font encoding for fallback font
Date: Thu, 03 May 2018 07:52:27 +0200 (CEST)
Message-ID: <20180503.075227.154521786744892904.wl@gnu.org>
In-Reply-To: <83wowmozed.fsf@gnu.org>
> If by "xft" you mean the part of the X libraries that supports the
> APIs used by xfont.c, then I think we are on the same page now.
OK.
>> While this is correct for other CJK encodings like GB, JIS, KSC, or
>> Big5, it is *not* true for GB18030. This is *only* an encoding and
>> *not* a charset! It is simply another representation of Unicode,
>> comparable to UTF-8 or UCS4. There doesn't exist a single font
>> natively encoded in GB18030! This encoding only exists to be
>> code-wise backward compatible with GB 2312.
>
> Maybe so, but GB18030 is a Chinese encoding, and as such it behaves
> in Emacs as all the other Chinese encodings.
I know, and I agree. BUT! xft doesn't do what Emacs expects. *Any*
font that covers the whole BMP (in particular, the whole CJK part of
it) gets a `GB18030' tag from xft. In other words, the `Chinese'
property isn't in the font from the very beginning.[*]
> Emacs employs that logic for every charset it has defined, including
> Latin-2, for example: if text was decoded from an encoding which
> supports a particular charset, Emacs puts the corresponding
> 'charset' text property on the decoded text, and the machinery which
> selects the appropriate font tries first to find a font which
> supports that charset. The idea is that users in a particular
> culture have certain distinct preferences wrt fonts, and that an
> encoding that supports a certain charset or culture provides a hint
> about those preferences. This idea is very central in how Emacs
> selects fonts.
Being the FreeType maintainer, and having co-developed Emacs's
internal buffer encoding scheme many, many years ago, I know all this.
I can only repeat that Emacs might tag a certain text with GB18030 so
that the user can deduce a Chinese origin. However, there is *no*
guarantee that the user gets a Chinese-flavoured font – at least not
from the xft interface.[**]
As a corollary, it is fully sufficient for xft to treat GB18030 exactly
like Unicode (i.e., `iso10646').
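The claim that GB18030 is just another representation of Unicode, while
staying byte-compatible with GB 2312, can be checked with a minimal
Python sketch.  It assumes only the `gb18030' and `gb2312' codecs from
Python's standard library; the helper name `roundtrips' and the sample
characters are made up for illustration and have nothing to do with the
Emacs or xft code.

```python
def roundtrips(text, codec):
    """Return True if TEXT survives encode/decode with CODEC unchanged."""
    try:
        return text.encode(codec).decode(codec) == text
    except UnicodeEncodeError:
        # The codec has no representation for at least one character.
        return False

# ASCII, a common Han character, a Devanagari letter, and an emoji
# outside the BMP (arbitrary samples, chosen for illustration).
samples = ["A", "中", "क", "😀"]

# GB18030 can represent every sample, like UTF-8 could.
gb18030_ok = all(roundtrips(s, "gb18030") for s in samples)

# GB 2312 is a real (limited) charset: non-Chinese scripts and
# non-BMP characters cannot be encoded.
gb2312_ok = [roundtrips(s, "gb2312") for s in samples]

# Backward compatibility: a GB 2312 character keeps the same bytes.
same_bytes = "中".encode("gb2312") == "中".encode("gb18030")

print(gb18030_ok, gb2312_ok, same_bytes)
```

The round trip says nothing about which glyph shapes a font provides;
it only shows that the encoding itself does not restrict the repertoire.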
Werner
[*] Actually, having Unicode fonts that provide CJK glyphs for the
whole BMP completely spoils Emacs's font selection scheme based on
charsets – as shown in one of my previous e-mails, xft provides
all common CJK encodings for such fonts because Unicode is a
superset of those encodings.
[**] If, say, the Pango font interface is used instead to access a
modern CJK OpenType font, Emacs might request `script=hani,
lang=ZHS' if it encounters GB18030 to resolve Unicode's Han
unification, ensuring simplified Chinese glyph representation
forms.
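The effect described in footnote [*] can be illustrated with a toy
model.  This is only a sketch under stated assumptions: the font names,
the encoding labels, and the `pick_font' helper are all hypothetical,
and the real selection logic in Emacs and xft is considerably more
involved.

```python
# Hypothetical font table: each entry is (font name, encodings the
# backend advertises for it).  A font covering the whole BMP is
# reported with every common CJK encoding because Unicode is a
# superset of all of them.
FONTS = [
    ("PanUnicodeSans", {"iso10646", "gb2312", "jisx0208",
                        "ksc5601", "big5", "gb18030"}),
    ("ChineseOnlySong", {"gb2312"}),
]

def pick_font(encoding, fonts=FONTS):
    """Return the first font advertising ENCODING, or None."""
    for name, encodings in fonts:
        if encoding in encodings:
            return name
    return None

# A request tagged `gb18030' is answered by the pan-Unicode font, so
# the tag carries no `Chinese' information at this level ...
by_gb18030 = pick_font("gb18030")

# ... and asking for plain Unicode gives exactly the same font.
by_unicode = pick_font("iso10646")

print(by_gb18030, by_unicode)
```

In this model, mapping `gb18030' to `iso10646' changes nothing in the
outcome, which is the corollary stated in the message body.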