From: Eli Zaretskii <eliz@gnu.org>
To: Po Lu <luangruo@yahoo.com>
Cc: emacs-devel@gnu.org
Subject: Re: master bf0aeaa0d7a: Re-enable displaying `han' characters on Android
Date: Thu, 01 Aug 2024 12:49:01 +0300
Message-ID: <86sevowp2q.fsf@gnu.org>
In-Reply-To: <8734noslmt.fsf@yahoo.com> (message from Po Lu on Thu, 01 Aug 2024 16:16:58 +0800)

> From: Po Lu <luangruo@yahoo.com>
> Cc: emacs-devel@gnu.org
> Date: Thu, 01 Aug 2024 16:16:58 +0800
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > I don't understand what you are saying.  What is "my discovery"?  And
> > why no font will ever match the font spec for 'han'? I've definitely
> > seen fonts that support _all_ of the characters I added to
> > script-representative-chars, so why wouldn't they be found?
> 
> I wasn't implying that no font will ever support one or more of these
> characters, but that such fonts are sufficiently obscure that the
> chances of a CJK font's being located by this value of
> `script-representative-chars' are nil.

I very much doubt that.  I see a few fonts on my Windows 11 system
which support all of those.  I'd be very surprised to learn that no
such fonts are available on a modern GNU/Linux system.

Can someone please check that?

> >>   #x1f210 #x20000 #x2a700 #x2b740 #x2b820 #x2ceb0 #x2f804
> >
> > I added them because when those sub-optimal fonts are selected, some
> > of these characters appear as "tofu", which is ridiculous on a system
> > that has fonts installed that cover all of them.
> 
> These characters may be han script, but they are not attested in CJK
> documents in practice, except in contrived scenarios such as conversions
> between incomplete character encodings.

You keep saying that, but those are just your assertions.  These
characters are there for a reason, and they should be supported as
well as we can.  If worse comes to worst, we could split 'han' into
two or more scripts, and have a separate setup in our fontsets for each.
Then each one could use its own representative characters.
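
As a rough illustration of what such a split could look like, the
sketch below partitions the representative characters under discussion
into those inside the Basic Multilingual Plane and those beyond it.
This is only a sketch under stated assumptions: the script name
`han-ext' and the `my/' helpers are hypothetical and do not exist in
Emacs, and a real split would also need matching fontset and
char-script-table changes that are not shown here.

```elisp
(require 'seq)

;; The 18 characters from the `han' entry discussed in this thread.
(defconst my/han-representative-chars
  '(#x2e90 #x2f00 #x3010 #x3200 #x3300 #x3400
    #x31c0 #x4e10 #x5b57 #xfe30 #xf900
    #x1f210 #x20000 #x2a700 #x2b740 #x2b820 #x2ceb0 #x2f804)
  "Representative characters for `han' as quoted in this thread.")

(defun my/bmp-char-p (char)
  "Return non-nil if CHAR lies in the Basic Multilingual Plane."
  (<= char #xffff))

(defun my/split-by-plane (chars)
  "Return a cons (BMP . SUPPLEMENTARY) partitioning CHARS by plane."
  (cons (seq-filter #'my/bmp-char-p chars)
        (seq-remove #'my/bmp-char-p chars)))

(defun my/split-han-entries (chars)
  "Return two alist entries built from CHARS.
The first keeps the name `han'; the second uses the hypothetical
name `han-ext' for the characters outside the BMP."
  (let ((split (my/split-by-plane chars)))
    (list (cons 'han (car split))
          (cons 'han-ext (cdr split)))))
```

Evaluating (my/split-han-entries my/han-representative-chars) yields
one entry with the eleven BMP characters and one with the seven
supplementary-plane characters, so each could be matched against fonts
independently.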

> > And "seldom" is in the eyes of the beholder, at least IME.  When one
> > has text with these characters, the absolute frequency of their
> > appearance is not very relevant; what _is_ relevant is the fact that
> > the character cannot be shown by Emacs.
> 
> In practice, the outcome of this principle is that no font is detected
> with which to display CJK documents featuring none of these characters,
> very much against the expectations of CJK users.

Not IME.

> > If there's no fonts installed that support those representative
> > characters, and Emacs is capable of finding less capable fonts that
> > support some of CJK (e.g., the BMP blocks), then why is that a
> > problem?
> 
> I thought I explained that Emacs is _not_ capable of doing so on
> Android.

Then please design and implement a suitable solution for Android.  It
is not right to punish other platforms for Android-specific issues.

> > The purpose of the change is to allow Emacs to find better fonts if
> > they are installed, instead of ignoring them.  How is that a Bad
> > Thing?
> 
> Because it renders the `han' script incapable of matching any fonts that
> are installed in practice.

Again, not IME.

> > I still don't understand why this breaks Android, btw.  If Emacs
> > employs the fallback font specs with :lang you show above, why don't
> > they work for Android?
> 
> The problem is that QClang is not available on Android, because fonts do
> not provide their design languages in one of the standard TrueType
> tables its font backend groks, which deficiency prompted the addition of
> the font spec in question in:

OK, then it means we need to work around this, but without hampering
other platforms.

> The fact of the matter is that:
> 
> (let ((script-representative-chars
>        '((han #x2e90 #x2f00 #x3010 #x3200 #x3300 #x3400
>               #x31c0 #x4e10 #x5B57 #xfe30 #xf900
>               #x1f210 #x20000 #x2a700 #x2b740 #x2b820 #x2ceb0 #x2f804))))
>   (clear-font-cache)
>   (find-font (font-spec :registry "iso10646-1" :script 'han
>                         :type 'xfthb))) ;; or another ftfont backend.
> 
> returns no font on an up-to-date Fedora Workstation installation with a
> wealth of multilingual fonts for CJK scripts, whereas:
> 
> (let ((script-representative-chars
>        '((han #x2e90 #x2f00 #x3010 #x3200 #x3300 #x3400
>               #x31c0 #x4e10 #x5B57 #xfe30 #xf900))))
>   (clear-font-cache)
>   (find-font (font-spec :registry "iso10646-1" :script 'han
>                         :type 'xfthb)))
> 
> returns:
> 
> #<font-entity xfthb ADBO Noto\ Sans\ CJK\ HK nil iso10646-1 medium normal normal 0 nil nil 0>
> 
> which is more than adequate for editing CJK text in my language and
> others.

Not on MS-Windows: here, both of the above return

  #<font-entity harfbuzz outline Malgun\ Gothic sans iso10646-1 bold normal normal 0 nil 0 nil>

which is a lie in the former case (the list that includes the
additional characters), since those additional characters are not
supported by this font.  Given that this method is evidently
unreliable, I don't think we should consider this proof of your
argument.
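
A per-character check might be a more direct test than looking at what
`find-font' returns.  The following is a minimal sketch, not a
definitive method: it assumes a graphical session and a build in which
`font-has-char-p' is available (hence the `fboundp' guard), and the
`my/' function names are hypothetical.

```elisp
(require 'seq)

(defun my/uncovered-chars (chars covered-p)
  "Return the members of CHARS for which COVERED-P returns nil.
COVERED-P is called with a single character code."
  (seq-remove covered-p chars))

(defun my/font-uncovered-chars (font chars)
  "Return the members of CHARS that FONT reports no glyph for.
FONT is a font-entity or font-object, such as a `find-font' result.
Return the symbol `unknown' if `font-has-char-p' is not available
in this build (an assumption this sketch cannot verify for you)."
  (if (fboundp 'font-has-char-p)
      (my/uncovered-chars chars
                          (lambda (c) (font-has-char-p font c)))
    'unknown))
```

For example, calling `my/font-uncovered-chars' with the entity returned
by one of the `find-font' forms above and the corresponding character
list would show which of the representative characters, if any, that
font lacks.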
