From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: master bf0aeaa0d7a: Re-enable displaying `han' characters on Android
Date: Thu, 01 Aug 2024 12:49:01 +0300
Message-ID: <86sevowp2q.fsf@gnu.org>
References: <86h6c5y39e.fsf@gnu.org> <87plqtf6m0.fsf@yahoo.com> <864j84yfjh.fsf@gnu.org> <8734noslmt.fsf@yahoo.com>
Cc: emacs-devel@gnu.org
To: Po Lu
In-Reply-To: <8734noslmt.fsf@yahoo.com> (message from Po Lu on Thu, 01 Aug 2024 16:16:58 +0800)

> From: Po Lu
> Cc: emacs-devel@gnu.org
> Date: Thu, 01 Aug 2024 16:16:58 +0800
>
> Eli Zaretskii writes:
>
> > I don't understand what you are saying. What is "my discovery"? And
> > why no font will ever match the font spec for 'han'? I've definitely
> > seen fonts that support _all_ of the characters I added to
> > script-representative-chars, so why wouldn't they be found?
>
> I wasn't implying that no font will ever support one or more of these
> characters, but that such fonts are sufficiently obscure that the
> chances of a CJK font's being located by this value of
> `script-representative-chars' is nil.

I very much doubt that. I see a few fonts on my Windows 11 system
which support all of those. I'd be very surprised to know that no
such fonts are available on a modern GNU/Linux system. Can someone
please check that?

> >> #x1f210 #x20000 #x2a700 #x2b740 #x2b820 #x2ceb0 #x2f804
> >
> > I added them because when those sub-optimal fonts are selected, some
> > of these characters appear as "tofu", which is ridiculous on a system
> > that has fonts installed that cover all of them.
>
> These characters may be han script, but they are not attested in CJK
> documents in practice, except in contrived scenarios such as conversions
> between incomplete character encodings.

You keep saying that, but those are just your assertions. These
characters are there for a reason, and they should be supported as
well as we can.

If worse comes to worst, we could split 'han' into two or more
scripts, and have separate setup in our fontsets for them. Then each
one could use its own representative characters.

> > And "seldom" is in the eyes of the beholder, at least IME. When one
> > has text with these characters, the absolute frequency of their
> > appearance is not very relevant; what _is_ relevant is the fact that
> > the character cannot be shown by Emacs.
>
> In practice, the outcome of this principle is that no font is detected
> with which to display CJK documents featuring none of these characters,
> very much against the expectations of CJK users.

Not IME.

> > If there's no fonts installed that support those representative
> > characters, and Emacs is capable of finding less capable fonts that
> > support some of CJK (e.g., the BMP blocks), then why is that a
> > problem?
>
> I thought I explained that Emacs is _not_ capable of doing so on
> Android.

Then please design and implement a suitable solution for Android. It
is not right to punish other platforms for Android-specific issues.

> > The purpose of the change is to allow Emacs to find better fonts if
> > they are installed, instead of ignoring them. How is that a Bad
> > Thing?
>
> Because it renders the `han' script incapable of matching any fonts that
> are installed in practice.

Again, not IME.

> > I still don't understand why this breaks Android, btw. If Emacs
> > employs the fallback font specs with :lang you show above, why don't
> > they work for Android?
>
> The problem is that QClang is not available on Android, because fonts do
> not provide their design languages in one of the standard TrueType
> tables its font backend groks, which deficiency prompted the addition of
> the font spec in question in:

OK, then it means we need to work around this, but without hampering
other platforms.

> The fact of the matter is that:
>
>   (let ((script-representative-chars
>          '((han #x2e90 #x2f00 #x3010 #x3200 #x3300 #x3400
>                 #x31c0 #x4e10 #x5B57 #xfe30 #xf900
>                 #x1f210 #x20000 #x2a700 #x2b740 #x2b820 #x2ceb0 #x2f804))))
>     (clear-font-cache)
>     (find-font (font-spec :registry "iso10646-1" :script 'han
>                           :type 'xfthb))) ;; or another ftfont backend.
>
> returns no font on an up-to-date Fedora Workstation installation with a
> wealth of multilingual fonts for CJK scripts, whereas:
>
>   (let ((script-representative-chars
>          '((han #x2e90 #x2f00 #x3010 #x3200 #x3300 #x3400
>                 #x31c0 #x4e10 #x5B57 #xfe30 #xf900))))
>     (clear-font-cache)
>     (find-font (font-spec :registry "iso10646-1" :script 'han
>                           :type 'xfthb)))
>
> returns:
>
>   #
>
> which is more than adequate for editing CJK text in my language and
> others.

Not on MS-Windows: here, both of the above return

  #

which is a lie in the latter case, since those additional characters
are not supported by this font. Given that this method is evidently
unreliable, I don't think we should consider this a proof of your
argument.