From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#19993: 25.0.50; Unicode fonts defective on Windows
Date: Thu, 12 Mar 2015 20:16:39 +0200
Message-ID: <83fv9a3wu0.fsf@gnu.org>
In-reply-to: <83mw3j475o.fsf@gnu.org>
To: ilya@math.berkeley.edu
Cc: 19993@debbugs.gnu.org

> Date: Wed, 11 Mar 2015 22:21:23 +0200
> From: Eli Zaretskii
> Cc: 19993@debbugs.gnu.org
>
> > Date: Wed, 11 Mar 2015 12:49:39 -0700
> > From: Ilya Zakharevich
> > Cc: 19993@debbugs.gnu.org
> >
> > So the real question is not whether the presence of a Subset is used
> > as filters, but: is
> >
> >   the presence of the required character in the font
> >
> > used as a filter.
>
> When a font matches all the other constraints, then yes, it is
> actually tested for whether it supports the specific character we need
> to display. See font_has_char and its callers.

I had a few minutes to spare, so I took a closer look at the code.

The problem with the Mathematical Alphanumeric Symbols block is much
more prosaic than you thought: fontset.el breaks this block into
several distinct pseudo-scripts (don't know why, perhaps for
compatibility with something on Unix), but no one has taught w32font.c
to do the same for the corresponding Unicode subrange. So Emacs was
asking for, say, 'mathematical-italic' "script", but w32font.c was
comparing that with 'mathematical', and was rejecting the fonts that
support these characters.

This is now fixed in commit fc10058 on master. You should now be able
to type "C-x 8 RET 1d400 RET" and see the character displayed.

While at that, I also added the missing subranges that for some reason
unknown to me were commented out; for many of them, I could verify
that adding them makes the corresponding characters displayable by
default, where they previously weren't. (I couldn't verify that for
some of the scripts for which I have no fonts.) A few subranges were
left out, and I added comments explaining why.

With that out of our way, part of the problem is solved.
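To make the mismatch concrete, here is a minimal sketch in C of a
script-name lookup in which several pseudo-scripts share one Unicode
subrange. It is not the code from w32font.c: the struct, the table and
both function names are hypothetical, and only 'mathematical' and
'mathematical-italic' are names taken from the discussion above
('mathematical-bold' is added for illustration). Bit 89 is, to the best
of my knowledge, the Mathematical Alphanumeric Symbols bit in the
OpenType OS/2 Unicode range field; treat that number as an assumption
as well.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mapping from a script name to a Unicode subrange bit.  */
struct script_subrange
{
  const char *script;
  int usb_bit;
};

/* Before the fix, the equivalent of this table knew only "mathematical",
   so a request for "mathematical-italic" matched nothing.  Listing the
   pseudo-scripts against the same bit removes the mismatch.  */
static const struct script_subrange script_table[] = {
  { "mathematical", 89 },
  { "mathematical-bold", 89 },
  { "mathematical-italic", 89 },
};

/* Return the subrange bit for SCRIPT, or -1 if SCRIPT is unknown.  */
int
script_to_subrange_bit (const char *script)
{
  assert (script != NULL);
  for (size_t i = 0; i < sizeof script_table / sizeof script_table[0]; i++)
    if (strcmp (script_table[i].script, script) == 0)
      return script_table[i].usb_bit;
  return -1;
}

/* Return 1 if the 128-bit signature USB (four 32-bit words, lowest
   bits first) has the bit for SCRIPT set, else 0.  */
int
font_claims_script (const unsigned int usb[4], const char *script)
{
  int bit = script_to_subrange_bit (script);
  if (bit < 0)
    return 0;
  return (int) ((usb[bit / 32] >> (bit % 32)) & 1u);
}
```

With this table, a font whose signature has bit 89 set is accepted for
'mathematical-italic' as well as for 'mathematical', which is the
behavior the fix is meant to restore.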
Part, but not all of it. Because it is true: Emacs searches the fonts
installed on the system mostly by requiring only that the font
supports the script to which the character belongs, but without
opening the font and checking whether the specific character we are
about to display has a glyph in the font. Here's the crucial piece of
code (from fontset.c):

	  /* Find a font best-matching with the spec without checking
	     the support of the character C.  That checking is costly,
	     and even without the checking, the found font supports C
	     in high possibility.  */
	  font_entity = font_find_for_lface (f, face->lface,
					     FONT_DEF_SPEC (font_def), -1);

That -1 as the last argument tells font_find_for_lface to not check
support for the character. So yes, if a font claims support for a
script, but actually supports very little of it, it is quite possible
that Emacs will try to use it, and will then be unable to display the
missing characters.

I know about font search on Unix even less than I know for Windows, so
I cannot tell if on Unix we are smarter about this. I see that
ftfont.c uses fontconfig functions to verify that the representative
character of the required script (set up in fontset.el) is part of the
charset supported by a font, but I don't know if that looks into the
font, and in any case we only have at most 1 representative character
for all but a few scripts. So misses are still possible; or maybe I'm
missing something.

Assuming that we want to become smarter about this, we could do one or
both of the following:

  . have a database of fonts which are _not_ to be used for certain
    scripts, because it is known that their coverage is poor

  . have a more elaborate default fontset that favors specific fonts
    for scripts which these fonts are known to support well

One problem with both of these is that it's hard to recommend fonts
because many good fonts are non-free.

If it turns out that these problems are Windows-specific, the above
can be done for Windows only (like w32-standard-fontset-spec).
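
The difference between searching with and without the per-character
check can be shown with a toy model in C. This is only a sketch under
stated assumptions: struct toy_font, toy_font_has_char and
toy_find_font are hypothetical names, not Emacs types or functions, and
the glyph lists are made up. A negative C plays the role of the -1
passed to font_find_for_lface above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy font record; every name here is hypothetical.  */
struct toy_font
{
  const char *name;
  int claims_script;           /* nonzero if the font advertises the script */
  const unsigned int *glyphs;  /* code points that really have a glyph */
  size_t n_glyphs;
};

/* Return 1 if FONT really has a glyph for code point C, else 0.  */
int
toy_font_has_char (const struct toy_font *font, unsigned int c)
{
  for (size_t i = 0; i < font->n_glyphs; i++)
    if (font->glyphs[i] == c)
      return 1;
  return 0;
}

/* Return the first font in FONTS that claims the script.  If C is
   negative, skip the per-character check; otherwise also require a
   glyph for C.  Return NULL if no font qualifies.  */
const struct toy_font *
toy_find_font (const struct toy_font *fonts, size_t n_fonts, int c)
{
  assert (fonts != NULL || n_fonts == 0);
  for (size_t i = 0; i < n_fonts; i++)
    {
      if (!fonts[i].claims_script)
        continue;
      if (c < 0 || toy_font_has_char (&fonts[i], (unsigned int) c))
        return &fonts[i];
    }
  return NULL;
}
```

Given a first font that claims the script but has a glyph only for
U+1D400, and a second font that also covers U+1D434, a search with a
negative C returns the first font, while a search for U+1D434 skips it
and returns the second. The cost is one glyph lookup per candidate
font, which is the expense the fontset.c comment refers to.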