From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ilya Zakharevich
Newsgroups: gmane.emacs.bugs
Subject: bug#19993: 25.0.50; Unicode fonts defective on Windows
Date: Thu, 12 Mar 2015 18:52:15 -0700
Message-ID: <20150313015215.GA32272@math.berkeley.edu>
References: <20150306221351.GB16266@math.berkeley.edu>
 <83k2ytmd9q.fsf@gnu.org>
 <20150308083805.GA1763@math.berkeley.edu>
 <20150308084607.GA2135@math.berkeley.edu>
 <20150310162945.GA30876@math.berkeley.edu>
 <83a8zk6avh.fsf@gnu.org>
 <838uf4697w.fsf@gnu.org>
 <20150311194939.GA10710@math.berkeley.edu>
 <83mw3j475o.fsf@gnu.org>
 <83fv9a3wu0.fsf@gnu.org>
In-Reply-To: <83fv9a3wu0.fsf@gnu.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Cc: 19993@debbugs.gnu.org
To: Eli Zaretskii
User-Agent: Mutt/1.5.21 (2010-09-15)

On Thu, Mar 12, 2015 at 08:16:39PM +0200, Eli Zaretskii wrote:
> support these characters. This is now fixed in commit fc10058 on
> master. You should now be able to type "C-x 8 RET 1d400 RET" and see
> the character displayed.
>
> While at that, I also added the missing subranges that for some reason
> unknown to me were commented out; for many of them, I could verify
> that adding them makes the corresponding characters displayable by
> default, where they previously weren't. (I couldn't verify that for
> some of the scripts for which I have no fonts.) A few subranges were
> left out, and I added comments explaining why.

A lot of thanks!

> With that out of our way, part of the problem is solved. Part, but
> not all of it. Because it is true: Emacs searches the fonts installed
> on the system mostly by requiring only that the font supports the
> script to which the character belongs, but without opening the font
> and checking whether the specific character we are about to display
> has a glyph in the font. Here's the crucial piece of code (from
> fontset.c):
>
>   /* Find a font best-matching with the spec without checking
>      the support of the character C. That checking is costly,
>      and even without the checking, the found font supports C
>      in high possibility. */

So, this explains why U+2099, U+27e8, U+27e9 are not shown here (while
supported by a lot of fonts). Thanks for investigating this!

> Assuming that we want to become smarter about this, we could do one or
> both of the following:
>
> . have a database of fonts which are _not_ to be used for certain
> scripts, because it is known that their coverage is poor
>
> . have a more elaborate default fontset that favors specific fonts
> for scripts which these fonts are known to support well
>

Did you look into the link I provided (about how Firefox does it)?

  http://search.cpan.org/~ilyaz/UI-KeyboardLayout/lib/UI/KeyboardLayout.pm#There_is_no_way_to_show_Unicode_contents_on_Windows

As my experiments show (I did not try to read the source code), the
logic of falling back goes this way:

  • If document’s fonts can show a char, stop;
  • If (user-configurable) fallback fonts for a Subset can show a char, stop;
  • If (user-configurable) universal fallback fonts can show a char, stop;
  • Otherwise, scan all fonts to find one supporting a char.

(The third case is the “x-unicode” pseudo-subset mentioned in the link
above.)

Emacs:

  • Supports different fallbacks for different subsets;
  • But it supports only one fallback font per character.

(Well, it allows specifying more than one font, but as you saw, only
one of them will be actually used, at least in the case when the fonts
would claim having chars in all the ranges, as most of “good universal
fonts” would do.)

The second one is a significant show-stopper, since it is very hard to
boil down things to one font. Myself, I only use scripts with “simple
shaping”, so all of my needs are covered by 4 fonts:

  DejaVu *
  Symbola
  Junicode
  Unifont Smooth

(with Unifont Smooth last, since though I’m still working on
un-uglifying Unifont, there is a limit to algorithmic beautification,
and it is always going to be MUCH worse than all the alternatives,
when alternatives exist).

BTW, is font-family search caseless? Since last year, the family was
changed from unifont to Unifont (in Unifoundry’s TTF distribution).

> One problem with both of these is that it's hard to recommend fonts
> because many good fonts are non-free.

For simple rendering (no shaping), there are a lot of possibilities.

Ilya
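
For illustration only, the four-step fallback described above could be sketched roughly as follows. This is not Firefox's or Emacs's code: the `Font` class, `pick_font`, `subset_of`, the font names, and the coverage sets are all made up for the example. The only point is that each tier is consulted in order, and a font is accepted only if it really has a glyph for the specific character.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Font:
    """Hypothetical stand-in for an installed font."""
    name: str
    coverage: frozenset  # code points the font has glyphs for (invented data)

    def has_glyph(self, ch: str) -> bool:
        return ord(ch) in self.coverage


def pick_font(ch, document_fonts, subset_fallbacks, universal_fallbacks,
              all_fonts, subset_of):
    """Return the first font with a glyph for ch, or None.

    Tiers are tried in the order listed in the message: document fonts,
    per-subset fallbacks, universal fallbacks, then every installed font.
    """
    tiers = (
        document_fonts,
        subset_fallbacks.get(subset_of(ch), ()),
        universal_fallbacks,
        all_fonts,
    )
    for tier in tiers:
        for font in tier:
            if font.has_glyph(ch):  # per-character check, not per-script
                return font
    return None


# Toy setup; names and coverage are assumptions, not real font data.
doc_font = Font("DocFont", frozenset(range(0x20, 0x7F)))
math_font = Font("MathFont", frozenset({0x1D400}))
universal_font = Font("UniversalFont", frozenset({0x2099}))
other_font = Font("OtherFont", frozenset({0x27E8}))


def subset_of(ch):
    # Hypothetical subset classifier: only distinguishes one math block.
    return "math" if 0x1D400 <= ord(ch) <= 0x1D7FF else "other"


def demo(ch):
    return pick_font(
        ch,
        document_fonts=[doc_font],
        subset_fallbacks={"math": [math_font]},
        universal_fallbacks=[universal_font],
        all_fonts=[doc_font, math_font, universal_font, other_font],
        subset_of=subset_of,
    )
```

With this toy setup, `demo("\u27e8")` is resolved only by the final scan over all fonts, and `demo("\u27e9")` returns `None` because no font in the example covers it. Allowing several fonts per tier is what lets an ordered list such as the four fonts named above work as a chain rather than as a single choice.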