From mboxrd@z Thu Jan 1 00:00:00 1970
From: sand@blarg.net
Newsgroups: gmane.emacs.devel
Subject: [PATCH] Re: ftfont ISO10646-1 font bug found (was Re: 23.0.60; Heavy display problems with new font backend)
Date: Tue, 6 May 2008 21:44:15 -0700
Message-ID: <18465.13215.537191.430206@priss.frightenedpiglet.com>
References: <87hcdy6srx.fsf@baldur.tsdh.de> <4809F8AC.8000802@gnu.org>
 <87skxgbtlm.fsf@localhorst.mine.nu>
 <18445.25224.874018.531866@priss.frightenedpiglet.com>
 <874p9uw77k.fsf@localhorst.mine.nu>
 <18460.40869.171854.738262@priss.frightenedpiglet.com>
 <18462.48548.616317.968827@priss.frightenedpiglet.com>
In-Reply-To: <18462.48548.616317.968827@priss.frightenedpiglet.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Cc: emacs-devel@gnu.org
To: sand@blarg.net
X-Mailer: VM 8.0.9 under Emacs 23.0.60.1 (i486-pc-linux-gnu)

sand@blarg.net writes:
> I think I have found the cause for problem #1.  In ftfont_list(), the
> code gathers a list of candidate fonts that match the
> foundry/family/... requirements:
>
>   objset = FcObjectSetBuild (FC_FOUNDRY, FC_FAMILY, FC_WEIGHT, FC_SLANT,
>                              FC_WIDTH, FC_PIXEL_SIZE, FC_SPACING,
>                              FC_CHARSET, FC_FILE,
> #ifdef FC_FONTFORMAT
>                              FC_FONTFORMAT,
> #endif  /* FC_FONTFORMAT */
>                              NULL);
>   /* ... elided ... */
>   fontset = FcFontList (NULL, pattern, objset);
>
> Note that this doesn't include any registry restriction.
>
> The code loops across the returned fontsets, calling
> ftfont_pattern_entity() to generate font_entity structs.  But at no
> point does it attempt to filter the font list by compatible
> registries.
> We get, for example:
>
>   (gdb) frame
>   #0  ftfont_pattern_entity (p=0x89fce40, frame=148009620, registry=138791553) at /home/upham/src/emacs/Apollo/emacs-cvs/src/ftfont.c:116
>   (gdb) p file
>   $151 = (FcChar8 *) 0x8a70688
>   "/home/upham/.fonts/jmk/neep-alt-iso8859-1-06x11.pcf.gz"
>   (gdb) p registry
>   $152 = 138791553
>   (gdb) xpr registry
>   Lisp_Symbol
>   $153 = (struct Lisp_Symbol *) 0x845ca80
>   "iso10646-1"
>
> Emacs will think that "neep-alt-iso8859-1-06x11.pcf.gz" is a valid
> font for displaying "iso10646-1", but it isn't, and we end up with
> missing code points.
>
> This explains why removing the iso8859-1 fonts fixed the problem
> (except for the mode line file name): the current code also points
> iso8859-1 requesters to iso10646-1 fonts, and those always work.  I
> also think this explains why I don't see this consistently across
> hosts: depending on how the font list is ordered (maybe due to inode
> ordering on disk?), some hosts will get a correct iso10646-1 ->
> iso10646-1 mapping first at display time, while others will get an
> incorrect iso10646-1 -> iso8859-1 mapping.
>
> Another family that should have the same problem is misc-fixed, as it
> also has both iso8859-1 and iso10646-1 registry fonts.  There may be
> other families that I'm not aware of.

Here's a patch against today's CVS HEAD.

The ftfont_spec_pattern() function generates an FcPattern object that
can be used to list only the fonts matching the spec.  For the
purposes of this discussion, there are two "interesting" ways of
restricting patterns: via charset (FcCharSet), or via langset
(FcLangSet).  The former requires the font to have each of the
codepoints listed in the FcCharSet.  The latter requires the font to
support all the languages in the FcLangSet.

  1. If we pass a font spec with registry ISO-8859 to
     ftfont_spec_pattern(), then the code sets up an FcCharSet that
     has every ASCII codepoint (but not the Latin-1 range; that part
     is commented out for some reason).

  2. If we pass a font spec with a non-ISO-8859, non-ISO-10646,
     non-Unicode-BMP registry, the function immediately returns an
     empty pattern.

  3. ISO-10646 and Unicode-BMP registries are handled in a more
     complicated manner...

If the ISO-10646 font spec has an associated :script parameter (or an
OpenType spec that refers to a script), the code looks in
'script-representative-chars' for codepoints to put into a charset.
If the font spec has an associated language, the code adds the
language to the langset.

However, an ISO-10646 font spec without a special script or language
ends up with neither a charset nor a langset.  The resulting pattern
places no restriction on characters or languages, so it will match
*any* font.  In particular, it will let an ISO-8859 font match the
ISO-10646 spec.

The fix below checks for a missing charset and missing langset.  In
that case, we create a charset with at least one ISO-10646 codepoint
outside of ISO-8859.  The charset should be as small as possible,
since a font missing any one of the charset's codepoints no longer
matches at all.  I have chosen LEFT DOUBLE QUOTATION MARK (U+201C),
which is associated with English and which I believe is pervasive.
With the new charset restriction, ISO-8859 fonts are no longer
considered matches and the font mismatch problem goes away.

(We could add codepoints 32 through 127 and 192 through 255 to the
ISO-10646 charset, but it's unlikely that any font advertising itself
as ISO-10646 will be missing those codepoints.  If we do need those
extra codepoints, we can copy the implementation from
ftfont_build_basic_charsets().)
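To make the matching rule concrete, here is a toy model in plain C.
It does not call fontconfig, and it is not how fontconfig or ftfont.c
is implemented.  Every name in it (cp_range, toy_font,
toy_font_matches, the two sample fonts and their coverage ranges) is
made up for illustration.  The only behavior it assumes is the one
described above: a font matches a charset restriction only if it has
every codepoint in that charset, so an empty restriction matches
everything.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a font's coverage: inclusive codepoint
   ranges.  Real fonts and FcCharSet are more elaborate than this.  */
struct cp_range
{
  uint32_t first, last;
};

struct toy_font
{
  const char *name;
  const struct cp_range *ranges;
  size_t n_ranges;
};

/* Return true if FONT covers codepoint CP.  */
static bool
toy_font_has_char (const struct toy_font *font, uint32_t cp)
{
  for (size_t i = 0; i < font->n_ranges; i++)
    if (cp >= font->ranges[i].first && cp <= font->ranges[i].last)
      return true;
  return false;
}

/* Return true if FONT covers every codepoint in REQUIRED.  With
   N_REQUIRED == 0 the loop never runs, so any font matches; this is
   the "no charset, no langset" case from the message.  */
static bool
toy_font_matches (const struct toy_font *font,
                  const uint32_t *required, size_t n_required)
{
  for (size_t i = 0; i < n_required; i++)
    if (! toy_font_has_char (font, required[i]))
      return false;
  return true;
}

/* Assumed coverage for an ISO-8859-1 style font.  */
static const struct cp_range latin1_ranges[] =
  { { 0x20, 0x7E }, { 0xA0, 0xFF } };

/* Assumed coverage for an ISO-10646 style font: the same ranges plus
   a slice of General Punctuation that contains U+201C.  */
static const struct cp_range ucs_ranges[] =
  { { 0x20, 0x7E }, { 0xA0, 0xFF }, { 0x2010, 0x2027 } };

static const struct toy_font latin1_font =
  { "toy-iso8859-1", latin1_ranges,
    sizeof latin1_ranges / sizeof latin1_ranges[0] };

static const struct toy_font ucs_font =
  { "toy-iso10646-1", ucs_ranges,
    sizeof ucs_ranges / sizeof ucs_ranges[0] };

/* The one-codepoint restriction: LEFT DOUBLE QUOTATION MARK.  */
static const uint32_t required_quote[] = { 0x201C };
```

With an empty requirement, toy_font_matches() accepts both sample
fonts.  With required_quote it accepts only the ISO-10646 style font,
because the Latin-1 style font has no U+201C.  That is also why the
restriction should stay small: each extra required codepoint is
another way for an otherwise usable font to be rejected.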
Derek

-- 
Derek Upham
sand@blarg.net

------------------------------ cut here ------------------------------
Index: ftfont.c
===================================================================
RCS file: /sources/emacs/emacs/src/ftfont.c,v
retrieving revision 1.9
diff -u -u -r1.9 ftfont.c
--- ftfont.c    3 Apr 2008 08:16:54 -0000       1.9
+++ ftfont.c    6 May 2008 21:08:44 -0000
@@ -38,6 +38,9 @@
 #include "font.h"
 #include "ftfont.h"
 
+/* Codepoint in ISO-10646 that most English fonts will have.  */
+#define CODEPOINT_ISO10646_ENGLISH 0x201C /* LEFT DOUBLE QUOTATION MARK */
+
 /* Symbolic type of this font-driver.  */
 Lisp_Object Qfreetype;
 
@@ -521,6 +524,20 @@
         }
     }
 
+  /* Lack of charset and langset at this point indicates a requested
+     ISO-10646 registry with no special script or language
+     requirement.  We need a charset with some codepoint outside of
+     the ISO-8859-* range that most "English" fonts will have.
+     Otherwise the resulting pattern will also match ISO-8859 fonts.  */
+  if (! charset && ! langset)
+    {
+      charset = FcCharSetCreate ();
+      if (! charset)
+        goto err;
+      if (! FcCharSetAddChar (charset, CODEPOINT_ISO10646_ENGLISH))
+        goto err;
+    }
+
   pattern = FcPatternCreate ();
   if (! pattern)
     goto err;