From: Kenichi Handa
Newsgroups: gmane.emacs.devel
Subject: Re: Font back end font selection process
Date: Mon, 08 Jun 2009 11:49:52 +0900
To: Adrian Robert
Cc: emacs-devel@gnu.org
In-reply-to: <8BA022EF-AACD-495A-ABBB-24B230475217@gmail.com> (message from Adrian Robert on Sun, 7 Jun 2009 10:54:09 +0700)

In article <8BA022EF-AACD-495A-ABBB-24B230475217@gmail.com>, Adrian Robert writes:

> I am working on updating the NS font driver to work with script and
> friends so that correct nonASCII fonts can be chosen using the
> default fontset skeleton mechanism.  The back end seems to use these
> methods to request a font from the list() method:
> - registry in the font spec proper
> - :script property in "extra" properties
> - :lang property in "extra"
> - part of the :otf property bundle in "extra"
> I haven't found a way to respond to the first type of query using
> Cocoa APIs yet.

In that case, you can simply reject any registry other than
"iso10646-1".

> The others that get requested, and the order, seems
> to depend on the language in question.  In particular, for some
> languages like Thai, only OTF requests ever seem to get made.
> It seems like this class might be the scripts requiring compositional
> rendering, but why, since emacs used to be able to handle
> compositional rendering without making use of any OTF-specific
> properties provided by a font driver?

Emacs 23 can still use a non-OTF Thai font if the registry is tis620
or iso8859-11.  The default fontset has this entry for Thai:

    (thai ,(font-spec :registry "iso10646-1" :otf '(thai nil nil (mark)))
          (nil . "TIS620*")
          (nil . "ISO8859-11"))

The reason why I added :otf for "iso10646-1" is that we now have many
OTF Thai fonts usable with the Xft font backend (and perhaps with the
uniscribe backend).  OTF Thai fonts provide better Thai rendering than
the simple relative stacking method of Emacs 22.

But if OTF is not available on Cocoa, I'll change the entry for Thai
to something like this:

    (thai ,(font-spec :registry "iso10646-1" :otf '(thai nil nil (mark)))
          ,(font-spec :registry "iso10646-1" :script 'thai)
          (nil . "TIS620*")
          (nil . "ISO8859-11"))

Does that solve your problem?

> Also, often I have noticed that when given a Chinese text file
> (encoded in UTF-8), the only request that comes through is :lang=ja.

??  For han script, the default fontset has this entry:

    (han (nil . "GB2312.1980-0")
         (nil . "JISX0208*")
         (nil . "JISX0212*")
         (nil . "big5*")
         (nil . "KSC5601.1987*")
         (nil . "CNS11643.1992-1")
         (nil . "CNS11643.1992-2")
         (nil . "CNS11643.1992-3")
         (nil . "CNS11643.1992-4")
         (nil . "CNS11643.1992-5")
         (nil . "CNS11643.1992-6")
         (nil . "CNS11643.1992-7")
         (nil . "gbk-0")
         (nil . "gb18030")
         (nil . "JISX0213.2000-1")
         (nil . "JISX0213.2000-2")
         (nil . "JISX0213.2004-1")
         ,(font-spec :registry "iso10646-1" :lang 'ja)
         ,(font-spec :registry "iso10646-1" :lang 'zh))

So Emacs should try not only `ja' but also `zh' if `ja' is not
available.  Doesn't that happen on Cocoa?

> How should the font driver know to return a kanji font instead of
> hiragana / katakana?.

A font driver can return any 'ja' iso10646-1 fonts for this request
(even if a font supports only kana):

    ,(font-spec :registry "iso10646-1" :lang 'ja)

If the first font in the returned list doesn't support a specific han
character, Emacs tries another font in the returned list.

> Wouldn't it would be better to
> request :script=han, adding :lang=ja or :lang=zh only if emacs has
> some knowledge that the file IS actually in one of these languages?
> The file encoding might be one piece of information to take into
> account, but when it is UTF-8 it would need to run some kind of
> lexical analysis, or query the user.

If the buffer file is in UTF-8, Emacs currently does this:

  - If the current lang. env. is "Japanese", try :lang=ja before
    :lang=zh.
  - If the current lang. env. is "Chinese-XXX", try :lang=zh before
    :lang=ja.
  - Otherwise, try in the order in which the default fontset is
    defined (thus :lang=ja first).

I thought that should work well in most cases.  "Some kind of lexical
analysis" would surely be very good, but currently we don't have that
facility.  And "query the user" is too annoying.  I think it is better
to provide a good user interface for specifying a font for each script
(or range of characters).

> I also noticed that if no entities are returned from a list() request
> with a family and a script specified, it next makes a list() request
> with no family specified.  Instead of this it would be good to
> request a match() with the family still specified, as this gives the
> driver the opportunity to find a font that "looks like" the family
> (e.g. presence of serifs, etc.), instead of just a random font
> covering the needed characters.  Indeed, I have not noticed match()
> being called at all when searching for a font for a script -- instead
> the back end just goes with the ascii font (and rendering boxes)
> before ever making such a request.

Ah, that sounds like a good idea.
Another way is to allow font drivers to also list fonts of similar
families (sorted by closeness of family), and to modify
font_sort_entities to preserve the order of the list if the
properties other than family are the same.

---
Kenichi Handa
handa@m17n.org
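
The :lang ordering for UTF-8 buffers described above could be sketched
roughly as follows.  This is only an illustration under stated
assumptions, not the actual Emacs code: the function name
`sketch-han-lang-order' is hypothetical, and it assumes the language
environment is passed in as a plain string.

```elisp
;; Hypothetical helper, not part of Emacs.  It only models the order in
;; which the two iso10646-1 :lang font-specs of the han entry are tried.
(defun sketch-han-lang-order (lang-env)
  "Return the :lang symbols to try for han script under LANG-ENV.
LANG-ENV is a language environment name such as \"Japanese\"."
  (cond
   ;; Japanese environment: prefer ja, then fall back to zh.
   ((string= lang-env "Japanese") '(ja zh))
   ;; Any Chinese-XXX environment: prefer zh, then fall back to ja.
   ((string-prefix-p "Chinese-" lang-env) '(zh ja))
   ;; Otherwise follow the default fontset, which lists ja before zh.
   (t '(ja zh))))
```

For example, (sketch-han-lang-order "Chinese-GB") returns (zh ja), so
the :lang 'zh font-spec would be requested from the font driver before
the :lang 'ja one.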