* Font back end font selection process
From: Adrian Robert @ 2009-06-07 3:54 UTC
To: Emacs-Devel devel

I am working on updating the NS font driver to work with script and friends so that correct non-ASCII fonts can be chosen using the default fontset skeleton mechanism. The back end seems to use these methods to request a font from the list() method:

- registry in the font spec proper
- :script property in "extra" properties
- :lang property in "extra"
- part of the :otf property bundle in "extra"

I haven't found a way to respond to the first type of query using Cocoa APIs yet. The others that get requested, and the order, seem to depend on the language in question. In particular, for some languages like Thai, only OTF requests ever seem to get made. It seems like this class might be the scripts requiring compositional rendering, but why, since emacs used to be able to handle compositional rendering without making use of any OTF-specific properties provided by a font driver?

Also, I have often noticed that when given a Chinese text file (encoded in UTF-8), the only request that comes through is :lang=ja. How should the font driver know to return a kanji font instead of hiragana / katakana? Wouldn't it be better to request :script=han, adding :lang=ja or :lang=zh only if emacs has some knowledge that the file IS actually in one of these languages? The file encoding might be one piece of information to take into account, but when it is UTF-8 it would need to run some kind of lexical analysis, or query the user.

I also noticed that if no entities are returned from a list() request with a family and a script specified, it next makes a list() request with no family specified. Instead of this it would be good to request a match() with the family still specified, as this gives the driver the opportunity to find a font that "looks like" the family (e.g. presence of serifs, etc.), instead of just a random font covering the needed characters. Indeed, I have not noticed match() being called at all when searching for a font for a script; instead the back end just goes with the ASCII font (and renders boxes) before ever making such a request.
* Font back end font selection process
From: Stephen J. Turnbull @ 2009-06-07 5:59 UTC
To: Adrian Robert; +Cc: Emacs-Devel devel

Adrian Robert writes:

> Also, often I have noticed that when given a Chinese text file
> (encoded in UTF-8), the only request that comes through is :lang=ja.

That reflects the historical origin of Mule, I would guess.

> How should the font driver know to return a kanji font instead of
> hiragana / katakana?

If kana are present, it's Japanese. If Hangul are present, it's Korean. If the accents outnumber the base characters, it's Vietnamese. Otherwise, it's Chinese.

There are more precise criteria based on usage of simplified characters, but that would be good enough for a start.
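The four rules in this message can be written down as a short sketch. The Python below is only an illustration: the function name `guess_cjk_language` and the code point ranges are choices made for this example and do not come from Emacs. It takes the rules literally, including "accents outnumber the base characters", which ordinary Vietnamese text may well not satisfy.

```python
import unicodedata


def guess_cjk_language(text):
    """Guess 'ja', 'ko', 'vi' or 'zh' from the characters in text.

    Hypothetical helper that applies the four rules in order.
    """
    kana = hangul = marks = bases = 0
    # NFD splits precomposed letters into base + combining marks,
    # and Hangul syllables into conjoining jamo.
    for ch in unicodedata.normalize("NFD", text):
        cp = ord(ch)
        if 0x3040 <= cp <= 0x30FF:        # Hiragana and Katakana blocks
            kana += 1
        elif 0x1100 <= cp <= 0x11FF or 0xAC00 <= cp <= 0xD7A3:
            hangul += 1                   # Hangul jamo / syllables
        elif unicodedata.combining(ch):   # non-zero class: a combining mark
            marks += 1
        elif ch.isalpha():                # any other letter, Han included
            bases += 1
    if kana:
        return "ja"
    if hangul:
        return "ko"
    if marks > bases:
        return "vi"
    return "zh"
```

A real implementation would need the "more precise criteria" mentioned above; this sketch only shows how cheap the first approximation is.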
* Re: Font back end font selection process
From: Kenichi Handa @ 2009-06-08 2:49 UTC
To: Adrian Robert; +Cc: emacs-devel

In article <8BA022EF-AACD-495A-ABBB-24B230475217@gmail.com>, Adrian Robert <adrian.b.robert@gmail.com> writes:

> I am working on updating the NS font driver to work with script and
> friends so that correct non-ASCII fonts can be chosen using the
> default fontset skeleton mechanism. The back end seems to use these
> methods to request a font from the list() method:

> - registry in the font spec proper
> - :script property in "extra" properties
> - :lang property in "extra"
> - part of the :otf property bundle in "extra"

> I haven't found a way to respond to the first type of query using
> Cocoa APIs yet.

In that case, you can simply reject any registry other than "iso10646-1".

> The others that get requested, and the order, seem
> to depend on the language in question. In particular, for some
> languages like Thai, only OTF requests ever seem to get made. It
> seems like this class might be the scripts requiring compositional
> rendering, but why, since emacs used to be able to handle
> compositional rendering without making use of any OTF-specific
> properties provided by a font driver?

Emacs 23 can still use a non-OTF Thai font if the registry is tis620 or iso8859-11. The default fontset has this entry for Thai:

    (thai ,(font-spec :registry "iso10646-1" :otf '(thai nil nil (mark)))
          (nil . "TIS620*")
          (nil . "ISO8859-11"))

The reason why I added :otf for "iso10646-1" is that now we have many OTF Thai fonts usable with the Xft font-backend (and perhaps with the uniscribe backend). OTF Thai fonts provide better Thai rendering than the simple relative stacking method of Emacs 22. But, if OTF is not available on Cocoa, I'll change the entry for Thai to something like this:

    (thai ,(font-spec :registry "iso10646-1" :otf '(thai nil nil (mark)))
          ,(font-spec :registry "iso10646-1" :script 'thai)
          (nil . "TIS620*")
          (nil . "ISO8859-11"))

Does it solve your problem?

> Also, often I have noticed that when given a Chinese text file
> (encoded in UTF-8), the only request that comes through is :lang=ja.

?? For han script, the default fontset has this entry:

    (han (nil . "GB2312.1980-0")
         (nil . "JISX0208*")
         (nil . "JISX0212*")
         (nil . "big5*")
         (nil . "KSC5601.1987*")
         (nil . "CNS11643.1992-1")
         (nil . "CNS11643.1992-2")
         (nil . "CNS11643.1992-3")
         (nil . "CNS11643.1992-4")
         (nil . "CNS11643.1992-5")
         (nil . "CNS11643.1992-6")
         (nil . "CNS11643.1992-7")
         (nil . "gbk-0")
         (nil . "gb18030")
         (nil . "JISX0213.2000-1")
         (nil . "JISX0213.2000-2")
         (nil . "JISX0213.2004-1")
         ,(font-spec :registry "iso10646-1" :lang 'ja)
         ,(font-spec :registry "iso10646-1" :lang 'zh))

So, not only `ja', emacs should try `zh' if `ja' is not available. Doesn't it happen on Cocoa?

> How should the font driver know to return a kanji font instead of
> hiragana / katakana?

A font driver can return any 'ja' iso10646-1 fonts for this request (even if the font supports only kana):

    ,(font-spec :registry "iso10646-1" :lang 'ja)

If the first font in the returned list doesn't support a specific han character, Emacs tries another font in the returned list.

> Wouldn't it be better to
> request :script=han, adding :lang=ja or :lang=zh only if emacs has
> some knowledge that the file IS actually in one of these languages?
> The file encoding might be one piece of information to take into
> account, but when it is UTF-8 it would need to run some kind of
> lexical analysis, or query the user.

If the buffer file is in UTF-8, Emacs currently does this:

  If the current lang. env. is "Japanese", try :lang=ja before :lang=zh.
  If the current lang. env. is "Chinese-XXX", try :lang=zh before :lang=ja.
  Otherwise, try in the order in which the default fontset is defined (thus :lang=ja first).

I've thought that should work well in most cases. "Some kind of lexical analysis" is surely very good, but currently we don't have that facility. And "query the user" is too annoying. I think it is better to provide a good user interface for specifying a font for each script (or range of characters).

> I also noticed that if no entities are returned from a list() request
> with a family and a script specified, it next makes a list() request
> with no family specified. Instead of this it would be good to
> request a match() with the family still specified, as this gives the
> driver the opportunity to find a font that "looks like" the family
> (e.g. presence of serifs, etc.), instead of just a random font
> covering the needed characters. Indeed, I have not noticed match()
> being called at all when searching for a font for a script -- instead
> the back end just goes with the ascii font (and rendering boxes)
> before ever making such a request.

Ah, that sounds like a good idea. Another way is to allow font drivers to also list fonts of similar families (sorted by the closeness of family) and to modify font_sort_entities to preserve the order of the list when properties other than family are the same.

---
Kenichi Handa
handa@m17n.org
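The language-environment rule described in this message (for UTF-8 buffers) can be sketched in a few lines. This is a Python illustration under stated assumptions, not Emacs code: the function name `order_han_langs` is made up, and treating any environment name that starts with "Chinese-" as the "Chinese-XXX" case is this sketch's reading of the message.

```python
def order_han_langs(lang_env, default_order=("ja", "zh")):
    """Return the order in which :lang values would be tried for han.

    Hypothetical helper; default_order mirrors the order of the two
    :lang font-specs in the han entry quoted in the message.
    """
    if lang_env == "Japanese":
        preferred = "ja"
    elif lang_env.startswith("Chinese-"):
        preferred = "zh"
    else:
        # No hint from the language environment: keep the fontset order.
        return list(default_order)
    rest = [lang for lang in default_order if lang != preferred]
    return [preferred] + rest
```

With these assumptions, `order_han_langs("Chinese-GB")` puts `zh` first, while any unrelated environment keeps `ja` first.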
* Re: Font back end font selection process
From: Adrian Robert @ 2009-06-10 7:27 UTC
To: Kenichi Handa; +Cc: emacs-devel

On Jun 8, 2009, at 9:49 AM, Kenichi Handa wrote:

> In article <8BA022EF-AACD-495A-ABBB-24B230475217@gmail.com>, Adrian
> Robert <adrian.b.robert@gmail.com> writes:
>>
>> - registry in the font spec proper
>> - :script property in "extra" properties
>> - :lang property in "extra"
>> - part of the :otf property bundle in "extra"
>
>> I haven't found a way to respond to the first type of query using
>> Cocoa APIs yet.
>
> In that case, you can simply reject any registry other than
> "iso10646-1".

OK, that's what we are doing.

>> In particular, for some
>> languages like Thai, only OTF requests ever seem to get made. It
>> seems like this class might be the scripts requiring compositional
>> rendering, but why, since emacs used to be able to handle
>> compositional rendering without making use of any OTF-specific
>> properties provided by a font driver?
>
> Emacs 23 can still use a non-OTF Thai font if the registry is
> tis620 or iso8859-11. The default fontset has this entry
> for Thai:
>
>     (thai ,(font-spec :registry "iso10646-1" :otf '(thai nil nil (mark)))
>           (nil . "TIS620*")
>           (nil . "ISO8859-11"))
>
> The reason why I added :otf for "iso10646-1" is that now we
> have many OTF Thai fonts usable with the Xft font-backend (and
> perhaps with the uniscribe backend). OTF Thai fonts provide
> better Thai rendering than the simple relative stacking
> method of Emacs 22. But, if OTF is not available on Cocoa,
> I'll change the entry for Thai to something like this:
>
>     (thai ,(font-spec :registry "iso10646-1" :otf '(thai nil nil (mark)))
>           ,(font-spec :registry "iso10646-1" :script 'thai)
>           (nil . "TIS620*")
>           (nil . "ISO8859-11"))
>
> Does it solve your problem?

Currently I'm just responding to the 'thai' in :otf with a Thai font and it seems to work reasonably. None of the otf functions are implemented in the NS font driver and I'm unsure whether they can be, but emacs' text layout must fall back to stacking automatically. If it would be better to refuse the :otf list() request at this stage then adding the :script 'thai entry would be good. The same goes for other entries in the default fontset that use :otf in the same way.

>> Also, often I have noticed that when given a Chinese text file
>> (encoded in UTF-8), the only request that comes through is :lang=ja.
>
> ?? For han script, the default fontset has this entry:
>
>     (han (nil . "GB2312.1980-0")
>          (nil . "JISX0208*")
>          (nil . "JISX0212*")
>          (nil . "big5*")
>          ...
>          ,(font-spec :registry "iso10646-1" :lang 'ja)
>          ,(font-spec :registry "iso10646-1" :lang 'zh))

Why not have

    (font-spec :registry "iso10646-1" :script 'han)

before the lang entries?

> So, not only `ja', emacs should try `zh' if `ja' is not
> available. Doesn't it happen on Cocoa?

As long as there are Japanese fonts on the system (always true on OS X), the first 'ja request will return fonts and the 'zh one will never get made.

>> How should the font driver know to return a kanji font instead of
>> hiragana / katakana?
>
> A font driver can return any 'ja' iso10646-1 fonts for this
> request (even if the font supports only kana):
>
>     ,(font-spec :registry "iso10646-1" :lang 'ja)
>
> If the first font in the returned list doesn't support a
> specific han character, Emacs tries another font in the
> returned list.

Ah, OK, so for purposes of list() the driver should treat :lang='ja as "kana | kanji" instead of "kana & kanji", and treat kanji itself as "kanji | hanzi".
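The observation in this message, that the 'zh request is never made once the 'ja request returns fonts, follows from walking the fontset entries in order and stopping at the first non-empty result. The Python below is a toy model of that behaviour as described in the thread, not the actual Emacs logic; `first_nonempty`, `fake_list`, and the font name are all invented for the example.

```python
def first_nonempty(specs, list_fn):
    """Try each font spec in order; stop at the first non-empty result.

    Returns (fonts, requests_made) so the caller can see which
    requests actually reached the driver.
    """
    requests_made = []
    for spec in specs:
        requests_made.append(spec)
        fonts = list_fn(spec)
        if fonts:
            return fonts, requests_made
    return [], requests_made


def fake_list(spec):
    """Stand-in driver: a system that always has a Japanese font."""
    if spec.get("lang") == "ja":
        return ["SomeJapaneseFont"]  # made-up font name
    return []


han_specs = [
    {"registry": "iso10646-1", "lang": "ja"},
    {"registry": "iso10646-1", "lang": "zh"},
]
fonts, requests_made = first_nonempty(han_specs, fake_list)
```

In this model only the `ja` spec ends up in `requests_made`, which matches what the NS driver sees on a system where Japanese fonts are always installed.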
* Re: Font back end font selection process
From: Kenichi Handa @ 2009-06-10 11:04 UTC
To: Adrian Robert; +Cc: emacs-devel

In article <E11A11F8-6256-4C97-A8FC-C8CA036E5002@gmail.com>, Adrian Robert <adrian.b.robert@gmail.com> writes:

> Currently I'm just responding to the 'thai' in :otf with a Thai font
> and it seems to work reasonably. None of the otf functions are
> implemented in the NS font driver and I'm unsure whether they can be,
> but emacs' text layout must fall back to stacking automatically. If
> it would be better to refuse the :otf list() request at this stage
> then adding the :script 'thai entry would be good. The same goes for
> other entries in the default fontset that use :otf in the same way.

If the NS backend doesn't support OTF, it is better that the `list' method return nil for that request. So, I'll add

    ,(font-spec :registry "iso10646-1" :script 'thai)

for Thai. By the way, for lao, the default fontset already has this entry after the entry specifying the :otf property:

    ,(font-spec :registry "iso10646-1" :script 'lao)

But, for the other scripts that request OTF, it is impossible to implement a fallback method. Simple stacking doesn't work for them.

>>> Also, often I have noticed that when given a Chinese text file
>>> (encoded in UTF-8), the only request that comes through is :lang=ja.
>>
>> ?? For han script, the default fontset has this entry:
>>
>>     (han (nil . "GB2312.1980-0")
>>          (nil . "JISX0208*")
>>          (nil . "JISX0212*")
>>          (nil . "big5*")
>>          ...
>>          ,(font-spec :registry "iso10646-1" :lang 'ja)
>>          ,(font-spec :registry "iso10646-1" :lang 'zh))

> Why not have
>     (font-spec :registry "iso10646-1" :script 'han)
> before the lang entries?

Just to reduce the number of font-specs to try. Here I assume that a font that supports han script supports ja and/or zh, and thus adding the entry of :script 'han is redundant.

>> So, not only `ja', emacs should try `zh' if `ja' is not
>> available. Doesn't it happen on Cocoa?

>>> How should the font driver know to return a kanji font instead of
>>> hiragana / katakana?
>>
>> A font driver can return any 'ja' iso10646-1 fonts for this
>> request (even if the font supports only kana):
>>
>>     ,(font-spec :registry "iso10646-1" :lang 'ja)
>>
>> If the first font in the returned list doesn't support a
>> specific han character, Emacs tries another font in the
>> returned list.

> Ah, OK so for purposes of list() the driver should treat :lang='ja as
> "kana | kanji" instead of "kana & kanji", and treat kanji itself as
> "kanji | hanzi".

Yes.

---
Kenichi Handa
handa@m17n.org
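The driver-side conclusions of this thread (reject registries other than "iso10646-1", return nothing for :otf requests when OTF is unsupported, and treat :lang 'ja as "kana | kanji") can be collected into one sketch of a list() method. This is Python pseudostructure for illustration only; the real NS driver is not written this way, and `Font`, `LANG_SCRIPTS`, `list_fonts`, and the font names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Font:
    name: str            # made-up font name
    scripts: frozenset   # scripts the font covers, e.g. {"kana", "han"}


# Assumed mapping: a :lang request matches a font that covers ANY of
# the listed scripts ("kana | kanji"), not all of them.
LANG_SCRIPTS = {"ja": {"kana", "han"}, "zh": {"han"}}


def list_fonts(spec, installed):
    """Sketch of list() for a driver without registry or OTF support."""
    if spec.get("registry") != "iso10646-1":
        return []                      # only Unicode fonts are offered
    if "otf" in spec:
        return []                      # no OTF support: refuse the request
    if "script" in spec:
        wanted = {spec["script"]}
    elif "lang" in spec:
        wanted = LANG_SCRIPTS.get(spec["lang"], set())
    else:
        return list(installed)         # nothing to filter on
    return [f for f in installed if f.scripts & wanted]


installed = [
    Font("KanaOnly", frozenset({"kana"})),
    Font("HanOnly", frozenset({"han"})),
    Font("LatinOnly", frozenset({"latin"})),
]
ja_fonts = list_fonts({"registry": "iso10646-1", "lang": "ja"}, installed)
```

Here `ja_fonts` contains both the kana-only and the han-only font; choosing one that actually covers a given character is then left to Emacs, which tries the fonts in the returned list in turn.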