From: "Stephen J. Turnbull"
Newsgroups: gmane.emacs.devel
Subject: Re: Choice of fonts displaying etc/HELLO
Date: Thu, 07 Aug 2008 00:52:13 +0900
Message-ID: <873aliw382.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <48900ED2.2000703@gnu.org> <4890670C.9000009@gnu.org> <48906865.4000808@gnu.org> <48907856.6040308@gnu.org> <48930CE4.5080305@gnu.org> <874p5ywtz4.fsf@uwakimon.sk.tsukuba.ac.jp>
To: Kenichi Handa
Cc: lekktu@gmail.com, eliz@gnu.org, jasonr@gnu.org, emacs-devel@gnu.org

Kenichi Handa writes:

> In article <874p5ywtz4.fsf@uwakimon.sk.tsukuba.ac.jp>, "Stephen J. Turnbull" writes:
> > Emacs could adapt fontconfig's "orthography" mechanism instead.
>
> But, for selecting fonts for symbols (the current case is
> U+2200 [FOR ALL]), such a mechanism doesn't work.

Of course.  Here's how I think about it.  Historically, East Asian
coded character sets have tended to try to be UCSes, including
everything that might be needed, since it was possible in a MBCS.
Trying to implement single octet codes with code page switching made
no sense.  That approach is unintuitive to Westerners who are used to
having separate "code pages" or fonts for specialty usage like
mathematics, and it has its practical limits for East Asians, too,
what with the addition of 11,000 pre-composed Hangul and the CNS with
~80,000 code points.

Of course, now we have a true UCS (even though it has some problems)
in Unicode, so we should use it.  And now FOR ALL should not be
considered a "Japanese" character even if it does have a code point in
some JIS standard.
I would say the same for GREEK SMALL LETTER ALPHA.

It's also very annoying in practical use (in Emacs, anyway) that GREEK
SMALL LETTER ALPHA (not to mention LATIN SMALL LETTER A) has multiple
encodings in the "native" coded character set and several others.  Of
course this can be useful if you happen not to have a math font but do
have a Greek font or Japanese font that contains ALPHA or FOR ALL.
For those purposes, fontconfig's character set feature is exactly what
you want.

> In addition, currently, Emacs doesn't know in which language
> a text is written.  So, we can't use an appropriate ":lang"
> property of fontconfig.

Well, for most users almost all of the time we do.  The LANG
environment variable will tell us.  This will make most users very
happy at little cost in coding.

Agreed, in multilingual use, we can't use fontconfig directly.  My
idea is that fontconfig has already constructed a database of language
repertoires and operations which might help in doing analysis of a
text to determine its language.  Also, instead of using the UCS-like
repertoires of East Asian scripts to determine character categories, I
suggest the fontconfig repertoires are more appropriate and will lead
to more attractive presentation for users who have appropriate fonts.
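
To make the two suggestions above concrete, here is a minimal sketch in Python (chosen only for brevity; it is not Emacs code and does not call fontconfig itself).  It assumes the usual POSIX locale form "ll_CC.codeset@modifier" and builds a fontconfig-style pattern string such as "Sans:lang=ja".  The function names, the choice to also consult LC_ALL and LC_CTYPE before LANG, and the tiny font repertoires in the usage example are all hypothetical and exist only for illustration.

```python
import os
import re


def lang_from_locale(locale_value):
    """Return a lowercase language tag such as 'ja', or None if unknown."""
    if not locale_value:
        return None
    # Drop the codeset (".UTF-8") and modifier ("@euro") parts.
    base = re.split(r"[.@]", locale_value, maxsplit=1)[0]
    if base in ("C", "POSIX"):
        return None
    parts = base.split("_")
    lang = parts[0].lower()
    if not lang.isalpha():
        return None
    # Assumption: keep the territory only for Chinese, where it selects
    # between different repertoires (e.g. zh-cn versus zh-tw).
    if lang == "zh" and len(parts) > 1:
        return lang + "-" + parts[1].lower()
    return lang


def guess_lang(env=None):
    """Guess the user's language from the first non-empty locale variable."""
    env = os.environ if env is None else env
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if value:
            return lang_from_locale(value)
    return None


def font_pattern(family, lang):
    """Build a fontconfig-style pattern, adding :lang only when known."""
    return family + ":lang=" + lang if lang else family


def fonts_covering(codepoint, repertoires):
    """List the fonts whose character set contains CODEPOINT.

    REPERTOIRES maps a font name to a set of code points; a real
    implementation would obtain these sets from fontconfig.
    """
    return [name for name, chars in repertoires.items() if codepoint in chars]
```

For example, with LANG=ja_JP.UTF-8 the pattern becomes "Sans:lang=ja", while a lookup for U+2200 ignores language entirely and asks only which fonts contain the character:

```python
pattern = font_pattern("Sans", guess_lang({"LANG": "ja_JP.UTF-8"}))

# Hypothetical repertoires, not real font data.
repertoires = {
    "SomeMathFont": {0x2200, 0x2203},
    "SomeLatinFont": {0x0041, 0x0061},
    "SomeJapaneseFont": {0x2200, 0x3042},
}
candidates = fonts_covering(0x2200, repertoires)
```

The point of separating the two functions is the one made above: language is a property of the user or the text, whereas coverage of a symbol like FOR ALL is a property of the font.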