From mboxrd@z Thu Jan 1 00:00:00 1970
From: Kevin Rodgers
Newsgroups: gmane.emacs.help
Subject: Re: More unicode blocks?
Date: Wed, 28 Sep 2005 12:22:35 -0600
Message-ID: <433ADF6B.1070109@yahoo.com>
References: <43359AB0.1050300@msa.hinet.net>
Cc: Shaddy.Baddah@msa.hinet.net

Shaddy Baddah wrote:
> Today, I finally did what I had resolved to do some time ago. I delved
> into emacs's unicode support facilities.
>
> I am a little disappointed, because it has become apparent that the
> unicode character set support is limited to 3 specific blocks of the
> full unicode character set, those being the blocks that start and end at
> the indexes expressed in mule-unicode-0100-24ff, mule-unicode-2500-33ff
> and mule-unicode-e000-ffff.
>
> The blocks that I am interested in are the CJK Unified Ideographs blocks,
> that start at unicode index 0x4E00. Specifically, the characters that
> are shared by the character set encoded via the big5 encoding scheme.

Perhaps you should try Emacs 22 (aka CVS Emacs).  Here are some items
from its etc/NEWS file:

---
*** The utf-8/16 coding systems have been enhanced.
By default, untranslatable utf-8 sequences are simply composed into
single quasi-characters.
User option `utf-translate-cjk-mode' (it is turned on by default)
arranges to translate many utf-8 CJK character sequences into real Emacs
characters in a similar way to the Mule-UCS system.  As this loads a
fairly big data on demand, people who are not interested in CJK
characters may want to customize it to nil.  You can augment/amend the
CJK translation via hash tables `ucs-mule-cjk-to-unicode' and
`ucs-unicode-to-mule-cjk'.  The utf-8 coding system now also encodes
characters from most of Emacs's one-dimensional internal charsets,
specifically the ISO-8859 ones.  The utf-16 coding system is affected
similarly.

---
*** A new coding system `euc-tw' has been added for traditional Chinese
in CNS encoding; it accepts both Big 5 and CNS as input; on saving, Big
5 is then converted to CNS.

---
*** New variable `utf-translate-cjk-unicode-range' controls which
Unicode characters to translate in `utf-translate-cjk-mode'.

---
*** iso-10646-1 (`Unicode') fonts can be used to display any range of
characters encodable by the utf-8 coding system.  Just specify the
fontset appropriately.

> I have no problems displaying and editing these characters under the
> big5 coding scheme, so they are obviously well supported by emacs (and
> its internal coding scheme, right?).
>
> So, what is the impediment, or perhaps rationale, behind the lack of
> support for the additional unicode blocks at this stage of Emacs
> development?
>
> Is it simply to do with someone having to implement some type of
> character translation tables, or is there/how much more is there to it?

Sorry, I don't know the answers to those questions.

-- 
Kevin Rodgers
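
The range limitation described in the quoted message can be sketched in a few lines of Python. This is only a minimal sketch, assuming the three charset names encode their coverage literally as inclusive code point ranges; `MULE_UNICODE_RANGES`, `covering_charset` and `big5_encodable` are hypothetical helper names for illustration, not Emacs functions.

```python
# Inclusive code point ranges, read off the three charset names quoted in
# the message (assumption: each name states its coverage literally).
MULE_UNICODE_RANGES = {
    "mule-unicode-0100-24ff": (0x0100, 0x24FF),
    "mule-unicode-2500-33ff": (0x2500, 0x33FF),
    "mule-unicode-e000-ffff": (0xE000, 0xFFFF),
}


def covering_charset(code_point):
    """Return the name of the range containing code_point, or None."""
    for name, (low, high) in MULE_UNICODE_RANGES.items():
        if low <= code_point <= high:
            return name
    return None


def big5_encodable(char):
    """Return True if Python's big5 codec can encode the character."""
    try:
        char.encode("big5")
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    first_cjk = 0x4E00  # start of the CJK Unified Ideographs block
    print(hex(first_cjk), "covered by:", covering_charset(first_cjk))
    print(hex(first_cjk), "Big5 encodable:", big5_encodable(chr(first_cjk)))
```

Under these assumptions, U+4E00 can be encoded in Big5 yet falls outside all three ranges, which matches the gap the original poster describes.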