From: Lee Sau Dan
Newsgroups: gmane.emacs.help
Subject: Re: Chinese characters support
Date: 14 May 2003 08:14:14 +0200
Organization: Rechenzentrum der Universitaet Freiburg, Germany

>>>>> "Charles" == Charles Muller writes:

    Charles> One more time:

    Charles> Since the HELLO file is used for internal testing by
    Charles> Emacs coders it almost always works correctly in any
    Charles> recent Emacs "out of the box."

No.  If you have problems with the font installation (esp. when none
of your font servers offers the relevant fonts, or your sys. admin.
simply doesn't care about your non-English needs), HELLO won't display
the glyphs.  It only displays boxes there.

    Charles> The common misunderstanding occurs when people who are
    Charles> trying to get CJK working in utf-8 write to this, or
    Charles> another list for help, and list members, in the spirit of
    Charles> trying to be helpful, suggest that all is fine if the
    Charles> HELLO file displays right.

For utf-8 testing, I'd refer someone to the test files in the MuleUCS
package.

    Charles> Since the people who usually make the suggestion to test
    Charles> via the HELLO are those who do not regularly use CJK, it
    Charles> seems that they are not aware of this discrepancy, and I
    Charles> wanted to point this out.

No.  Those people often use CJK regularly.  They just don't use utf-8.
Like me (using Big5), they use a national encoding (e.g. GB2312, JIS,
KSC).

    Charles> It seems strange to see people react so emotionally to
    Charles> the exposure of this simple point. No one is asking that
    Charles> the hallowed HELLO file be sent to oblivion--although a
    Charles> reincarnation as utf-8 would certainly not hurt! :-)

That WILL certainly HURT.  Look carefully at the section "Difference
among chinese characters in GB, JIS, KSC, BIG5:" in HELLO.  The same
thing cannot be reproduced in vanilla utf-8, because Unicode unifies
the various characters in these encodings into one single code point.
(Most of the effort in the earlier versions of Unicode was devoted to
_unifying_ characters from different languages that use different
national encodings.
The result is that you can no longer tell whether a unified character
comes from Korean, Japanese or Chinese, which write it in slightly
different ways.)

If you want to test UTF-8, do suggest including a UTF-8 test file.
(But why not UTF-16?  People who really use computers for Far East
languages (CJK) would need 50% more disk space if they stored their
text files in UTF-8 rather than UTF-16, which is more space-efficient
for such text.)  Add a line in HELLO telling people how to open the
UTF-8 test file, preferably with hot-key bindings.  And why stop
there?  Also have UTF-16 and UTF-7 test files.

UTF-8 is simply NOT the magic panacea.  It sucks when you have a file
full of Chinese characters, for instance.  The 3-bytes-per-Chinese-
character "feature" of UTF-8 sucks.

HELLO should remain a test file for the internal encoding "emacs-mule"
and for displaying the true multilingual capabilities of Emacs.  It
has also been serving well to test font installation.  It should never
be recoded in utf-8, IMO.  If all you care about is UTF-8, have
another test file.

Assuming that all CJK users should use UTF-8 is like assuming that
everyone should pledge faith to the Vatican.

-- 
Lee Sau Dan                     李守敦(Big5)                    ~{@nJX6X~}(HZ)

E-mail: danlee@informatik.uni-freiburg.de
Home page: http://www.informatik.uni-freiburg.de/~danlee
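
The two technical points in the post (the byte cost of a CJK character in
each encoding, and several national byte sequences collapsing to one
Unicode code point) can be checked with a small Python sketch.  This is
only an illustration under stated assumptions: the sample text, the
sample character, and the helper names `encoded_sizes` and
`national_forms` are hypothetical and do not come from the post or from
Emacs.

```python
# Illustrative sketch only.  The sample strings and helper names are
# assumptions chosen for this example, not part of the original post.

SAMPLE = "中文字元"  # four common CJK characters, all in the BMP


def encoded_sizes(text):
    """Return the byte length of `text` in a few encodings."""
    return {
        "big5": len(text.encode("big5")),
        # "utf-16-le" is used so that no byte-order mark is counted.
        "utf-16": len(text.encode("utf-16-le")),
        "utf-8": len(text.encode("utf-8")),
    }


def national_forms(char):
    """Encode `char` in four national encodings.

    Returns {encoding: (raw_bytes, code_point_after_decoding)}.  The raw
    bytes differ per encoding, but decoding always yields the same
    Unicode code point, so the origin of the character is not recorded.
    """
    forms = {}
    for enc in ("big5", "gb2312", "euc_jp", "euc_kr"):
        raw = char.encode(enc)
        forms[enc] = (raw, ord(raw.decode(enc)))
    return forms


if __name__ == "__main__":
    for name, size in encoded_sizes(SAMPLE).items():
        print(f"{name:7s} {size} bytes")
    for enc, (raw, cp) in national_forms("文").items():
        print(f"{enc:7s} {raw.hex()} -> U+{cp:04X}")
```

For text made up of BMP CJK characters, the first helper shows 2 bytes per
character in Big5 and UTF-16 against 3 in UTF-8, which is where the
"50% more" figure comes from.  Text that mixes in a lot of ASCII shifts
the balance the other way, so the ratio depends on the file.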