From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: What exactly is chinese-big5?
Date: Fri, 18 Apr 2008 11:16:39 +0300
To: Kenichi Handa
Cc: emacs-devel@gnu.org

> From: Kenichi Handa
> CC: emacs-devel@gnu.org
> Date: Fri, 18 Apr 2008 10:32:15 +0900
>
> Emacs supports full range of Big5 code space; i.e.
>   1st byte: 0xA1 .. 0xFE
>   2nd byte: 0x40 .. 0x7E and 0xA1 .. 0xFE

Thank you for the detailed explanations.

> In Emacs 22, you can read the written file by utf-8 and
> search for U+FFFD.

Is U+FFFD the _only_ character that will be produced for any codepoint
that is unassigned in the Big5 code space?  That is, if I search for
U+FFFD, will I find _all_ the places where the original file had
something not belonging to Big5?

Also, assuming that I find one or more invalid characters, is there
some encoding other than chinese-big5, besides those I mentioned in my
original message, that I should try and that could explain those
problematic characters?

This file came from Chinese-speaking people, so there's little doubt
it should include only strings that can be read by Chinese speakers.
Therefore, I wonder how come it does not translate cleanly into
Unicode.  (I cannot ask the people who produced the file about these
issues, since they seem to be pretty ignorant about them: they claimed
the file was in UTF-8...)

Thanks again for your help.
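
The byte ranges quoted above can also be checked mechanically on the
raw file, before any decoding.  What follows is only a minimal Python
sketch, assuming that every byte below 0x80 is plain single-byte ASCII;
the function name find_non_big5 is hypothetical.  It tests nothing but
the two quoted byte ranges, so it cannot tell whether a structurally
valid code is actually assigned to a character, and it says nothing
about what Emacs itself produces for such codes.

```python
def find_non_big5(data: bytes):
    """Return (offset, raw_bytes) for each sequence outside the quoted ranges.

    Assumed ranges (taken from the quoted message):
      1st byte: 0xA1 .. 0xFE
      2nd byte: 0x40 .. 0x7E and 0xA1 .. 0xFE
    Bytes below 0x80 are assumed to be single-byte ASCII.
    """
    bad = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            # ASCII byte: accepted as is.
            i += 1
            continue
        if 0xA1 <= lead <= 0xFE and i + 1 < len(data):
            trail = data[i + 1]
            if 0x40 <= trail <= 0x7E or 0xA1 <= trail <= 0xFE:
                # Structurally valid two-byte code.
                i += 2
                continue
            # Valid lead byte followed by an out-of-range second byte.
            bad.append((i, data[i:i + 2]))
            i += 2
            continue
        # Out-of-range lead byte, or a lead byte truncated at end of data.
        bad.append((i, data[i:i + 1]))
        i += 1
    return bad
```

Calling find_non_big5(open(path, "rb").read()) on the file (path being
whatever the file is called) gives the byte offsets worth inspecting by
hand.  An empty result means only that every byte pair falls inside the
quoted ranges.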