From: Kenichi Handa
Newsgroups: gmane.emacs.devel
Subject: Re: Possible UTF-8 CJK Regressions in Terminal Emulators
Date: Mon, 7 Jun 2004 21:27:36 +0900 (JST)
Message-ID: <200406071227.VAA06216@etlken.m17n.org>
In-reply-to: <200404091128.UAA02120@etlken.m17n.org>

While fixing a bug in utf-8-post-read-conversion (it may modify
text outside the range), I remembered this discussion and did
some work.

In article <200404091128.UAA02120@etlken.m17n.org>, Kenichi Handa writes:

> In article , Dave Love writes:

>> Kenichi Handa writes:
>>> Wait!  If utf-translate-cjk-mode can encode all jis,
>>> kcs, big5, and gb to utf-8,

>> I don't think that's true (or I think it wasn't when I
>> built the tables).  Maybe that's not so (now).
>> Also, the tables are customizable by design -- for instance, I
>> anticipated people adding characters from CNS.

> I've just checked all subst-*.el.  They all contain full
> maps, i.e. all defined characters can be encoded into
> utf-8.  Of course, a character not defined in each
> standard (e.g. a character made by (make-char
> japanese-jisx0208 37 126)) can't be encoded, but I think
> the merit of ignoring such a character is higher than
> correctly telling that they can't be encoded into utf-8.

I think I have succeeded in loading subst-*.el not when
utf-translate-cjk-mode is customized to t, but only when loading
them turns out to be necessary: while decoding or encoding utf-8,
or while running decode-char or encode-char.  This means that we
can make the default value of utf-translate-cjk-mode t without
loading subst-*.el at build time.

I think this is a big improvement, especially for CJK users, and
it improves an existing feature rather than adding a new one.

If people agree on making utf-translate-cjk-mode default to t,
I'll brush up the current working code and install the changes.

---
Ken'ichi HANDA
handa@m17n.org
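
For readers unfamiliar with the pattern, the deferred loading described
above can be sketched roughly as follows.  This is only an illustration,
not the code that was installed: every `my-cjk-` name is hypothetical,
the hash table stands in for the data in subst-*.el, and the ASCII
shortcut is an assumption about what "necessary" might mean.

```elisp
;; Hypothetical sketch of on-demand table loading.  None of these
;; names exist in Emacs; they only illustrate the idea.

(defvar my-cjk-mode t
  "Stand-in for a user option that is cheap to enable.")

(defvar my-cjk-tables-loaded nil
  "Non-nil once the stand-in translation tables have been filled.")

(defvar my-cjk-table-load-count 0
  "Number of times the expensive load ran (for illustration only).")

(defvar my-cjk-decode-table (make-hash-table :test 'eql)
  "Maps code points to placeholder values; stands in for real tables.")

(defun my-cjk-load-tables ()
  "Fill `my-cjk-decode-table'.  Stands in for loading large data files."
  (setq my-cjk-table-load-count (1+ my-cjk-table-load-count))
  (puthash #x3042 'hiragana-a my-cjk-decode-table) ; placeholder entry
  (setq my-cjk-tables-loaded t))

(defun my-cjk-ensure-tables ()
  "Load the tables on first use, and only while the mode is enabled."
  (when (and my-cjk-mode (not my-cjk-tables-loaded))
    (my-cjk-load-tables)))

(defun my-cjk-translate (code)
  "Return the table entry for CODE, or CODE itself if there is none."
  (if (< code #x80)
      code                        ; ASCII never needs the tables
    (my-cjk-ensure-tables)
    (gethash code my-cjk-decode-table code)))
```

With this shape, turning the option on costs nothing by itself; the
load happens once, on the first non-ASCII lookup, and is skipped
entirely in sessions that never need a translation.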