From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Kenichi Handa <handa@m17n.org>
Newsgroups: gmane.emacs.devel
Subject: Re: [PATCH] Unicode Lisp reader escapes
Date: Wed, 10 May 2006 14:37:44 +0900
Message-ID: <E1FdhOK-0005JX-00@etlken>
References: <17491.34779.959316.484740@parhasard.net>	<E1FaobM-0005qh-00@etlken>	<ufyjsemrn.fsf@gnu.org>	<E1Fb7ai-0002Yb-00@etlken>	<uy7xjcx5s.fsf@gnu.org>	<87odyfnqcj.fsf-monnier+emacs@gnu.org>	<E1FbSj5-0003RO-00@etlken>	<uk691da3o.fsf@gnu.org>	<17498.27200.911709.330947@parhasard.net>	<e3f682$91m$1@sea.gmane.org>	<E1FcNiN-0004c1-An@fencepost.gnu.org>	<877j4z5had.fsf@gmx.de>	<E1FcbO2-0002U6-0r@fencepost.gnu.org>	<E1FciWf-0007hz-00@etlken>	<87irohfrx1.fsf@gmx.de>
	<E1FdEE5-0008RE-Tp@fencepost.gnu.org>	<E1FdKXL-0005lp-00@etlken>
	<E1FdfFt-0006ux-Pm@fencepost.gnu.org>
NNTP-Posting-Host: main.gmane.org
Mime-Version: 1.0 (generated by SEMI 1.14.3 - "Ushinoya")
Content-Type: text/plain; charset=ISO-2022-JP
X-Trace: sea.gmane.org 1147239541 7507 80.91.229.2 (10 May 2006 05:39:01 GMT)
X-Complaints-To: usenet@sea.gmane.org
NNTP-Posting-Date: Wed, 10 May 2006 05:39:01 +0000 (UTC)
Cc: alkibiades@gmx.de, emacs-devel@gnu.org
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Wed May 10 07:38:57 2006
Return-path: <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Envelope-to: ged-emacs-devel@m.gmane.org
Original-Received: from lists.gnu.org ([199.232.76.165])
	by ciao.gmane.org with esmtp (Exim 4.43)
	id 1FdhPT-0008LJ-Iw
	for ged-emacs-devel@m.gmane.org; Wed, 10 May 2006 07:38:55 +0200
Original-Received: from localhost ([127.0.0.1] helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43)
	id 1FdhPS-0004nn-Vc
	for ged-emacs-devel@m.gmane.org; Wed, 10 May 2006 01:38:54 -0400
Original-Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
	id 1FdhPI-0004na-Ee
	for emacs-devel@gnu.org; Wed, 10 May 2006 01:38:44 -0400
Original-Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
	id 1FdhPG-0004mj-Be
	for emacs-devel@gnu.org; Wed, 10 May 2006 01:38:43 -0400
Original-Received: from [199.232.76.173] (helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1FdhPG-0004mU-2o
	for emacs-devel@gnu.org; Wed, 10 May 2006 01:38:42 -0400
Original-Received: from [192.47.44.130] (helo=tsukuba.m17n.org)
	by monty-python.gnu.org with esmtps
	(TLS-1.0:DHE_RSA_AES_256_CBC_SHA:32) (Exim 4.52)
	id 1FdhQS-00033e-4W; Wed, 10 May 2006 01:39:56 -0400
Original-Received: from nfs.m17n.org (nfs.m17n.org [192.47.44.7])
	by tsukuba.m17n.org (8.13.4/8.13.4/Debian-3sarge1) with ESMTP id
	k4A5cYBL015743; Wed, 10 May 2006 14:38:34 +0900
Original-Received: from etlken (etlken.m17n.org [192.47.44.125])
	by nfs.m17n.org (8.13.4/8.13.4/Debian-3sarge1) with ESMTP id
	k4A5cXQ3013924; Wed, 10 May 2006 14:38:34 +0900
Original-Received: from handa by etlken with local (Exim 3.36 #1 (Debian))
	id 1FdhOK-0005JX-00; Wed, 10 May 2006 14:37:44 +0900
Original-To: rms@gnu.org
In-reply-to: <E1FdfFt-0006ux-Pm@fencepost.gnu.org> (message from Richard
	Stallman on Tue, 09 May 2006 23:20:53 -0400)
User-Agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2
	Emacs/22.0.50 (i686-pc-linux-gnu) MULE/5.0 (SAKAKI)
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/pipermail/emacs-devel>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=subscribe>
Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Xref: news.gmane.org gmane.emacs.devel:54173
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/54173>

In article <E1FdfFt-0006ux-Pm@fencepost.gnu.org>, Richard Stallman <rms@gnu.org> writes:

>       In addition, the default value of
>     utf-translate-cjk-mode t, and to which CJK charsets Han
>     characters of Unicode are decoded depends on these:

>     (1) current-language-environment

> What effect does this have?  (Aside from the choice of coding system,
> that is.)

Some Han characters in Unicode can be decoded into several
CJK charsets (e.g. chinese-gb2312, chinese-big5-1,
japanese-jisx0208).  current-language-environment decides
which of them to use.
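
As a rough illustration of that selection (not the actual code in
utf-8.el or ucs-tables.el), it could be sketched as below.  The
names my-cjk-priority and my-pick-cjk-charset, and the preference
orders themselves, are made up for this example.

```elisp
;; Sketch only: `my-cjk-priority' and `my-pick-cjk-charset' are
;; hypothetical names, not part of utf-8.el or ucs-tables.el.
(defvar my-cjk-priority
  '(("Japanese"     japanese-jisx0208 chinese-gb2312 chinese-big5-1)
    ("Chinese-GB"   chinese-gb2312 chinese-big5-1 japanese-jisx0208)
    ("Chinese-BIG5" chinese-big5-1 chinese-gb2312 japanese-jisx0208))
  "Assumed charset preference order for each language environment.")

(defun my-pick-cjk-charset (lang-env candidates)
  "Return the first charset preferred by LANG-ENV that is in CANDIDATES.
Return nil if LANG-ENV is unknown or none of its charsets is a candidate."
  (let ((order (cdr (assoc lang-env my-cjk-priority)))
        found)
    ;; Walk the preference list until a candidate charset is found.
    (while (and order (not found))
      (when (memq (car order) candidates)
        (setq found (car order)))
      (setq order (cdr order)))
    found))

;; A Han character available in both charsets goes to the Japanese
;; one when the language environment is Japanese.
(my-pick-cjk-charset "Japanese" '(chinese-gb2312 japanese-jisx0208))
```

With "Chinese-GB" as the first argument, the same call would pick
chinese-gb2312 instead.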

>     (4) the contents of the hash table ucs-unicode-to-mule-cjk
>     (a user can freely reflect one's preference on how to decode
>     Unicode character by modifying this hash table).

> Could you tell me some examples for how users are really expected
> to use this?

I don't know of a concrete example, but I can imagine one.
U+9AD9 is a variant of U+9AD8, but japanese-jisx0208
contains only the latter.  In fact, none of the legacy CJK
charsets contains U+9AD9.  But since it is just a variant of
U+9AD8, one may want to decode it into japanese-jisx0208
just for reading.  In such a case, one can simply do this:

(puthash #x9AD9 ?高 ucs-unicode-to-mule-cjk)
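
The effect of such an entry can be sketched with a stand-in table.
Here my-unicode-to-cjk and my-translate-han are hypothetical names,
and the table holds plain code points rather than the Mule
characters that ucs-unicode-to-mule-cjk holds.

```elisp
;; Sketch only: `my-unicode-to-cjk' is a hypothetical stand-in for
;; ucs-unicode-to-mule-cjk; its values here are plain code points.
(defvar my-unicode-to-cjk (make-hash-table :test 'eql)
  "Assumed table from Unicode code points to preferred replacements.")

(defun my-translate-han (code)
  "Return the user's preferred replacement for CODE, or CODE itself."
  (gethash code my-unicode-to-cjk code))

;; With no entry, U+9AD9 is returned unchanged.
(my-translate-han #x9AD9)

;; Register the preference: treat U+9AD9 as its variant U+9AD8.
(puthash #x9AD9 #x9AD8 my-unicode-to-cjk)

;; Now the lookup yields U+9AD8; other code points are unaffected.
(my-translate-han #x9AD9)
```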

> Overall:

> With so many different variables that might affect the reading of
> these characters, it is just too inconvenient for every file to
> specify them all.  So I think we need a new feature to make that easy
> to do.

> Here's one idea.

> Add a new "variable" `buffer-coding' which is analogous to `coding'.
> Whereas `coding' specifies the encoding in the file, `buffer-coding'
> specifies the in-buffer encoding to produce in the buffer.  Its value
> could be a list or plist, which would specify the values of all these
> many variables.

> What do you think?  If you think this is a good idea, could
> you try designing the details?

No, that would be an incredibly hard and heavy task.  If you
read utf-8.el and ucs-tables.el, you'll soon realize that.  I
believe working on such a thing would just be a waste of time.

We have already piled workarounds on workarounds on
workarounds for not using Unicode internally, but there's a
limit.  I believe no one is pleased by producing the same
*.elc in such a situation.

Please accept this problem as a bad feature (not a bug), and
document it in etc/PROBLEMS.  Otherwise, please decide to
switch to emacs-unicode right now.  That is the right way to
solve this problem.

---
Kenichi Handa
handa@m17n.org