From mboxrd@z Thu Jan  1 00:00:00 1970
Path: main.gmane.org!not-for-mail
From: Kenichi Handa <handa@etl.go.jp>
Newsgroups: gmane.emacs.devel
Subject: Re: Several serious problems
Date: Thu, 29 Aug 2002 22:25:25 +0900 (JST)
Sender: emacs-devel-admin@gnu.org
Message-ID: <200208291325.WAA03596@etlken.m17n.org>
References: <200208190748.QAA14278@etlken.m17n.org> <rzqlm6ybz38.fsf@albion.dl.ac.uk>
NNTP-Posting-Host: localhost.gmane.org
Mime-Version: 1.0 (generated by SEMI 1.14.3 - "Ushinoya")
Content-Type: text/plain; charset=US-ASCII
X-Trace: main.gmane.org 1030628222 13886 127.0.0.1 (29 Aug 2002 13:37:02 GMT)
X-Complaints-To: usenet@main.gmane.org
NNTP-Posting-Date: Thu, 29 Aug 2002 13:37:02 +0000 (UTC)
Cc: monnier+gnu/emacs@rum.cs.yale.edu, keichwa@gmx.net, rms@gnu.org,
   emacs-devel@gnu.org
Return-path: <emacs-devel-admin@gnu.org>
Original-Received: from quimby.gnus.org ([80.91.224.244])
	by main.gmane.org with esmtp (Exim 3.35 #1 (Debian))
	id 17kPTj-0003bY-00
	for <emacs-devel@main.gmane.org>; Thu, 29 Aug 2002 15:36:55 +0200
Original-Received: from monty-python.gnu.org ([199.232.76.173])
	by quimby.gnus.org with esmtp (Exim 3.12 #1 (Debian))
	id 17kQ02-00052c-00
	for <emacs-devel@quimby.gnus.org>; Thu, 29 Aug 2002 16:10:18 +0200
Original-Received: from localhost ([127.0.0.1] helo=monty-python.gnu.org)
	by monty-python.gnu.org with esmtp (Exim 4.10)
	id 17kPLO-0007l0-00; Thu, 29 Aug 2002 09:28:18 -0400
Original-Received: from list by monty-python.gnu.org with tmda-scanned (Exim 4.10)
	id 17kPIv-0007ia-00
	for emacs-devel@gnu.org; Thu, 29 Aug 2002 09:25:45 -0400
Original-Received: from mail by monty-python.gnu.org with spam-scanned (Exim 4.10)
	id 17kPIr-0007iK-00
	for emacs-devel@gnu.org; Thu, 29 Aug 2002 09:25:44 -0400
Original-Received: from tsukuba.m17n.org ([192.47.44.130])
	by monty-python.gnu.org with esmtp (Exim 4.10)
	id 17kPIl-0007i1-00; Thu, 29 Aug 2002 09:25:36 -0400
Original-Received: from fs.m17n.org (fs.m17n.org [192.47.44.2])
	by tsukuba.m17n.org (8.11.6/3.7W-20010518204228) with ESMTP id g7TDPPl03413;
	Thu, 29 Aug 2002 22:25:25 +0900 (JST)
	(envelope-from handa@m17n.org)
Original-Received: from etlken.m17n.org (etlken.m17n.org [192.47.44.125])
	by fs.m17n.org (8.11.3/3.7W-20010823150639) with ESMTP id g7TDPP919097;
	Thu, 29 Aug 2002 22:25:25 +0900 (JST)
Original-Received: (from handa@localhost)
	by etlken.m17n.org (8.8.8+Sun/3.7W-2001040620) id WAA03596;
	Thu, 29 Aug 2002 22:25:25 +0900 (JST)
Original-To: d.love@dl.ac.uk
In-Reply-To: <rzqlm6ybz38.fsf@albion.dl.ac.uk> (message from Dave Love on 22
	Aug 2002 18:08:43 +0100)
User-Agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/21.1.30 (sparc-sun-solaris2.6) MULE/5.0 (SAKAKI)
Errors-To: emacs-devel-admin@gnu.org
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.0.11
Precedence: bulk
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Post: <mailto:emacs-devel@gnu.org>
List-Subscribe: <http://mail.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=subscribe>
List-Id: Emacs development discussions. <emacs-devel.gnu.org>
List-Unsubscribe: <http://mail.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://mail.gnu.org/pipermail/emacs-devel/>
Xref: main.gmane.org gmane.emacs.devel:7109
X-Report-Spam: http://spam.gmane.org/gmane.emacs.devel:7109

In article <rzqlm6ybz38.fsf@albion.dl.ac.uk>,
  Dave Love <d.love@dl.ac.uk> writes:
> As far as I know, what's installed in the trunk behaves correctly, but
> I'm not using that code

Why aren't you using that code?  Does that mean you have
changed some of it locally?

> and I don't know if I'd hear about real
> problems with it (as opposed to imagined problems).  It should all be
> things you have said are OK or I'm sure you will think are OK, but I
> may have overlooked something.  However, it could use work for CJK, in
> particular; there's a fixme in utf-8, and there could be additional
> interconversion tables for CJK charsets as well as a way of
> customizing the character preferences in utf-8-subst.el, and probably
> other things.

I noticed those `fixme's.  Yes, it would be better to
resolve all of them, but for the moment I want to
concentrate on fixing the problem in RC.

>>  I've thought that the current codes were
>>  the same one as what Dave had, but the above statement of
>>  Dave's tells that it's not.

> Well, now I check, utf-8.el in the RC branch seems to be as I left it,
> which is what rms (I think) told me to do.  As far as I can tell, its
> safe-charsets property is correct,

The safe-charsets property of utf-8 in RC is this:

ascii eight-bit-control eight-bit-graphic latin-iso8859-1
mule-unicode-0100-24ff mule-unicode-2500-33ff
mule-unicode-e000-ffff ethiopic tibetan thai-tis620
katakana-jisx0201 ipa chinese-sisheng lao
vietnamese-viscii-lower vietnamese-viscii-upper

It doesn't contain latin-iso8859-[23...].

> and I don't understand what the complaint is about.  When
> I couldn't check, I assumed someone had modified it
> incorrectly, but there's no sign of that in CVS.

The complaint is that the coding system utf-8 can't encode
latin-2 characters in RC even when loadup.el has these lines:

(load "international/ucs-tables")
(ucs-unify-8859 'encode-only)

The reason, as far as I can see, is that the CCL program
`ccl-encode-mule-utf-8' doesn't have this line near its
head:

	   (translate-character ucs-mule-to-mule-unicode r0 r1))

So, even if we set up the translation table
`ucs-mule-to-mule-unicode' at loadup time, it is not used by
utf-8.
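
A toy sketch of the point above, in Python rather than CCL
or Emacs Lisp.  The table contents, the function names, and
the "?" fallback are all assumptions made for illustration;
only the role of `ucs-mule-to-mule-unicode' comes from the
message.  It shows why a loaded table has no effect when the
encoder never consults it.

```python
# Toy model, not Emacs code: all names and table contents are illustrative.

# Hypothetical stand-in for `ucs-mule-to-mule-unicode': maps a
# (charset, 8-bit code) pair to a Unicode code point.
UCS_MULE_TO_MULE_UNICODE = {
    ("latin-iso8859-2", 0xB1): 0x0105,  # LATIN SMALL LETTER A WITH OGONEK
    ("latin-iso8859-2", 0xA3): 0x0141,  # LATIN CAPITAL LETTER L WITH STROKE
}


def to_unicode(charset, code, translate):
    """Return a Unicode code point, or None if the character is unencodable."""
    if charset in ("ascii", "latin-iso8859-1"):
        return code  # these codes coincide with Unicode code points
    if translate:
        # Models the translate-character step the RC utf-8 encoder lacks.
        return UCS_MULE_TO_MULE_UNICODE.get((charset, code))
    return None  # the table exists but is never looked at


def encode_utf8(chars, translate):
    """Encode (charset, code) pairs; '?' for unencodable ones (an assumption)."""
    out = bytearray()
    for charset, code in chars:
        cp = to_unicode(charset, code, translate)
        out += chr(cp).encode("utf-8") if cp is not None else b"?"
    return bytes(out)
```

With translate=False the latin-2 character is lost even
though the table is populated, which is the RC behaviour
described above; with translate=True it is encoded.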

>>  Could someone tell me why are they different in HEAD and RC,
>>  and why are they different from what Dave have written?

> Most changes aren't in RC since I was only allowed to add (a version
> of) ucs-tables, not changing the default behaviour, so people could
> turn on (partial) character translation themselves.  It doesn't affect
> utf-8 or any other ccl coding systems because they don't use the
> translation table (although the useful extra coding systems in
> code-pages.el aren't included either, so I think only koi,
> alternativnyj and mac-roman are affected).

Hmmm, I think I now understand the situation in RC.  It can
unify charsets among iso-8859-X, but utf-8 can't encode
iso-8859-X (intentionally), correct?

Richard, is that what you asked Dave to install for RC?

I think RC should also allow utf-8 to encode 8859-X
correctly, as in HEAD.  I see no harm in it.

> I think I unilaterally added some other things (a utf-8 language
> environment and utf-16.el?) since they addressed somewhat misleading
> entries in PROBLEMS and the arguments against the Unicode support are
> either demonstrably wrong or spurious IMNSHO.

I don't object to that.  I found one problem with utf-16.
It seems that utf-16-le/be can handle 8859-X correctly
because of this line in ccl-encode-mule-utf-16-le/be,
      (translate-character ucs-mule-to-mule-unicode r0 r1)
but the safe-charsets property lists only these:
      ascii
      eight-bit-control
      latin-iso8859-1
      mule-unicode-0100-24ff
      mule-unicode-2500-33ff
      mule-unicode-e000-ffff
thus they can't be regarded as safe coding systems for the
8859-X charsets.
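
A toy sketch of that mismatch, in Python rather than Emacs
Lisp.  It assumes that the safety check amounts to a subset
test against the safe-charsets list; the function names are
hypothetical, and only the charset list is taken from above.

```python
# Toy model, not Emacs code: the function names are illustrative.

# The safe-charsets list quoted above for utf-16-le/be.
UTF_16_SAFE_CHARSETS = {
    "ascii",
    "eight-bit-control",
    "latin-iso8859-1",
    "mule-unicode-0100-24ff",
    "mule-unicode-2500-33ff",
    "mule-unicode-e000-ffff",
}


def is_safe_for(safe_charsets, text_charsets):
    """Assumed rule: safe only if every charset in the text is listed."""
    return set(text_charsets) <= set(safe_charsets)


def unlisted(safe_charsets, text_charsets):
    """Charsets in the text that the safe-charsets list does not mention."""
    return sorted(set(text_charsets) - set(safe_charsets))
```

Under this assumption, a text containing latin-iso8859-2 is
reported as unsafe for utf-16 no matter what the encoder
itself can do, and listing the charset is what changes the
answer.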

> I'm afraid I've had enough of all this,

Yes, you have done an excellent hack!  When I implemented
the translation-table stuff, I didn't expect that it could
be used this thoroughly.

> and I doubt it's worth more effort anyhow.  Especially
> after all the FUD about them, the Mule additions probably
> won't get used much unless they're the default, even by
> i18n people, unfortunately.

I thought the point of including ucs-tables etc. in RC was
at least to make unify-on-encoding the default, INCLUDING
for utf-8.

---
Ken'ichi HANDA
handa@etl.go.jp