From mboxrd@z Thu Jan  1 00:00:00 1970
Path: main.gmane.org!not-for-mail
From: Dave Love <d.love@dl.ac.uk>
Newsgroups: gmane.emacs.devel
Subject: Re: Several serious problems
Date: 30 Aug 2002 00:19:14 +0100
Sender: emacs-devel-admin@gnu.org
Message-ID: <rzq4rddz219.fsf@albion.dl.ac.uk>
References: <200208190748.QAA14278@etlken.m17n.org>
	<200208241211.g7OCBW111768@wijiji.santafe.edu>
	<200208261317.WAA27761@etlken.m17n.org>
NNTP-Posting-Host: localhost.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-Trace: main.gmane.org 1030663440 8534 127.0.0.1 (29 Aug 2002 23:24:00 GMT)
X-Complaints-To: usenet@main.gmane.org
NNTP-Posting-Date: Thu, 29 Aug 2002 23:24:00 +0000 (UTC)
Cc: rms@gnu.org,  monnier+gnu/emacs@rum.cs.yale.edu,  keichwa@gmx.net,
	  emacs-devel@gnu.org
Return-path: <emacs-devel-admin@gnu.org>
Original-To: Kenichi Handa <handa@etl.go.jp>
Original-Lines: 86
User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.2
Errors-To: emacs-devel-admin@gnu.org
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.0.11
Precedence: bulk
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Post: <mailto:emacs-devel@gnu.org>
List-Subscribe: <http://mail.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=subscribe>
List-Id: Emacs development discussions. <emacs-devel.gnu.org>
List-Unsubscribe: <http://mail.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://mail.gnu.org/pipermail/emacs-devel/>
Xref: main.gmane.org gmane.emacs.devel:7138
X-Report-Spam: http://spam.gmane.org/gmane.emacs.devel:7138

Kenichi Handa <handa@etl.go.jp> writes:

> I don't know if they are the same as what Dave currently
> has.

I tried to install all the relevant stuff I had, but for the CVS head,
it's modified versions of what I've actually been using, and is
basically untested.  I wanted someone who was actually using that code
base to install it and test it, but no-one could or would (I can't
remember which), and rms leant on me to install it.

> But I have not checked whether they actually work as
> expected.  I believe Dave has done that.

Only in more-or-less Emacs 21.2.

> And I don't understand why many of those functions/variables
> are designed the way they currently are.  For instance,
>
> (1) Why does loadup.el have this code:
> 	(ucs-unify-8859 'encode-only)
> instead of:
> 	(unify-8859-on-encoding-mode 1)

Indeed.  I didn't do that.  The obvious thing to do is to change the
default in the defcustom, if ucs-tables is preloaded.

> (2) Why doesn't utf-8-subst.el provide mappings of
>     non-Chinese characters for ksc, gb, and jisx charsets?
>     The document of utf-8-translate-cjk says as below:
> ----------------------------------------------------------------------
> Whether the `mule-utf-8' coding system should encode many CJK characters.
>
> Enabling this loads tables which enable the coding system to encode
> characters in the charsets `korean-ksc5601', `chinese-gb2312' and
> `japanese-jisx0208', and to decode the corresponding unicodes into
> ...
> ----------------------------------------------------------------------
> but, currently only Chinese characters in those charsets are
> handled.

I didn't realize that.  It may be coincidence.  What should be
translated is the set of characters

(japanese-jisx0208 ∪ chinese-gb2312 ∪ korean-ksc5601) \ mule-unicode-2500-33ff
                   ^                                  ^
                   union                              set difference

according to the Mule-UCS tables -- I just took the relevant codes
from there above U+33FF.  Perhaps that isn't how it actually is.
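
The set expression above can be sketched in a few lines.  This is
Python rather than Emacs Lisp, purely to illustrate the arithmetic;
the three small sets are hypothetical stand-ins for the real charset
tables, and the function name is invented for the example.

```python
# Hypothetical sample data: each set holds a few Unicode code points that
# characters of the named charset map to.  The real tables are far larger.
JISX0208 = {0x3042, 0x4E00, 0x2500}
GB2312 = {0x4E00, 0x3001, 0x9F50}
KSC5601 = {0xAC00, 0x4E00, 0x3131}

# Code points U+2500..U+33FF, the range named by mule-unicode-2500-33ff.
MULE_UNICODE_2500_33FF = range(0x2500, 0x3400)


def needs_translation(*charsets):
    """Return (union of the charsets) minus the 2500-33ff range."""
    union = set().union(*charsets)
    return {cp for cp in union if cp not in MULE_UNICODE_2500_33FF}


if __name__ == "__main__":
    result = needs_translation(JISX0208, GB2312, KSC5601)
    print(sorted(hex(cp) for cp in result))
```

With this sample data, only the code points above U+33FF survive the
set difference, which matches taking "the relevant codes from there
above U+33FF".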

It needs someone with an interest in the CJK range to redo that stuff
anyhow; it shouldn't hardwire japanese-jisx0208 as the
preferred set, the sets used should probably be configurable, and it
should allow translating the relevant characters below U+3400.  (I
didn't think much about how best to do that without keeping large
tables on the heap that aren't actually used to do the translation.)

> (3) Why is utf-8-translate-cjk a variable, not a minor-mode
>     like unify-8859-on-(de/en)coding-mode?

I think because it can't be turned off.

>     Or, why is the
>     latter not a simple variable?   By the way, it seems
>     that once we customize utf-8-translate-cjk to t,
>     customizing it back to nil doesn't cancel the translation.
>
> (4) It seems that the variable name
>     utf-8-fragment-on-decoding is not appropriate because it
>     is also used in utf-16.el.  Perhaps,
>     ucs-fragment-on-decoding is better.

Probably.  It was defined before I wrote utf-16.el.  Much of that
stuff would have been written differently for installation in 21.1,
but it was done during the campaign against anything Unicode-based, so
that users could have it in Emacs 21.2 as conveniently as possible.

> (5) It seems that mule-utf-16 can handle the same range of
>     characters as mule-utf-8, but its `safe-charsets' property
>     doesn't contain, for instance, `latin-iso8859-2'.
>     Perhaps, this is simply a bug to be fixed easily.

Yes.  The coding system needs to register the relevant translation
table(s) for safe-chars, which would have to be updated in sync with
any changes.  I don't know why that didn't get done.