From mboxrd@z Thu Jan  1 00:00:00 1970
Path: main.gmane.org!not-for-mail
From: Kenichi Handa <handa@m17n.org>
Newsgroups: gmane.emacs.devel
Subject: Re: eight-bit char handling in emacs-unicode
Date: Tue, 18 Nov 2003 16:33:15 +0900 (JST)
Sender: emacs-devel-bounces+emacs-devel=quimby.gnus.org@gnu.org
Message-ID: <200311180733.QAA13703@etlken.m17n.org>
References: <ilubrrha7oc.fsf@latte.josefsson.org>	<200311130153.KAA04615@etlken.m17n.org>	<ilur80c50uj.fsf@latte.josefsson.org>	<200311130610.PAA04983@etlken.m17n.org>	<iluekwcwyl8.fsf@latte.josefsson.org>	<200311130901.SAA05204@etlken.m17n.org>	<ilun0b08by1.fsf@latte.josefsson.org>	<200311140047.JAA06414@etlken.m17n.org>
	<jwvhe12emr3.fsf-monnier+emacs/devel@vor.iro.umontreal.ca>
NNTP-Posting-Host: deer.gmane.org
Mime-Version: 1.0 (generated by SEMI 1.14.3 - "Ushinoya")
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
X-Trace: sea.gmane.org 1069140974 8912 80.91.224.253 (18 Nov 2003 07:36:14 GMT)
X-Complaints-To: usenet@sea.gmane.org
NNTP-Posting-Date: Tue, 18 Nov 2003 07:36:14 +0000 (UTC)
Cc: emacs-devel@gnu.org, jas@extundo.com
Original-X-From: emacs-devel-bounces+emacs-devel=quimby.gnus.org@gnu.org Tue Nov 18 08:36:11 2003
Return-path: <emacs-devel-bounces+emacs-devel=quimby.gnus.org@gnu.org>
Original-Received: from quimby.gnus.org ([80.91.224.244])
	by deer.gmane.org with esmtp (Exim 3.35 #1 (Debian))
	id 1AM0PD-00057P-00
	for <emacs-devel@deer.gmane.org>; Tue, 18 Nov 2003 08:36:11 +0100
Original-Received: from monty-python.gnu.org ([199.232.76.173])
	by quimby.gnus.org with esmtp (Exim 3.35 #1 (Debian))
	id 1AM0PD-00045E-00
	for <emacs-devel@quimby.gnus.org>; Tue, 18 Nov 2003 08:36:11 +0100
Original-Received: from localhost ([127.0.0.1] helo=monty-python.gnu.org)
	by monty-python.gnu.org with esmtp (Exim 4.24)
	id 1AM1M1-0005Mf-Eq
	for emacs-devel@quimby.gnus.org; Tue, 18 Nov 2003 03:36:57 -0500
Original-Received: from list by monty-python.gnu.org with tmda-scanned (Exim 4.24)
	id 1AM1KT-0004Xu-5v
	for emacs-devel@gnu.org; Tue, 18 Nov 2003 03:35:21 -0500
Original-Received: from mail by monty-python.gnu.org with spam-scanned (Exim 4.24)
	id 1AM1Ju-0003w2-Fw
	for emacs-devel@gnu.org; Tue, 18 Nov 2003 03:35:17 -0500
Original-Received: from [192.47.44.130] (helo=tsukuba.m17n.org)
	by monty-python.gnu.org with esmtp (Exim 4.24) id 1AM1Jt-0003u4-6u
	for emacs-devel@gnu.org; Tue, 18 Nov 2003 03:34:45 -0500
Original-Received: from fs.m17n.org (fs.m17n.org [192.47.44.2])
	by tsukuba.m17n.org (8.11.6p2/3.7W-20010518204228) with ESMTP id
	hAI7XGh10520; Tue, 18 Nov 2003 16:33:16 +0900 (JST)
	(envelope-from handa@m17n.org)
Original-Received: from etlken.m17n.org (etlken.m17n.org [192.47.44.125])
	by fs.m17n.org (8.11.6/3.7W-20010823150639) with ESMTP id hAI7XFs17285; 
	Tue, 18 Nov 2003 16:33:15 +0900 (JST)
Original-Received: (from handa@localhost)
	by etlken.m17n.org (8.8.8+Sun/3.7W-2001040620) id QAA13703;
	Tue, 18 Nov 2003 16:33:15 +0900 (JST)
Original-To: monnier@IRO.UMontreal.CA
In-reply-to: <jwvhe12emr3.fsf-monnier+emacs/devel@vor.iro.umontreal.ca>
	(message from Stefan Monnier on 17 Nov 2003 16:17:56 -0500)
User-Agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2
	Emacs/21.3 (sparc-sun-solaris2.6) MULE/5.0 (SAKAKI)
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.2
Precedence: list
List-Id: Emacs development discussions.  <emacs-devel.gnu.org>
List-Unsubscribe: <http://mail.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://mail.gnu.org/pipermail/emacs-devel>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <http://mail.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=subscribe>
Errors-To: emacs-devel-bounces+emacs-devel=quimby.gnus.org@gnu.org
Xref: main.gmane.org gmane.emacs.devel:17880
X-Report-Spam: http://spam.gmane.org/gmane.emacs.devel:17880

In article <jwvhe12emr3.fsf-monnier+emacs/devel@vor.iro.umontreal.ca>,
Stefan Monnier <monnier@IRO.UMontreal.CA> writes:
>>  The basic problem is that we don't distinguish a character
>>  (code) and a number.  So, we introduce a character object

> That's one way to look at the problem.
> Another is to say that the problem is instead that we do not distinguish
> between arrays of chars and arrays of bytes.

I agree that it's possible to view the problem that way,
but I'm not sure which view is better.  Could you explain
WHY yours is better?

[...]
> In Emacs-21 we worked around the problem by arranging for "the
> eight-bit-char that encodes to 192" to be represented by the integer 192, so
> as to avoid having to choose.  But with unicode, the 128-255 zone cannot be
> dedicated to eight-bit-char since it's already used up for latin-1, so we
> have to face the problem more directly.

> In the places where Emacs-21 still had to choose, we just used heuristics,
> so `concat' will sometimes return a unibyte string, and sometimes a
> multibyte string.

> So I think your options 1-3 are better than 4.  BTW, your function
> `eight-bit-char' should be named `byte-to-char' instead.

> Which of 1 to 3 is the best is not clear, and maybe we can just live with
> `make-string-unibyte' and `make-string-multibyte'.

I think you mean string-make-unibyte/multibyte, but we can't
use them for the current problem, because string-make-unibyte
may behave differently in different language environments.
A language environment that gives iso-8859-1 or Unicode the
highest priority for the character `À' is fine:

(string-make-unibyte (concat '(?a 192))) = "a\300"

But if some language environment prefers a charset that
encodes `À' to something other than 192 (e.g. Vietnamese
VSCII), we fail.
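
To illustrate how the resulting byte depends on the charset, here
is a minimal sketch.  It is not code from the emacs-unicode branch.
It assumes a current Emacs in which `encode-coding-string' and the
coding system `cp850' are available, and cp850 is only a stand-in
for any charset that does not place `À' at 192:

```elisp
;; Illustrative only: encode the same character with two coding
;; systems and compare the resulting bytes.
(let ((s "\u00C0"))                          ; `À', U+00C0
  (list
   ;; iso-8859-1 places `À' at 192 (octal 300).
   (encode-coding-string s 'iso-8859-1)
   ;; cp850 (assumed stand-in) places `À' at a different byte.
   (encode-coding-string s 'cp850)))
```

The two results differ, so any conversion that picks the charset
from the language environment cannot be relied on to give byte 192.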

> Note that 1-3 are not mutually exclusive so we can use
> them all.

Yes, but, at least, I really want to avoid "(3) Make a
series of new functions".

---
Ken'ichi HANDA
handa@m17n.org