From mboxrd@z Thu Jan  1 00:00:00 1970
Path: main.gmane.org!not-for-mail
From: Kenichi Handa <handa@m17n.org>
Newsgroups: gmane.emacs.devel
Subject: Re: eight-bit char handling in emacs-unicode
Date: Tue, 25 Nov 2003 10:07:18 +0900 (JST)
Sender: emacs-devel-bounces+emacs-devel=quimby.gnus.org@gnu.org
Message-ID: <200311250107.KAA24646@etlken.m17n.org>
References: <ilubrrha7oc.fsf@latte.josefsson.org>	<200311130153.KAA04615@etlken.m17n.org>	<ilur80c50uj.fsf@latte.josefsson.org>	<200311130610.PAA04983@etlken.m17n.org>	<iluekwcwyl8.fsf@latte.josefsson.org>	<200311130901.SAA05204@etlken.m17n.org>	<ilun0b08by1.fsf@latte.josefsson.org>	<200311140047.JAA06414@etlken.m17n.org>	<jwvhe12emr3.fsf-monnier+emacs/devel@vor.iro.umontreal.ca>	<200311180733.QAA13703@etlken.m17n.org>	<jwvn0atd38w.fsf-monnier+emacs/devel@vor.iro.umontreal.ca>	<200311190006.JAA14847@etlken.m17n.org>	<jwvptfp139w.fsf-monnier+emacs/devel@vor.iro.umontreal.ca>	<200311210041.JAA18324@etlken.m17n.org>	<jwvzneqwbo3.fsf-monnier+emacs/devel@vor.iro.umontreal.ca>	<200311210627.PAA18757@etlken.m17n.org>	<jwvvfpdsrab.fsf-monnier+emacs/devel@vor.iro.umontreal.ca>	<200311220125.KAA20128@etlken.m17n.org>	<jwvoev4ufqd.fsf-monnier+emacs/devel@vor.iro.umontreal.ca>	<200311230730.QAA21903@etlken.m17n.org>
	<jwvr7zybqvr.fsf-monnier+emacs/devel@vor.iro.umontreal.ca>
NNTP-Posting-Host: deer.gmane.org
Mime-Version: 1.0 (generated by SEMI 1.14.3 - "Ushinoya")
Content-Type: text/plain; charset=US-ASCII
X-Trace: sea.gmane.org 1069722737 5763 80.91.224.253 (25 Nov 2003 01:12:17 GMT)
X-Complaints-To: usenet@sea.gmane.org
NNTP-Posting-Date: Tue, 25 Nov 2003 01:12:17 +0000 (UTC)
Cc: jas@extundo.com, emacs-devel@gnu.org
Original-To: monnier@IRO.UMontreal.CA
In-reply-to: <jwvr7zybqvr.fsf-monnier+emacs/devel@vor.iro.umontreal.ca>
	(message from Stefan Monnier on 23 Nov 2003 18:48:08 -0500)
User-Agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2
	Emacs/21.3 (sparc-sun-solaris2.6) MULE/5.0 (SAKAKI)

In article <jwvr7zybqvr.fsf-monnier+emacs/devel@vor.iro.umontreal.ca>, Stefan Monnier <monnier@IRO.UMontreal.CA> writes:
>>  But, the concept of unibyte<->multibyte conversion itself is
>>  not ad-hoc.  Don't you think their meaning is very clear
>>  when you grasp them as my way?  Do you see any inconsistency
>>  in my explanation about them?

> No, as a matter of fact I don't see why in a utf-8 environment,
> it makes any sense to have a function that turns a multibyte string
> into a unibyte string encoded in latin-1

It seems that you keep saying that "A does B, thus it's
nonsense".  But I'm arguing that "A does C".

It doesn't make sense because you treat the result as "a
unibyte string encoded in Latin-1".

It makes sense if you treat the result as "a unibyte string
whose bytes represent a sequence of Unicode code points",
doesn't it?
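
To make that reading concrete, here is a minimal sketch in Python
(not Emacs Lisp, and not how Emacs implements it).  A Python str
stands in for a multibyte string and bytes for a unibyte string;
the function names are hypothetical.  Raising an error for code
points above U+00FF is an assumption of this sketch.

```python
def code_points_as_bytes(text: str) -> bytes:
    """Map each character to the single byte equal to its code point."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp > 0xFF:
            # Assumption of this sketch: signal an error instead of
            # silently producing a byte whose meaning is unclear.
            raise ValueError(f"U+{cp:04X} does not fit in one byte")
        out.append(cp)
    return bytes(out)


def bytes_as_code_points(data: bytes) -> str:
    """Read each byte back as the code point with the same number."""
    return "".join(chr(b) for b in data)
```

The byte values happen to coincide with ISO-8859-1 because the
first 256 Unicode code points were laid out that way, but nothing
in the sketch refers to a coding system.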

> (without even complaining when it encounters other
> characters).

I think it's OK (or even better) for string-make-unibyte to
complain in such a case.

> It'd make sense if the environment said "latin-1 when you can,
> utf-8 otherwise" or something like that, but then we would use
> encode-coding-string anyway.

Having such a coding system is itself nonsense.  Would you
agree to keep string-make-unibyte if it signaled an error
on non-Latin-1 characters?

> Besides, if any non-latin-1 char is encountered by string-make-unibyte, then
> we end up with a unibyte string that has an unknown meaning because some
> chars might have been encoded in latin-1, and others in some other encoding.

> I just don't know of a concrete case where it makes sense to use
> string-make-unibyte.

I'll rephrase my previous example as follows:

  It is perfectly possible to live in an environment where
  only the Unicode characters U+0000..U+00FF are used, and
  the only coding system used is utf-8.
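
As a rough illustration of such an environment, again in Python
rather than Emacs Lisp: the sample text and variable names below
are made up, and the one-byte-per-code-point conversion is only a
stand-in for what a unibyte conversion could mean there.

```python
text = "caf\u00e9"  # every character is within U+0000..U+00FF

# What the utf-8 coding system would write to a file or a process.
external = text.encode("utf-8")

# One byte per code point, independent of any coding system.
unibyte = bytes(ord(ch) for ch in text)

print(external)  # b'caf\xc3\xa9'
print(unibyte)   # b'caf\xe9'
```

The two results differ (5 bytes versus 4), so in this setting the
unibyte form cannot be mistaken for the externally encoded text.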

But I don't claim that the above is a realistic case.

Another non-realistic but concrete case is:

  Use only the charset iso-8859-5 and the encoding CTEXT.

---
Ken'ichi HANDA
handa@m17n.org