From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Kenichi Handa <handa@m17n.org>
Newsgroups: gmane.emacs.devel
Subject: Re: raw-byte and char-table
Date: Thu, 26 Aug 2010 15:48:02 +0900
Message-ID: <tl78w3t50pp.fsf@m17n.org>
References: <AANLkTinaF1Z2Rvp_sDv-ciHNjY4=eoW7e46KS3_yN-Hh@mail.gmail.com>
	<tl7bp8q3v3b.fsf@m17n.org>
	<AANLkTi=iQqseE5irbKxHCrd5NxGmEH-db+G4FatGZAP4@mail.gmail.com>
NNTP-Posting-Host: lo.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable
X-Trace: dough.gmane.org 1282805305 4352 80.91.229.12 (26 Aug 2010 06:48:25 GMT)
X-Complaints-To: usenet@dough.gmane.org
NNTP-Posting-Date: Thu, 26 Aug 2010 06:48:25 +0000 (UTC)
Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
To: MON KEY <monkey@sandpframing.com>
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Thu Aug 26 08:48:23 2010
Return-path: <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Envelope-to: ged-emacs-devel@m.gmane.org
Original-Received: from lists.gnu.org ([199.232.76.165])
	by lo.gmane.org with esmtp (Exim 4.69)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>)
	id 1OoWG4-00023R-Gs
	for ged-emacs-devel@m.gmane.org; Thu, 26 Aug 2010 08:48:20 +0200
Original-Received: from localhost ([127.0.0.1]:49055 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43)
	id 1OoWG3-0007fq-Ue
	for ged-emacs-devel@m.gmane.org; Thu, 26 Aug 2010 02:48:20 -0400
Original-Received: from [140.186.70.92] (port=52819 helo=eggs.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1OoWFx-0007fl-Dm
	for emacs-devel@gnu.org; Thu, 26 Aug 2010 02:48:14 -0400
Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69)
	(envelope-from <handa@m17n.org>) id 1OoWFu-0006Hh-KV
	for emacs-devel@gnu.org; Thu, 26 Aug 2010 02:48:13 -0400
Original-Received: from mx1.aist.go.jp ([150.29.246.133]:37508)
	by eggs.gnu.org with esmtp (Exim 4.69)
	(envelope-from <handa@m17n.org>) id 1OoWFu-0006HG-41
	for emacs-devel@gnu.org; Thu, 26 Aug 2010 02:48:10 -0400
Original-Received: from rqsmtp2.aist.go.jp (rqsmtp2.aist.go.jp [150.29.254.123])
	by mx1.aist.go.jp  with ESMTP id o7Q6m423022215;
	Thu, 26 Aug 2010 15:48:04 +0900 (JST) env-from (handa@m17n.org)
Original-Received: from smtp4.aist.go.jp
	by rqsmtp2.aist.go.jp  with ESMTP id o7Q6m4aM029584;
	Thu, 26 Aug 2010 15:48:04 +0900 (JST) env-from (handa@m17n.org)
Original-Received: by smtp4.aist.go.jp  with ESMTP id o7Q6m36G028987;
	Thu, 26 Aug 2010 15:48:03 +0900 (JST) env-from (handa@m17n.org)
Original-Received: from handa by etlken with local (Exim 4.71)
	(envelope-from <handa@m17n.org>)
	id 1OoWFm-0008E6-UE; Thu, 26 Aug 2010 15:48:02 +0900
In-Reply-To: <AANLkTi=iQqseE5irbKxHCrd5NxGmEH-db+G4FatGZAP4@mail.gmail.com>
	(message from MON KEY on Thu, 26 Aug 2010 01:30:11 -0400)
X-detected-operating-system: by eggs.gnu.org: Solaris 9
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/archive/html/emacs-devel>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=subscribe>
Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Xref: news.gmane.org gmane.emacs.devel:129242
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/129242>

In article <AANLkTi=iQqseE5irbKxHCrd5NxGmEH-db+G4FatGZAP4@mail.gmail.com>,
MON KEY <monkey@sandpframing.com> writes:

> > I'm not arguing that the syntax is cryptic.  What I want to
> > say is that it is difficult for one who reads the code to
> > understand what #x3FFFA0 means.

> So the syntax aren't the problem its their semantic denotation.

Sorry, but I can't parse the above sentence.  Could you
please paraphrase it?

> Regardless, right now it is all confusing (esp. for those of us less
> inclined to differentiating the multibyte/unibyte distinction).

I agree that the handling of raw-byte is very confusing.
The root cause, I think, is that we represent a character by
an integer value; to resolve that confusion we would have to
introduce a distinct character object.  Unfortunately, that
requires a huge amount of work.  Until someone volunteers for
that work, we must live with the current infrastructure of
Emacs.

>>> This signals an error:
>>>  (unibyte-char-to-multibyte
>>>   (unibyte-char-to-multibyte 160))
> >
> > Yes, but is it a problem?

> I would urge that it is a problem wherever the numerical denotation
> has no visible/nameable/printable corollary.

> Why should it be allowed to be problem if it can be avoided?

Conceptually we have "byte", "integer", and "character", and
#x3FFFA0 is both an integer and a character representing
byte 160.
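
To make the distinction concrete, here is a minimal sketch that
uses only unibyte-char-to-multibyte from this thread.  The
variable names my-byte and my-char are hypothetical and exist
only for illustration.

```elisp
;; `my-byte' and `my-char' are hypothetical names used only for illustration.
(defvar my-byte 160
  "A byte: a plain integer in the range 0..255.")

(defvar my-char (unibyte-char-to-multibyte my-byte)
  "The character that represents `my-byte'.")

(list (integerp my-byte)        ; a byte is an integer
      (integerp my-char)        ; a character is an integer too
      (= my-char #x3FFFA0)      ; the value discussed in this thread
      (= my-char my-byte))      ; but it is a different integer from the byte
;; => (t t t nil)
```

Nothing in the values themselves says which of the three roles
an integer is playing; only the caller knows.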

Perhaps we should stop calling a "byte" a "unibyte char",
rename the above function (unibyte-char-to-multibyte) to
"byte-to-char", and document it as:

(byte-to-char BYTE)
Convert the byte BYTE to a character representing BYTE.

Then it's clear that (byte-to-char (byte-to-char BYTE))
signals an error, because the inner call returns a character,
not a byte.
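
As a rough sketch of that proposal (not an existing function),
the renamed function could be a thin wrapper around
unibyte-char-to-multibyte.  The name my-byte-to-char and the
explicit range check are my assumptions for illustration.

```elisp
;; Hypothetical wrapper illustrating the proposed `byte-to-char'.
;; It is named `my-byte-to-char' to avoid suggesting that such a
;; function already exists.
(defun my-byte-to-char (byte)
  "Convert the byte BYTE to a character representing BYTE."
  ;; Assumption: anything outside 0..255 is rejected up front.
  (unless (and (integerp byte) (<= 0 byte) (<= byte 255))
    (error "Not a byte: %S" byte))
  (unibyte-char-to-multibyte byte))

(my-byte-to-char 160)                     ; => #x3FFFA0

;; Applying it twice fails: the inner result is a character, not a byte.
(condition-case err
    (my-byte-to-char (my-byte-to-char 160))
  (error (car err)))                      ; => error
```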

Likewise multibyte-char-to-unibyte => char-to-byte:

(char-to-byte CH)
Convert the character CH to a byte.
If the character does not represent a byte, return -1.
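
The documented behavior can be sketched in plain arithmetic.
This is only an illustration, under two assumptions of mine:
the raw-byte characters occupy #x3FFF80..#x3FFFFF (consistent
with #x3FFFA0 for byte 160), and an ASCII character counts as
representing the byte with the same value.  The name
my-char-to-byte is hypothetical.

```elisp
;; Hypothetical sketch of the proposed `char-to-byte'.
(defun my-char-to-byte (ch)
  "Convert the character CH to a byte.
If the character does not represent a byte, return -1."
  (cond
   ;; Assumption: ASCII characters represent the byte of the same value.
   ((and (>= ch 0) (< ch #x80)) ch)
   ;; Assumption: raw-byte characters are #x3FFF80..#x3FFFFF.
   ((and (>= ch #x3FFF80) (<= ch #x3FFFFF)) (- ch #x3FFF00))
   ;; Any other character does not represent a byte.
   (t -1)))

(my-char-to-byte #x3FFFA0)   ; => 160
(my-char-to-byte ?a)         ; => 97
(my-char-to-byte #x3042)     ; => -1
```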


By the way, we also have decode-char.

(decode-char 'eight-bit 160) => #x3FFFA0
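
For completeness, a small sketch of the round trip through the
eight-bit charset.  It assumes that encode-char is the inverse
of decode-char for this charset, so the second element should
come back as 160; my-round-trip is a hypothetical name.

```elisp
;; `my-round-trip' is a hypothetical helper used only for illustration.
(defun my-round-trip (byte)
  "Return (CHAR CODE) for BYTE, going through the `eight-bit' charset."
  (let ((ch (decode-char 'eight-bit byte)))
    ;; CHAR is the raw-byte character; CODE is its code point in
    ;; `eight-bit', which is expected to equal BYTE again.
    (list ch (encode-char ch 'eight-bit))))

(my-round-trip 160)
```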

---
Kenichi Handa
handa@m17n.org