From: Kenichi Handa
Newsgroups: gmane.emacs.devel
Subject: Re: raw-byte and char-table
Date: Thu, 26 Aug 2010 15:48:02 +0900
To: MON KEY
Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
In-Reply-To: (message from MON KEY on Thu, 26 Aug 2010 01:30:11 -0400)

MON KEY writes:

> > I'm not arguing that the syntax is cryptic.  What I want to
> > say is that it is difficult for one who reads the code to
> > understand what #x3FFFA0 means.

> So the syntax aren't the problem its their semantic denotation.

Sorry, but I can't parse the above sentence.  Could you
please paraphrase it?

> Regardless, right now it is all confusing (esp. for those of us less
> inclined to differentiating the multibyte/unibyte distinction).

I agree that the handling of raw bytes is very confusing.
The root cause, I think, is that we represent a character by
an integer value; to resolve that confusion we would have to
introduce a character object.  Unfortunately, that requires a
huge amount of work.  Until someone volunteers for that work,
we must live with the current infrastructure of Emacs.

> > > This signals an error:
> > >  (unibyte-char-to-multibyte
> > >   (unibyte-char-to-multibyte 160))
> >
> > Yes, but is it a problem?

> I would urge that it is a problem wherever the numerical denotation
> has no visible/nameable/printable corollary.

> Why should it be allowed to be problem if it can be avoided?

Conceptually we have "byte", "integer", and "character", and
#x3FFFA0 is both an integer and a character representing
byte 160.  Perhaps we should stop calling a "byte" a "unibyte
char", rename the above function to "byte-to-char", and
document it as:

  (byte-to-char BYTE)

  Convert the byte BYTE to a character representing BYTE.

Then it's clear that (byte-to-char (byte-to-char BYTE))
signals an error, because the inner call returns a character,
not a byte.

Likewise, multibyte-char-to-unibyte => char-to-byte:

  (char-to-byte CH)

  Convert the character CH to a byte.  If the character does
  not represent a byte, return -1.

By the way, we also have decode-char:

  (decode-char 'eight-bit 160) => #x3FFFA0

---
Kenichi Handa
handa@m17n.org
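
For illustration only, the proposed renaming could be tried out
with thin wrappers around the existing primitives.  This is a
minimal sketch, not part of Emacs: the names sketch-byte-to-char
and sketch-char-to-byte are hypothetical stand-ins for the
proposed byte-to-char and char-to-byte, and the sketch assumes
the existing primitives behave as described in this message.

```elisp
;; Hypothetical wrappers (assumed names, not defined by Emacs).
(defun sketch-byte-to-char (byte)
  "Convert the byte BYTE to a character representing BYTE."
  (unibyte-char-to-multibyte byte))

(defun sketch-char-to-byte (ch)
  "Convert the character CH to a byte."
  (multibyte-char-to-unibyte ch))

;; Byte 160 maps to the raw-byte character #x3FFFA0 and back.
(sketch-byte-to-char 160)        ; => #x3FFFA0
(sketch-char-to-byte #x3FFFA0)   ; => 160

;; decode-char with the eight-bit charset yields the same character.
(= (decode-char 'eight-bit 160) (sketch-byte-to-char 160)) ; => t

;; Applying the conversion twice passes a character where a byte
;; is expected, so the outer call signals an error.
(condition-case err
    (sketch-byte-to-char (sketch-byte-to-char 160))
  (error (message "signaled: %S" err)))
```

With these names, the argument of each function says whether a
byte or a character is expected, which is the point of the
proposed renaming.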