From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: MON KEY <monkey@sandpframing.com>
Newsgroups: gmane.emacs.devel
Subject: Re: raw-byte and char-table
Date: Thu, 26 Aug 2010 01:30:11 -0400
Message-ID: <AANLkTi=iQqseE5irbKxHCrd5NxGmEH-db+G4FatGZAP4@mail.gmail.com>
References: <AANLkTinaF1Z2Rvp_sDv-ciHNjY4=eoW7e46KS3_yN-Hh@mail.gmail.com>
	<tl7bp8q3v3b.fsf@m17n.org>
NNTP-Posting-Host: lo.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Cc: Stefan Monnier <monnier@iro.umontreal.ca>, emacs-devel@gnu.org
To: Kenichi Handa <handa@m17n.org>
In-Reply-To: <tl7bp8q3v3b.fsf@m17n.org>
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/129239>

On Wed, Aug 25, 2010 at 11:34 PM, Kenichi Handa <handa@m17n.org> wrote:
> In article <AANLkTinaF1Z2Rvp_sDv-ciHNjY4=eoW7e46KS3_yN-Hh@mail.gmail.com>,
> MON KEY <monkey@sandpframing.com> writes:
>
>> > Number like #x3FFFA0 is so cryptic.  The function name
>> > unibyte-char-to-multibyte is also not ideal, but I think
>> > it's better than #x3FFFA0.
>
>> Maybe I am misunderstanding, but I think the `#x' and `#o' syntax is
>> not cryptic at all in the context.
>
> I'm not arguing that the syntax is cryptic.  What I want to
> say is that it is difficult for one who reads the code to
> understand what #x3FFFA0 means.

So the syntax isn't the problem; it's the semantic denotation.
This is the realm of Tarski and McDermott[1].

Regardless, right now it is all confusing (esp. for those of us who
have trouble keeping the multibyte/unibyte distinction straight).

>
>> This signals an error:
>>  (unibyte-char-to-multibyte
>>   (unibyte-char-to-multibyte 160))
>
> Yes, but is it a problem?

I would urge that it is a problem wherever the numerical denotation
has no visible/nameable/printable counterpart.

Why should it be allowed to be a problem if it can be avoided?
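
A minimal sketch of the arithmetic under discussion, written in Python
as a model and not taken from the Emacs source. It assumes that a raw
byte in 0x80..0xFF maps to byte + #x3FFF00, which is consistent with
the 160 -> #x3FFFA0 pair quoted in this thread. `RAW_BYTE_OFFSET` is a
hypothetical name, and the Python function only borrows the name of
the Emacs one:

```python
# Model only: assumes raw bytes 0x80..0xFF map to byte + 0x3FFF00,
# consistent with 160 -> #x3FFFA0 as quoted in this thread.
RAW_BYTE_OFFSET = 0x3FFF00  # hypothetical name


def unibyte_char_to_multibyte(ch):
    """Python stand-in for the Emacs function of the same name."""
    if not 0 <= ch <= 0xFF:
        # The result of a first call is >= 0x3FFF80, so a second call ends here.
        raise ValueError("Not a unibyte character: %d" % ch)
    if ch < 0x80:
        return ch  # ASCII is unchanged
    return ch + RAW_BYTE_OFFSET


once = unibyte_char_to_multibyte(160)
print(hex(once))  # 0x3fffa0
```

Under that assumption the nested call fails simply because #x3FFFA0
is no longer below 256, and nothing in the bare number tells the
reader that it stands for byte 160.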

>
>> > We could provide a ?\NNN (or similar) notation for it.  Similarly to
>> > what we do for those bytes in multibyte strings.
>
>> Howsabout just this one for all of them:
>
>>  `#\'
>
> Do you mean that making #\240 to be read as #x3FFFA0?
>

Half-jokingly, Yes.

(assuming the #\240 above is the code point 0xA0)

Though I _also_ had these in mind:

#\8-bit-240

or

#\byte-240

Which would allow referencing these chars by something other than a
numeric id.
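
A rough sketch of how such tokens could be resolved, written in Python
as a model only. Nothing like this exists in the Emacs reader; the
function name `read_raw_byte_token`, the decision to read the digits
as octal (so that 240 is 0xA0, matching the assumption stated above
for #\240), and the #x3FFF00 offset inferred from this thread are all
assumptions:

```python
import re

# Assumption: raw byte 0xA0 corresponds to #x3FFFA0, as quoted in this thread.
RAW_BYTE_OFFSET = 0x3FFF00

# Hypothetical token shapes: #\byte-NNN and #\8-bit-NNN, with NNN in octal.
_TOKEN = re.compile(r"#\\(?:byte|8-bit)-([0-7]{1,3})\Z")


def read_raw_byte_token(token):
    """Return the character code a hypothetical reader would produce."""
    match = _TOKEN.match(token)
    if match is None:
        raise ValueError("not a raw-byte token: %r" % token)
    byte = int(match.group(1), 8)  # octal 240 == 0xA0 == 160
    if not 0x80 <= byte <= 0xFF:
        raise ValueError("not an 8-bit raw byte: %d" % byte)
    return byte + RAW_BYTE_OFFSET


print(hex(read_raw_byte_token(r"#\byte-240")))  # 0x3fffa0
```

The point of the sketch is only that the name carries the meaning
("a raw byte, value 240 octal"), while the resulting number stays an
implementation detail.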

E.g. in some other dialects of Lisp there is this type of behaviour:

CL-USER> #\	;<-that's a #x9 after the \
;=> #\Tab

CL-USER> #\ ;<- that's a #xa after the \
;=>
;  #\Newline

CL-USER> #\NO-BREAK_SPACE ;<-that's the char-name for #xa0
;=> #\NO-BREAK_SPACE      ;<-return is as per `identity'

CL-USER> (identity #\NO-BREAK_SPACE)
;=> #\NO-BREAK_SPACE

CL-USER> (princ #\ )
;=>
;  #\NO-BREAK_SPACE

CL-USER> (prin1 #\ )
;=> #\NO-BREAK_SPACE
;   #\NO-BREAK_SPACE

CL-USER> #\ ;<- That's a #x20 after the \
;=> #\

CL-USER> (char-code #\ )
32

CL-USER> (describe #\ )
;=> #\
;  [standard-char]
;
;  :_Char-code: 32
;  :_Char-name: Space
;  _

The idea being that the chars in the above example don't have
visibly "printable" representations, but the `#\' reader syntax _does_
recognize them, either by char-name or by a readable identity, e.g.:

CL-USER> (read-char)
^F  ;<- that's a #x6 typed at the prompt
;=> #\Ack

Of course, introducing this type of read syntax to Emacs Lisp
would (or at least should) imply extending it to all characters,
unibyte and multibyte...

Hence the ":)" smiley in my previous response to Stefan.


[1] McDermott, Drew (1978). Tarskian semantics, or no notation without
    denotation. Cognitive Science 2:277-82.

--
/s_P\