From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Kenichi Handa <handa@m17n.org>
Newsgroups: gmane.emacs.devel
Subject: Re: Display of characters #xa0 and #xad in unibyte buffers
Date: Mon, 28 Sep 2009 20:24:24 +0900
Message-ID: <tl78wfzl3br.fsf@m17n.org>
References: <19131.35568.835627.216245@a1i15.kph.uni-mainz.de>
	<833a6bv30o.fsf@gnu.org>
	<19132.34451.565451.857731@a1ihome1.kph.uni-mainz.de>
	<83ws3ntmgv.fsf@gnu.org> <tl7fxa7lvqv.fsf@m17n.org>
	<831vlrsh6q.fsf@gnu.org>
NNTP-Posting-Host: lo.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable
X-Trace: ger.gmane.org 1254137096 22026 80.91.229.12 (28 Sep 2009 11:24:56 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Mon, 28 Sep 2009 11:24:56 +0000 (UTC)
Cc: ulm@gentoo.org, emacs-devel@gnu.org
To: Eli Zaretskii <eliz@gnu.org>
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Mon Sep 28 13:24:49 2009
Return-path: <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Envelope-to: ged-emacs-devel@m.gmane.org
Original-Received: from lists.gnu.org ([199.232.76.165])
	by lo.gmane.org with esmtp (Exim 4.50)
	id 1MsELY-0006LO-V6
	for ged-emacs-devel@m.gmane.org; Mon, 28 Sep 2009 13:24:49 +0200
Original-Received: from localhost ([127.0.0.1]:42989 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43)
	id 1MsELW-0004iE-2E
	for ged-emacs-devel@m.gmane.org; Mon, 28 Sep 2009 07:24:46 -0400
Original-Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
	id 1MsELQ-0004hp-Qp
	for emacs-devel@gnu.org; Mon, 28 Sep 2009 07:24:40 -0400
Original-Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
	id 1MsELL-0004gq-1x
	for emacs-devel@gnu.org; Mon, 28 Sep 2009 07:24:39 -0400
Original-Received: from [199.232.76.173] (port=56787 helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1MsELK-0004gn-Uq
	for emacs-devel@gnu.org; Mon, 28 Sep 2009 07:24:34 -0400
Original-Received: from mx1.aist.go.jp ([150.29.246.133]:35322)
	by monty-python.gnu.org with esmtp (Exim 4.60)
	(envelope-from <handa@m17n.org>)
	id 1MsELH-00007V-ER; Mon, 28 Sep 2009 07:24:32 -0400
Original-Received: from rqsmtp2.aist.go.jp (rqsmtp2.aist.go.jp [150.29.254.123])
	by mx1.aist.go.jp  with ESMTP id n8SBOOfc024757;
	Mon, 28 Sep 2009 20:24:24 +0900 (JST) env-from (handa@m17n.org)
Original-Received: from smtp4.aist.go.jp
	by rqsmtp2.aist.go.jp  with ESMTP id n8SBOOEZ012833;
	Mon, 28 Sep 2009 20:24:24 +0900 (JST) env-from (handa@m17n.org)
Original-Received: by smtp4.aist.go.jp  with ESMTP id n8SBOO4Y013595;
	Mon, 28 Sep 2009 20:24:24 +0900 (JST) env-from (handa@m17n.org)
Original-Received: from handa by etlken with local (Exim 4.69)
	(envelope-from <handa@m17n.org>)
	id 1MsELA-0006w9-56; Mon, 28 Sep 2009 20:24:24 +0900
In-Reply-To: <831vlrsh6q.fsf@gnu.org> (message from Eli Zaretskii on Mon,
	28 Sep 2009 08:43:09 +0200)
X-detected-operating-system: by monty-python.gnu.org: Solaris 9
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/pipermail/emacs-devel>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=subscribe>
Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Xref: news.gmane.org gmane.emacs.devel:115718
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/115718>

In article <831vlrsh6q.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes:

> > In article <83ws3ntmgv.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes:
> >
> > > > >> $ emacs -Q
> > > > >> M-x toggle-enable-multibyte-characters RET C-q 240 RET C-q 255 RET
> > > > >>
> > > > >> The characters are displayed as "_-" (approximately).
> > > > >>
> > > > >> Shouldn't they be displayed as "\240\255", considering that these are
> > > > >> raw bytes with no specific meaning?
> > > >
> > > > > There are no ``raw bytes'' in a unibyte buffer.  Every byte there is
> > > > > interpreted as a character, and shown as such.  This is the main
> > > > > feature of unibyte buffers; otherwise, who'd want them?
> >
> > I think the main feature of unibyte buffers is to handle
> > raw-bytes as is.

> How do we even know that they are raw bytes, and how do we
> distinguish, in a unibyte buffer, ü from \374, say?  Just because they
> were inserted by C-q NNN or by some other mechanism?

They are not distinguished.

> > For those who want to see a raw-byte as a character of their locale
> > (language environment), we have
> > unibyte-display-via-language-environment.

> I thought bytes in unibyte buffers are always interpreted as
> characters of the locale, as Emacs 19 did.

Not really, because we no longer perform automatic
unibyte<->multibyte decoding/encoding.  So, if we cut #xC0
in a unibyte buffer and yank it into a multibyte buffer, an
eight-bit character is inserted instead of U+00C0.
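
A minimal sketch of that difference, assuming Emacs 23 or later.  The
function name is made up for illustration; `string-to-multibyte' stands
in here for the conversion that happens when unibyte text lands in a
multibyte buffer without decoding.

```elisp
;; Sketch only; `demo-byte-to-multibyte-chars' is a hypothetical
;; helper, not part of Emacs.
(defun demo-byte-to-multibyte-chars (byte)
  "Return (RAW . DECODED) for BYTE, an integer in #x80..#xFF.
RAW is the character obtained without any decoding (an eight-bit
character); DECODED is the character obtained by explicitly
decoding BYTE as ISO-8859-1."
  (let ((s (unibyte-string byte)))
    (cons (aref (string-to-multibyte s) 0)
          (aref (decode-coding-string s 'iso-8859-1) 0))))

;; RAW is #x3FFFC0 (charset `eight-bit'), DECODED is #xC0 (U+00C0).
(demo-byte-to-multibyte-chars #xC0)
```

Only the explicitly decoded value is the Latin-1 letter; the other one
stays a raw byte until some coding system is applied to it.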

> Are you saying that they
> are by default always interpreted as raw bytes, unless
> unibyte-display-via-language-environment is set?

unibyte-display-via-language-environment only controls how
they are displayed; it doesn't affect how they are
interpreted.

Actually, the interpretation of characters in a unibyte
buffer is still inconsistent.  For instance,
skip-syntax-forward treats #x80..#xFF as characters
U+0080..U+00FF.  Thus #xC0 is a word constituent and #xD7 is
a symbol constituent.  We must fix it somehow.  But how?  We
currently don't have a suitable syntax code for eight-bit
chars.
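
One way to observe this, as a rough sketch: the helper below is
hypothetical (not part of Emacs), it assumes the standard syntax table,
and what it returns for bytes in #x80..#xFF may depend on the Emacs
version.

```elisp
;; Sketch only; `demo-skip-word-syntax' is a hypothetical helper.
(defun demo-skip-word-syntax (&rest bytes)
  "Insert BYTES into a temporary unibyte buffer and skip word syntax.
Return the distance `skip-syntax-forward' moved from the start of
the buffer, using the standard syntax table."
  (with-temp-buffer
    (set-buffer-multibyte nil)
    (set-syntax-table (standard-syntax-table))
    (insert (apply #'unibyte-string bytes))
    (goto-char (point-min))
    (skip-syntax-forward "w")))

;; A result of 2 would mean both #xC0 bytes were skipped as word
;; constituents (like U+00C0) and #xD7 was not.
(demo-skip-word-syntax #xC0 #xC0 #xD7)
```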

---
Kenichi Handa
handa@m17n.org