From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Eli Zaretskii <eliz@gnu.org>
Newsgroups: gmane.emacs.devel
Subject: Re: Emacs 23 character code space
Date: Sat, 22 Nov 2008 18:28:13 +0200
Message-ID: <uk5aviv36.fsf@gnu.org>
References: <u63n7wmri.fsf@gnu.org> <E1KwoKX-0002Tk-Lp@etlken.m17n.org>
	<E1Kwyo4-0007Vt-Ai@etlken.m17n.org>
Reply-To: Eli Zaretskii <eliz@gnu.org>
Cc: emacs-devel@gnu.org
To: Kenichi Handa <handa@m17n.org>
In-reply-to: <E1Kwyo4-0007Vt-Ai@etlken.m17n.org>

> From: Kenichi Handa <handa@m17n.org>
> CC: eliz@gnu.org, emacs-devel@gnu.org
> Date: Mon, 03 Nov 2008 21:45:20 +0900
> 
> I tried to rewrite nonascii.texi to clear the things.  I
> finished upto the "Character Code" section as attached.
> What do you think about it?

Thanks!

I have a few questions:

      Emacs can convert unibyte text to multibyte; it can also convert
    multibyte text to unibyte provided that the multibyte text contains
    only @acronym{ASCII} and 8-bit characters.

What exactly is meant here by ``8-bit characters''?  Do you mean
eight-bit raw bytes, or do you mean Unicode characters whose
codepoints are below 256?

      Converting unibyte text to multibyte text leaves @acronym{ASCII} characters
    unchanged, and converts 8-bit characters (codes 128 through 159) to
    the corresponding representation for multibyte text.

Again, by ``8-bit characters'' you mean raw 8-bit bytes here, right?

    @defun string-to-multibyte string
    This function returns a multibyte string containing the same sequence
    of characters as @var{string}.  If @var{string} is a multibyte string,
    it is returned unchanged.
    @end defun

I'm not sure I understand the effect of this function.  Does it decode
its argument, converting each byte to the corresponding internal
representation of the encoded single-byte character?  I think this is
not what it does, but then what does it do?
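
To make the question concrete, here is a minimal sketch of the difference between carrying bytes over and decoding them. It assumes the semantics of released Emacs 23 and later, which may not be what the draft intends; `demo-unibyte' and `demo-multibyte' are hypothetical names used only for illustration.

```elisp
;; Sketch only; assumes the semantics of released Emacs 23 and later.
;; `demo-unibyte' and `demo-multibyte' are hypothetical names.
(defconst demo-unibyte (unibyte-string ?a ?b ?c 192)
  "Unibyte string: three ASCII bytes followed by the raw byte 192.")

(defconst demo-multibyte (string-to-multibyte demo-unibyte)
  "The same sequence, converted without any decoding.")

(multibyte-string-p demo-unibyte)    ; nil
(multibyte-string-p demo-multibyte)  ; t
(aref demo-multibyte 0)              ; 97, ASCII is unchanged
(aref demo-multibyte 3)              ; #x3FFFC0, raw byte 192 as an eight-bit character

;; Decoding, by contrast, interprets the byte in a coding system:
(aref (decode-coding-string demo-unibyte 'iso-latin-1) 3) ; #xC0, i.e. U+00C0
```

Under that assumption the function does not decode: each byte of 128 and above becomes the corresponding raw eight-bit character rather than a Latin-1 (or any other) character.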

    @defun string-to-unibyte string
    This function returns a unibyte string containing the same sequence of
    characters as @var{string}.  It signals an error if @var{string}
    contains a non-@acronym{ASCII} character.  If @var{string} is a
    unibyte string, it is returned unchanged.
    @end defun

Since this function signals an error on any non-ASCII character
instead of converting it, when would it be useful?
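
One possible use is a round trip of raw bytes. The sketch below assumes the semantics of released Emacs 23 and later, where eight-bit characters are converted back to bytes and only other non-ASCII characters signal an error; `demo-try-unibyte' is a hypothetical helper, not an existing function.

```elisp
;; Sketch only; assumes the semantics of released Emacs 23 and later.
;; `demo-try-unibyte' is a hypothetical helper.
(defun demo-try-unibyte (string)
  "Return STRING converted by `string-to-unibyte', or the symbol `failed'."
  (condition-case nil
      (string-to-unibyte string)
    (error 'failed)))

(demo-try-unibyte "abc")           ; "abc"
(demo-try-unibyte (string #x3B1))  ; failed (GREEK SMALL LETTER ALPHA)

;; Raw bytes survive a round trip through the multibyte representation:
(equal (demo-try-unibyte (string-to-multibyte (unibyte-string 192)))
       (unibyte-string 192))       ; t
```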

    @defun multibyte-char-to-unibyte char
    This convert the multibyte character @var{char} to a unibyte
    character.  If @var{char} is a non-@acronym{ASCII} character, the
    value is -1.
    @end defun

    @defun unibyte-char-to-multibyte char
    This convert the unibyte character @var{char} to a multibyte
    character.
    @end defun

Again, when are these functions useful?
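
For reference, a minimal sketch of what the two character-level functions return, again assuming the semantics of released Emacs 23 and later rather than anything the draft states:

```elisp
;; Sketch only; assumes the semantics of released Emacs 23 and later.
(unibyte-char-to-multibyte ?a)        ; 97, ASCII is unchanged
(unibyte-char-to-multibyte 192)       ; #x3FFFC0, the eight-bit character for byte 192
(multibyte-char-to-unibyte #x3FFFC0)  ; 192, back to the raw byte
(multibyte-char-to-unibyte #x3B1)     ; -1, GREEK SMALL LETTER ALPHA is not a byte
```

The failing case uses a character code above 255 on purpose: as far as I can tell, `multibyte-char-to-unibyte' returns an argument below 256 as is, because it cannot distinguish such a value from a byte.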