From mboxrd@z Thu Jan 1 00:00:00 1970
From: Kenichi Handa
Newsgroups: gmane.emacs.devel
Subject: Re: Emacs 23 character code space
Date: Tue, 04 Nov 2008 16:35:10 +0900
To: Eli Zaretskii
Cc: emacs-devel@gnu.org
In-reply-to: (message from Eli Zaretskii on Mon, 03 Nov 2008 22:13:37 +0200)

Eli Zaretskii writes:

> Thanks, this definitely helps. Unfortunately, you worked from a
> non-current version of nonascii.texi; I already modified the first
> section heavily. Please take a look when you can: I intend to
> downplay the unibyte stuff heavily, while the previous version gave
> unibyte and multibyte almost equal coverage.

Oops, I couldn't do "cvs update" until yesterday. I've just done it.

> In any case, I will certainly use what you wrote.

Thanks! It seems that your latest changes go up to "@defun unibyte
string" (before @section Converting Text Representations). May I
leave to you the work of merging what I wrote into the section before
"Character Code"? I'll continue to fix the document after that
section.

> > @acronym{ASCII} characters occupy one
> > byte, non-@acronym{ASCII} characters occupy two to five bytes

> So I guess you agree that NEWS is not entirely correct saying that we
> use UTF-8 internally: UTF-8 uses only 1 to 4 bytes, not 1 to 5.
> Should I fix NEWS in this regard, saying that the internal
> representation is based on UTF-8, but extends it to handle additional
> characters?

As far as I remember, "UTF-8" was originally a general mechanism for
serializing a 31-bit unsigned value into a byte stream (thus sequences
of up to 6 bytes). At some point, though, it seems to have been
restricted to the Unicode range (thus sequences of up to 4 bytes). I
think it's a regression.

Anyway, yes, it is better to mention that Emacs uses an extended UTF-8
(or the original UTF-8).

---
Kenichi Handa
handa@ni.aist.go.jp
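
For illustration only: the "original UTF-8" scheme mentioned above, which
serializes a 31-bit unsigned value into 1 to 6 bytes, can be sketched
roughly as below. This is a minimal Python sketch under that assumption,
not code from Emacs, and it does not try to reproduce Emacs's internal
representation (which the quoted manual text describes as using up to
five bytes). The function name encode_original_utf8 is made up for this
example.

```python
def encode_original_utf8(value: int) -> bytes:
    """Serialize a 31-bit unsigned value with the original 1-to-6-byte scheme.

    Hypothetical helper for illustration; not an Emacs or library API.
    """
    if not 0 <= value <= 0x7FFFFFFF:
        raise ValueError("value must fit in 31 bits")
    if value < 0x80:
        # 7-bit values are stored as a single byte, unchanged.
        return bytes([value])

    # (exclusive upper limit, sequence length, lead-byte marker bits)
    for limit, length, lead in (
        (0x800, 2, 0xC0),
        (0x10000, 3, 0xE0),
        (0x200000, 4, 0xF0),
        (0x4000000, 5, 0xF8),
        (0x80000000, 6, 0xFC),
    ):
        if value < limit:
            break

    out = bytearray(length)
    # Fill continuation bytes from the end, 6 payload bits per byte.
    for i in range(length - 1, 0, -1):
        out[i] = 0x80 | (value & 0x3F)
        value >>= 6
    # The remaining high bits go into the lead byte.
    out[0] = lead | value
    return bytes(out)
```

For values inside the Unicode range (excluding surrogates) this should
agree with Python's own encoder, e.g. encode_original_utf8(0xE9) and
"é".encode("utf-8") give the same bytes; values from 0x200000 upward
need the 5- and 6-byte forms that the later, restricted definition of
UTF-8 no longer allows.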