From: Kenichi Handa
Newsgroups: gmane.emacs.devel
Subject: Re: Emacs 23 character code space
Date: Wed, 26 Nov 2008 10:41:01 +0900
To: Eli Zaretskii
Cc: emacs-devel@gnu.org
In-reply-to: (message from Eli Zaretskii on Sat, 22 Nov 2008 20:25:52 +0200)

Eli Zaretskii writes:

> A character set is a set of characters, and it assigns a unique code
> point to each character belonging to the set.  Emacs decodes a
> specific code point of a specific character set to an Emacs character.

> Does this mean a character set is equivalent to a coding-system,
> meaning that a coding-system is a mapping between a character set and
> the Emacs internal codepoints?

No.  A coding-system is a mapping between a sequence of characters and
a sequence of bytes.  The byte sequence may contain bytes that are not
mapped to any character.  For instance, iso-2022 uses escape sequences,
and UTF-16 uses surrogate pairs.

> @defun charset-dimension charset
> This function returns the dimension of @var{charset}.  Here, dimension
> means the number of bytes required to represent the highest code point
> (not an Emacs character code) of a character.  For example, the
> dimension of @code{iso-8859-1} is one, the dimension of
> @code{japanese-jisx0208} is two, and the dimension of @code{unicode}
> is three.
> @end defun

> I decided not to document this.  I think the concept of charset
> dimension is too obscure to explain, and not really needed for Lisp
> programs, unless they need to define a new charset, or display a
> charset, and those are already done by Emacs infrastructure.  Do you
> see any problems with not documenting this function?

No, I don't think so.

> A translation table has two extra slots.  The first is either
> @code{nil} or a translation table that performs the reverse
> translation; the second is the maximum number of characters to look up
> for translation.

> Could you please elaborate on the second extra slot: when and for what
> purpose would there be a need to look up characters for translation?

That slot exists to enable sequence-to-char translation.  See the
description of make-translation-table-from-alist.

---
Kenichi Handa
handa@ni.aist.go.jp
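
As a rough illustration of two points in this message, here is a small Python sketch. It is not Emacs code: Python codecs stand in for coding-systems, and the helper names (encoded_bytes, bytes_for_code_point) and the maximum code points passed to them are assumptions chosen for the example. The first part shows encoded byte sequences carrying bytes that belong to no single character (ISO-2022 escape sequences) or a character spread over a surrogate pair (UTF-16). The second part reproduces the arithmetic behind "number of bytes required to represent the highest code point".

```python
# Illustration only: Python codecs stand in for Emacs coding-systems.

def encoded_bytes(text: str, codec: str) -> bytes:
    """Encode text with a Python codec (a stand-in for a coding-system)."""
    return text.encode(codec)

# ISO-2022-JP: ESC $ B designates JIS X 0208 and ESC ( B returns to ASCII.
# Those six bytes are control information, not part of any character.
iso2022 = encoded_bytes("\u3042", "iso2022_jp")   # HIRAGANA LETTER A

# UTF-16: a character above U+FFFF is carried by two 16-bit surrogates.
utf16 = encoded_bytes("\U0001D11E", "utf-16-be")  # MUSICAL SYMBOL G CLEF

def bytes_for_code_point(max_code: int) -> int:
    """Bytes needed to hold max_code (assumed reading of 'dimension')."""
    return max(1, (max_code.bit_length() + 7) // 8)

# Highest code points below are assumptions made for this example.
dimensions = {
    "iso-8859-1": bytes_for_code_point(0xFF),
    "japanese-jisx0208": bytes_for_code_point(0x7E7E),
    "unicode": bytes_for_code_point(0x10FFFF),
}

if __name__ == "__main__":
    print(iso2022)     # one character, eight bytes
    print(utf16)       # one character, two 16-bit code units
    print(dimensions)
```

With these inputs the computed values agree with the dimensions quoted above (one, two, and three). In Emacs, the dimension comes from the charset definition and is not derived this way.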