From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: Emacs 23 character code space
Date: Wed, 26 Nov 2008 22:26:03 +0200
To: Kenichi Handa
Cc: emacs-devel@gnu.org

> From: Kenichi Handa
> CC: eliz@gnu.org, emacs-devel@gnu.org
> Date: Wed, 26 Nov 2008 13:58:26 +0900
>
> I'll explain it a little bit more.  To encode a character
> sequence to a byte sequence, Emacs actually does two kinds
> of encoding as below:
>
>               (1)                                 (2)
>  characters <-----> (charset code-point) pairs <-----> bytes

Can you give a couple of examples, for some popular charsets, of how
we decode bytes into characters through these pairs of charsets and
code points?

> For the encoding of (1), Emacs uses information from the coding
> system to decide which charset to use, and then uses
> information from the selected charset to get a code point.
>
> For the encoding of (2), Emacs uses only information from the
> coding system.

Thanks.  What confuses me is that, roughly, there's a charset in
Emacs 23 for every coding-system, and the two have almost identical
names.  For example, the code point of a-umlaut in the iso-8859-1
charset is identical to the byte value produced by encoding that
character with the iso-8859-1 coding-system.

So I wonder why we need both in Emacs.  Why can't we, for example,
decode bytes directly into Emacs characters?
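
As an illustration of the question, here is a minimal Python sketch of the quoted diagram, read from bytes to characters: step (2) first, then step (1). It is not how Emacs implements this. The function names are hypothetical, the charset names are used only as labels, and Python's own ISO-2022-JP codec stands in for the JIS X 0208 charset table. It contrasts a case where byte values and code points coincide (ISO-8859-1) with one where they do not (EUC-JP).

```python
# Sketch only: function names are hypothetical and the charset names are
# plain labels; none of this is Emacs's actual implementation.

def latin1_bytes_to_pairs(data: bytes):
    # Step (2) for a single-byte coding system: each byte is the code point.
    return [("iso-8859-1", b) for b in data]


def euc_jp_bytes_to_pairs(data: bytes):
    # Step (2) for EUC-JP, restricted to ASCII and JIS X 0208 for brevity.
    pairs, i = [], 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            pairs.append(("ascii", b))
            i += 1
        elif (0xA1 <= b <= 0xFE and i + 1 < len(data)
              and 0xA1 <= data[i + 1] <= 0xFE):
            # Two bytes in 0xA1..0xFE form one JIS X 0208 character;
            # clearing the high bit of each byte gives the code point.
            code = ((b & 0x7F) << 8) | (data[i + 1] & 0x7F)
            pairs.append(("japanese-jisx0208", code))
            i += 2
        else:
            raise ValueError("byte 0x%02X not handled in this sketch" % b)
    return pairs


def pair_to_char(charset: str, code: int) -> str:
    # Step (1): look the code point up in the charset's own table.
    if charset in ("ascii", "iso-8859-1"):
        return chr(code)  # these code points coincide with Unicode
    if charset == "japanese-jisx0208":
        # Assumption: borrow Python's ISO-2022-JP codec as the table,
        # since that encoding carries JIS X 0208 code points unchanged.
        raw = bytes([code >> 8, code & 0xFF])
        return (b"\x1b$B" + raw + b"\x1b(B").decode("iso2022_jp")
    raise ValueError("unknown charset: " + charset)


def decode(data: bytes, bytes_to_pairs) -> str:
    # Compose the two steps: bytes -> pairs -> characters.
    return "".join(pair_to_char(cs, cp) for cs, cp in bytes_to_pairs(data))
```

For the byte 0xE4, `latin1_bytes_to_pairs` yields the pair ("iso-8859-1", 0xE4), so the byte value and the code point are the same, as in the a-umlaut case above. For the EUC-JP bytes 0xA4 0xA2, `euc_jp_bytes_to_pairs` yields ("japanese-jisx0208", 0x2422): here the code point differs from the bytes on the wire, and the same pair could equally come from a different byte encoding of that charset.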