From: Camm Maguire
Newsgroups: gmane.emacs.devel,gmane.lisp.gcl.devel
Subject: Re: utf8 and emacs text/string multibyte representation
Date: Thu, 30 Oct 2014 12:27:58 -0400
Message-ID: <87lhnxo73l.fsf@maguirefamily.org>
References: <87wq7jxc7d.fsf@gnu.org> <87zjcfx985.fsf_-_@maguirefamily.org> <83mw8f0w08.fsf@gnu.org> <87oasu3m72.fsf@maguirefamily.org> <83bnou26is.fsf@gnu.org> <87bnotwsqn.fsf@maguirefamily.org> <83y4rxzgmm.fsf@gnu.org>
In-reply-to: <83y4rxzgmm.fsf@gnu.org> (Eli Zaretskii's message of "Thu, 30 Oct 2014 18:06:41 +0200")
To: Eli Zaretskii
Cc: gcl-devel@gnu.org, emacs-devel@gnu.org

Greetings, and thanks so much for the feedback!  Almost done --

Eli Zaretskii writes:

>> From: Camm Maguire
>> Cc: emacs-devel@gnu.org, gcl-devel@gnu.org
>> Date: Thu, 30 Oct 2014 10:13:20 -0400
>>
>> >> Does every string access in emacs proceed through the utf8 decoder?
>> >
>> > If you need to look at the character, yes.  E.g., if you need some
>> > property of the character, you need to index the appropriate table by
>> > that character's codepoint.  But in most operations that is not
>> > needed.  You just need to recognize several specific characters, like
>> > the null character, the slash, etc., most of which are ASCII.
>> >
>>
>> Do you allocate a fresh boxed character on each aref, or output an
>> integer referring to a fixed ~2^22 sized table?
>
> I'm not sure what you mean by a "boxed character".  A character in
> Emacs is just an int.
>

Then how do you distinguish integers from characters at the lisp level?

Take care,
-- 
Camm Maguire                                    camm@maguirefamily.org
==========================================================================
"The earth is but one country, and mankind its citizens."  --  Baha'u'llah
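
Editorial illustration of the two points quoted above: an ASCII character can be found in UTF-8 text by comparing bytes, with no decoding, and a decoded character can be carried around as a plain int. This is only a sketch in C under stated assumptions. It handles standard, well-formed UTF-8 of one to four bytes, and the names find_ascii and decode_char are hypothetical. It is not Emacs's actual internal encoding or code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return the byte offset of the first occurrence of
   the ASCII character c in the UTF-8 buffer s, or -1 if it is absent.
   No decoding is needed: every byte of a multibyte UTF-8 sequence has
   its high bit set, so a byte below 0x80 is always a whole character. */
ptrdiff_t find_ascii(const unsigned char *s, size_t len, unsigned char c)
{
    assert(s != NULL && c < 0x80);
    for (size_t i = 0; i < len; i++)
        if (s[i] == c)
            return (ptrdiff_t)i;
    return -1;
}

/* Hypothetical helper: decode the character that starts at s into a plain
   int codepoint and store its encoded length in *nbytes.
   Assumption: the input is well-formed standard UTF-8 (1 to 4 bytes);
   no validation is performed. */
int decode_char(const unsigned char *s, size_t *nbytes)
{
    unsigned char b = s[0];

    if (b < 0x80) {                      /* 0xxxxxxx: ASCII */
        *nbytes = 1;
        return b;
    }
    if ((b & 0xE0) == 0xC0) {            /* 110xxxxx 10xxxxxx */
        *nbytes = 2;
        return ((b & 0x1F) << 6) | (s[1] & 0x3F);
    }
    if ((b & 0xF0) == 0xE0) {            /* 1110xxxx 10xxxxxx 10xxxxxx */
        *nbytes = 3;
        return ((b & 0x0F) << 12) | ((s[1] & 0x3F) << 6) | (s[2] & 0x3F);
    }
    *nbytes = 4;                         /* 11110xxx followed by 3 bytes */
    return ((b & 0x07) << 18) | ((s[1] & 0x3F) << 12)
         | ((s[2] & 0x3F) << 6) | (s[3] & 0x3F);
}
```

The codepoint returned by decode_char is what one would use to index a per-character property table. The search in find_ascii never needs it, which is why looking for a slash or a null character does not have to go through the decoder.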