From: "Stephen J. Turnbull"
Newsgroups: gmane.emacs.devel
Subject: Re: Case mapping of sharp s
Date: Fri, 20 Nov 2009 12:41:00 +0900
Message-ID: <87tywp7tir.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <4B05A11F.5000700@gmx.de> <87iqd6gmpk.fsf@lola.goethe.zz>
In-Reply-To: <87iqd6gmpk.fsf@lola.goethe.zz>
To: David Kastrup
Cc: emacs-devel@gnu.org

David Kastrup writes:

 > > But maybe we're doing something silly somewhere.
 >
 > The Emacs 22 multibyte scheme likely had worse properties for reverse
 > searching. So maybe something might be simplified nowadays.

Nope. The basic nature of the representation and even algorithms are
the same. The main difference is that the leading-byte to character
length map in Mule coding is somewhat arbitrary, while in UTF-8 there's
an algorithm for computing it. In both cases, the sane algorithm is to
keep a 256-entry table of corresponding lengths and use the octet as an
index into that table.
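
To make the table approach concrete, here is a minimal sketch in Python. It assumes standard UTF-8 as defined in RFC 3629, not Emacs's internal representation or the Mule coding, and the names utf8_length_from_lead, LENGTH_TABLE and char_lengths are hypothetical, chosen only for illustration. The table is filled once from the computed rule; after that, finding a character's length is a single index operation on the leading octet.

```python
def utf8_length_from_lead(octet):
    """Sequence length implied by a UTF-8 leading byte, or 0 if the
    octet cannot start a sequence (continuation or invalid byte)."""
    if octet < 0x80:
        return 1              # ASCII
    if 0xC2 <= octet <= 0xDF:
        return 2              # 0xC0 and 0xC1 would be overlong, so excluded
    if 0xE0 <= octet <= 0xEF:
        return 3
    if 0xF0 <= octet <= 0xF4:
        return 4              # 0xF5 and above exceed U+10FFFF
    return 0


# The 256-entry table: built once, then indexed by the leading octet.
LENGTH_TABLE = bytes(utf8_length_from_lead(b) for b in range(256))


def char_lengths(data):
    """Walk a byte string forward, returning the byte length of each
    character. Only leading bytes are checked; continuation bytes are
    skipped without validation (a simplification of this sketch)."""
    lengths = []
    i = 0
    while i < len(data):
        n = LENGTH_TABLE[data[i]]
        if n == 0:
            raise ValueError("octet 0x%02X at offset %d is not a leading byte"
                             % (data[i], i))
        lengths.append(n)
        i += n
    return lengths


if __name__ == "__main__":
    print(char_lengths("aß€".encode("utf-8")))
```

For a Mule-style coding the lookup would be the same; only the way the table is filled would differ, since there the entries would have to be listed by hand rather than computed.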