From: Alex Shinn
Newsgroups: gmane.lisp.guile.devel
Subject: Re: O(1) accessors for UTF-8 backed strings
Date: Wed, 16 Mar 2011 09:07:38 +0900
To: Mark H Weaver
Cc: Ludovic Courtès, guile-devel@gnu.org

On Wed, Mar 16, 2011 at 12:46 AM, Mark H Weaver wrote:
> Alex Shinn wrote:
>> On Sun, Mar 13, 2011 at 1:05 PM, Mark H Weaver wrote:
>>> I just realized that it is possible to implement O(1) accessors for
>>> UTF-8 backed strings.
>>
>> It's possible with several approaches, but not necessarily worth it:
>>
>> http://trac.sacrideo.us/wg/wiki/StringRepresentations
>
> Alex, can you please clarify your position?
> I fear that readers of your
> message might assume that you are against my proposal to store strings
> internally in UTF-8. Having read the text that you referenced above, I
> suspect that you are in favor of using UTF-8 with O(n) string accessors.

I didn't intend to make a recommendation either way, just point to
a useful resource where people have collected ideas and data on the
topic so you could make an informed decision.

You are correct that I personally prefer simple UTF-8 with O(n) string
accessors, which is why the Unicode support I added for Chicken does
this, as does my own chibi-scheme. But the best string representation
depends on your use cases.

-- 
Alex