From: Cecil Westerhof
Newsgroups: gmane.lisp.guile.user
Subject: Re: Multi-byte characters
Date: Mon, 21 Jun 2010 16:20:03 +0200
Organization: Decebal Computing
Message-ID: <87tyow31t8.fsf@linux-lqcw.site>
In-Reply-To: <115748.55918.qm@web37907.mail.mud.yahoo.com> (Mike Gran's message of "Mon, 21 Jun 2010 06:20:51 -0700 (PDT)")
To: guile-user@gnu.org

On Monday, 21 Jun 2010, at 15:20 CEST, Mike Gran wrote:

>> From: Cecil Westerhof Cecil@decebal.nl
>
>> I am experimenting with Guile. It looks like the performance is not that
>> good. But I continue. One of the things is multi-byte characters. I want
>> to replace spaces with non breaking spaces. But Guile sees a non
>> breaking space ( ) as two characters (when using string-length). Is
>> there a way to let Guile see it as one character?
>
> Guile 1.8.x only had native support for 8-bit characters, so string-length
> is going to return the byte length of the string.
>
> Recent versions of Guile 1.9.x should have reasonable multi-byte character
> support, but, to get it to work, you need to declare your locale. UTF-8
> isn't necessarily assumed as default.
>
> You might have to call (setlocale LC_ALL "") at the top of your program,
> or maybe explicitly set your port's encoding with
> (set-port-encoding! port "UTF-8")

As I understand it, Guile 2.0 should be released in the near future, so I
will wait for that version. At the moment I am just playing with it, so it
is not that important.

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof
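
For reference, the two suggestions quoted above could be tried roughly as follows. This is only a sketch, assuming Guile 2.0 or later, where strings hold Unicode characters rather than bytes; the helper name spaces->nbsp and the sample string are made up for illustration and are not part of Guile.

```scheme
;; Sketch only: assumes Guile 2.0 or later (Unicode-aware strings).
;; spaces->nbsp is a hypothetical helper, not a Guile built-in.

;; Take the encoding from the environment (e.g. LANG=en_US.UTF-8).
;; false-if-exception keeps the script going if the locale is not installed.
(false-if-exception (setlocale LC_ALL ""))

;; Alternatively (or additionally), force UTF-8 on the port used for output.
(set-port-encoding! (current-output-port) "UTF-8")

;; U+00A0 NO-BREAK SPACE, built from its code point.
(define nbsp (integer->char #xA0))

;; Replace every ordinary space with a non-breaking space.
(define (spaces->nbsp str)
  (string-map (lambda (ch) (if (char=? ch #\space) nbsp ch)) str))

(define converted (spaces->nbsp "a b c"))

;; Counts characters, not bytes: prints 5 even though the UTF-8
;; encoding of the converted string takes 7 bytes.
(display (string-length converted))
(newline)
(display converted)
(newline)
```

With Guile 1.8.x the same string-length call would count bytes instead, which matches the behaviour described in the original question.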