From: Andy Wingo
To: Mike Gran
Cc: bug-guile@gnu.org, linasvepstas@gmail.com, Guile Development
Subject: Re: UTF-8 regression in guile 1.9.5
Date: Sat, 09 Jan 2010 19:07:38 +0100
In-Reply-To: <188729.99650.qm@web37904.mail.mud.yahoo.com> (Mike Gran's message of "Fri, 11 Dec 2009 07:05:55 -0800 (PST)")

Hi,

Reviving an old thread...

On Fri 11 Dec 2009 16:05, Mike Gran writes:

>> On Sun 06 Dec 2009 21:43, Linas Vepstas writes:
>>
>> > 2009/12/6 Mike Gran :
>> >>
>> >>> > need to call (setlocale LC_ALL "")
>> >>
>> >> But for Guile to store characters as codepoints, declaring a locale
>> >> is pretty much a requirement now.
>> >
>> > Would it make sense to add (setlocale LC_ALL "") to some default,
>> > e.g. boot-9.scm ?
>
> If we always call setlocale, legacy code that used UTF-8 and other
> non-Latin locales will just work.  Legacy code that used strings to
> contain binary data would break.
>
> (Of course, UTF-8 strings only worked on Guile 1.8.x so long
> as you either never looked at substrings or chars, or did
> UTF-8 parsing yourself.)
>
> As it is now, the opposite is true: legacy code with strings
> containing binary data will just work; strings containing non-8-bit
> locale encoded strings will break.
>
> | 1.8.x          | setlocale   |
> | Strings        | called      | Guile 2.0
> | contain        | 1.8  | 2.0  | will
> -----------------------------------------------------------------
> | ASCII          | Y/N  | Y/N  | just work
> -----------------------------------------------------------------
> | locale-encoded | Y/N  | Y    | just work
> | strings        |      |      |
> -----------------------------------------------------------------
> | locale-encoded | Y/N  | N    | interpret string bytes as
> | strings        |      |      | Latin-1
> -----------------------------------------------------------------
> | binary data    | Y/N  | Y    | if locale is Latin-1: just work
> |                |      |      |
> |                |      |      | if locale is not latin-1:
> |                |      |      | interpret string bytes using
> |                |      |      | locale encoding
> -----------------------------------------------------------------
> | binary data    | Y/N  | N    | just work
> |                |      |      |
>
> I think I prefer that the coder take the responsibility of calling
> setlocale, but, I only think that because it is how C works.  I'm used
> to that convention.

I would still prefer ponies and magic, but I realized: if we do a
setlocale(LC_ALL, "") at the beginning, might that not change e.g. the
floating point format, or some other locale-related variable, which
would make Guile modules unreadable, or otherwise semantically different
or invalid?

I'm asking because I ran into this bug now:

    scheme@(guile-user)> ,pr (resolve-module '(gnome gtk))
    Throw to key `wrong-type-arg' with args `("procedure-name" "Wrong type argument in position ~A: ~S" (1 #) (#))'.
    Entering the debugger. Type `bt' for a backtrace or `c' to continue.
    0 debug> bt
    In current input:
    : 13
    ERROR: cannot convert to output locale "NONE": ""dynamic-wind""

So I guess we need a special case for NONE there, or something. I really
don't understand i18n/l10n.
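
To make the floating-point worry above concrete, here is a minimal
sketch in Python rather than Guile. It only asks the C library which
decimal separator a given locale reports; it says nothing about how
Guile's reader or printer actually behaves. The helper name
decimal_point_for and the locale name de_DE.UTF-8 are assumptions made
up for this illustration.

```python
import locale


def decimal_point_for(locale_name):
    """Return the decimal separator reported for locale_name.

    Returns None when the C library rejects the locale name, for
    example because that locale is not installed on this machine.
    """
    # Calling setlocale with no second argument only queries the setting.
    saved = locale.setlocale(locale.LC_NUMERIC)
    try:
        locale.setlocale(locale.LC_NUMERIC, locale_name)
        return locale.localeconv()["decimal_point"]
    except locale.Error:
        return None
    finally:
        # Always restore the previous setting so the caller is unaffected.
        locale.setlocale(locale.LC_NUMERIC, saved)


if __name__ == "__main__":
    # "de_DE.UTF-8" is only an example name; it may not exist everywhere.
    for name in ("C", "de_DE.UTF-8"):
        print(name, repr(decimal_point_for(name)))
```

Where a German locale is installed, the second line would typically
show a comma instead of a period. That is the kind of change that could
matter to any code that reads or prints numbers through locale-aware
routines.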

FWIW, it seems that both Ruby and Python require the user to call
setlocale.

Regards,

Andy
-- 
http://wingolog.org/