From: Mike Gran
Newsgroups: gmane.lisp.guile.bugs,gmane.lisp.guile.devel
Subject: Re: UTF-8 regression in guile 1.9.5
Date: Fri, 11 Dec 2009 07:05:55 -0800 (PST)
Message-ID: <188729.99650.qm@web37904.mail.mud.yahoo.com>
To: Andy Wingo, linasvepstas@gmail.com
Cc: bug-guile@gnu.org, Guile Development

> From: Andy Wingo
> Hi,
>
> On Sun 06 Dec 2009 21:43, Linas Vepstas writes:
>
> > 2009/12/6 Mike Gran:
> >>
> >>> > need to call (setlocale LC_ALL "")
> >>
> >> But for Guile to store characters as codepoints, declaring a locale
> >> pretty much a requirement now.
> >
> > Would it make sense to add (setlocale LC_ALL "") to some default,
> > e.g. boot-9.scm ?
>
> Mike I admit I don't follow this completely. Does Linas' suggestion
> make sense? I somehow thought that locales would magically just
> work.

If we always call setlocale, legacy code that used UTF-8 and other
non-Latin locales will just work. Legacy code that used strings to
contain binary data would break (unless the locale is Latin-1).

(Of course, UTF-8 strings only worked on Guile 1.8.x so long as you
either never looked at substrings or chars, or did UTF-8 parsing
yourself.)

As it is now, the opposite is true: legacy code with strings
containing binary data will just work; legacy code whose strings hold
text in a non-8-bit locale encoding will break.
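
To make the two failure modes concrete, here is a minimal sketch. It
is written in Python rather than Guile and is an analogy only: Python's
codecs stand in for Guile's byte-to-character decoding, and none of the
names below are Guile APIs. Decoding as Latin-1 stands in for the case
where setlocale is never called; decoding as UTF-8 stands in for a
declared non-Latin-1 locale.

```python
# Analogy only: Python codecs stand in for Guile's byte-to-character
# decoding. All variable names here are hypothetical.

# U+00E9 (e with acute accent) encoded as UTF-8 is the bytes 0xC3 0xA9.
utf8_bytes = "\u00e9".encode("utf-8")

# No locale declared: every byte becomes one Latin-1 character,
# so the single accented letter turns into two unrelated characters.
as_latin1 = utf8_bytes.decode("latin-1")

# UTF-8 locale declared: the two bytes form one character.
as_utf8 = utf8_bytes.decode("utf-8")

# Binary data: all 256 possible byte values.
binary = bytes(range(256))

# Latin-1 maps each byte to the code point of the same value,
# so binary data survives a decode/encode round trip unchanged.
latin1_round_trip = binary.decode("latin-1").encode("latin-1")

# UTF-8 rejects many byte sequences, so arbitrary binary data is
# not valid text in that encoding. Python's strict decoder raises.
try:
    binary.decode("utf-8")
    binary_is_valid_utf8 = True
except UnicodeDecodeError:
    binary_is_valid_utf8 = False

if __name__ == "__main__":
    print(len(as_latin1), len(as_utf8))
    print(latin1_round_trip == binary)
    print(binary_is_valid_utf8)
```

Python's strict decoder raises on the binary data; what Guile does with
bytes that are not valid in the locale encoding is not covered by this
sketch.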

 Strings        | setlocale called  |
 contain        | 1.8.x   | 2.0     | Guile 2.0 will
----------------+---------+---------+---------------------------------
 ASCII          | Y/N     | Y/N     | just work
----------------+---------+---------+---------------------------------
 locale-encoded | Y/N     | Y       | just work
 strings        |         |         |
----------------+---------+---------+---------------------------------
 locale-encoded | Y/N     | N       | interpret string bytes as
 strings        |         |         | Latin-1
----------------+---------+---------+---------------------------------
 binary data    | Y/N     | Y       | if locale is Latin-1: just work
                |         |         |
                |         |         | if locale is not Latin-1:
                |         |         | interpret string bytes using
                |         |         | locale encoding
----------------+---------+---------+---------------------------------
 binary data    | Y/N     | N       | just work

I think I prefer that the coder take the responsibility of calling
setlocale, but I only think that because it is how C works. I'm used
to that convention.

Thanks,

Mike