From: Mike Gran <spk121@yahoo.com>
To: Andy Wingo <wingo@pobox.com>, linasvepstas@gmail.com
Cc: bug-guile@gnu.org, Guile Development <guile-devel@gnu.org>
Subject: Re: UTF-8 regression in guile 1.9.5
Date: Fri, 11 Dec 2009 07:05:55 -0800 (PST)
Message-ID: <188729.99650.qm@web37904.mail.mud.yahoo.com>
In-Reply-To: <m3ljh9dcsd.fsf@pobox.com>
> From: Andy Wingo <wingo@pobox.com>
> Hi,
>
> On Sun 06 Dec 2009 21:43, Linas Vepstas writes:
>
> > 2009/12/6 Mike Gran :
> >>
> >>> > need to call (setlocale LC_ALL "")
> >>
> >> But for Guile to store characters as codepoints, declaring a locale
> >> is pretty much a requirement now.
> >
> > Would it make sense to add (setlocale LC_ALL "") to some default,
> > e.g. boot-9.scm ?
>
> Mike I admit I don't follow this completely. Does Linas' suggestion
> make sense? I somehow thought that locales would magically just
> work.
If we always call setlocale, legacy code that used UTF-8 and other
non-Latin locales will just work. Legacy code that used strings to
contain binary data would break.
(Of course, UTF-8 strings only worked on Guile 1.8.x so long
as you either never looked at substrings or chars, or did
UTF-8 parsing yourself.)
As it is now, the opposite is true: legacy code with strings
containing binary data will just work; legacy code with strings in a
non-8-bit locale encoding (such as UTF-8) will break.
 1.8.x strings   | setlocale called |
 contain         |  1.8   |  2.0    | Guile 2.0 will
-----------------------------------------------------------------
 ASCII           |  Y/N   |  Y/N    | just work
-----------------------------------------------------------------
 locale-encoded  |  Y/N   |  Y      | just work
 strings         |        |         |
-----------------------------------------------------------------
 locale-encoded  |  Y/N   |  N      | interpret string bytes as
 strings         |        |         | Latin-1
-----------------------------------------------------------------
 binary data     |  Y/N   |  Y      | if locale is Latin-1: just work
                 |        |         |
                 |        |         | if locale is not Latin-1:
                 |        |         | interpret string bytes using
                 |        |         | locale encoding
-----------------------------------------------------------------
 binary data     |  Y/N   |  N      | just work
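To make the Latin-1 rows of the table concrete, here is a minimal
sketch. It decodes the same two bytes explicitly instead of relying on
the process locale, so the result does not depend on the environment.
It assumes a Guile that ships the (ice-9 iconv) module, which is newer
than the 1.9.5 snapshot discussed in this thread; the variable names
are illustrative only.

```scheme
(use-modules (rnrs bytevectors)
             (ice-9 iconv))

;; The two bytes that encode "é" in UTF-8.
(define bytes (u8-list->bytevector '(#xC3 #xA9)))

;; Interpreted as UTF-8 (what a UTF-8 locale implies): one character.
(define as-utf8 (bytevector->string bytes "UTF-8"))

;; Interpreted as Latin-1 (the "setlocale not called" rows): one
;; character per byte, so two characters.
(define as-latin1 (bytevector->string bytes "ISO-8859-1"))

;; Latin-1 maps every byte value to a code point, so encoding the
;; string back yields the original bytes unchanged.
(define round-trip (string->bytevector as-latin1 "ISO-8859-1"))

(display (string-length as-utf8))        (newline)
(display (string-length as-latin1))      (newline)
(display (bytevector=? bytes round-trip)) (newline)
```

The round trip is the reason binary data "just works" when bytes are
read as Latin-1, and the differing lengths are the reason UTF-8 text
does not.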
I think I prefer that the coder take responsibility for calling
setlocale, but I only think that because it is how C works. I'm used
to that convention.
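Under that convention, a program opts in to the user's locale once at
startup. The sketch below is one way to do that; init-locale! is a
hypothetical helper, not part of Guile, and the fallback behaviour is
an assumption about what a program might want.

```scheme
;; Hypothetical helper: adopt the locale named by the environment.
;; Guile's setlocale signals a system-error if the locale cannot be
;; installed; in that case warn and stay in the default "C" locale.
(define (init-locale!)
  (catch 'system-error
    (lambda ()
      (setlocale LC_ALL ""))
    (lambda args
      (display "warning: could not set locale; staying in \"C\"\n"
               (current-error-port))
      #f)))

(init-locale!)

;; With one argument, setlocale queries the current setting.
(display (setlocale LC_CTYPE))
(newline)
```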
Thanks,
Mike