From: Bruno Haible <bruno@clisp.org>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: guile-devel@gnu.org
Subject: Re: Accessing the environment's locale encoding settings
Date: Wed, 16 Nov 2011 03:00:38 +0100
Message-ID: <201111160300.38410.bruno@clisp.org>
In-Reply-To: <877h30exfk.fsf@gnu.org>

[Dropping bug-libunistring from the CC.]

Hi Ludo',

> Should we be checking for charset aliases?

Yes. Without the system-dependent aliases table, the locale_charset()
function is buggy on nearly all platforms. Cf. gnulib/lib/config.charset.
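
To illustrate why the aliases matter, here is a minimal sketch in C. It
is not the gnulib implementation: the helper names and the three table
entries are assumptions chosen for illustration, and the real,
per-platform table is the one in config.charset. The idea is that
nl_langinfo(CODESET) returns whatever name the platform uses, and a
lookup step maps it to one canonical name.

```c
#include <langinfo.h>
#include <locale.h>
#include <string.h>

/* Illustrative alias entry: platform-specific name -> canonical name. */
struct charset_alias {
    const char *native;
    const char *canonical;
};

/* Hypothetical excerpt; the authoritative table is system dependent. */
static const struct charset_alias alias_table[] = {
    { "646",       "ASCII"      },
    { "ISO8859-1", "ISO-8859-1" },
    { "eucJP",     "EUC-JP"     },
};

/* Map a platform charset name to its canonical form.
   Names without an alias entry are returned unchanged. */
static const char *canonical_charset(const char *native)
{
    size_t n = sizeof alias_table / sizeof alias_table[0];
    for (size_t i = 0; i < n; i++)
        if (strcmp(native, alias_table[i].native) == 0)
            return alias_table[i].canonical;
    return native;
}

/* Charset of the current locale, after alias resolution.
   Only meaningful once setlocale() has been called. */
static const char *current_locale_charset(void)
{
    return canonical_charset(nl_langinfo(CODESET));
}
```

Without the lookup step, a caller comparing the result against
"ISO-8859-1" or "ASCII" would get different answers on different
systems for what is the same encoding.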

> In Guile, strings coming from the C world are assumed to be encoded in
> the current locale encoding.  Like in C, the current locale is set using
> ‘setlocale’, and it’s up to the user to write (setlocale LC_ALL "") to
> set the locale according to the relevant environment variables.
> 
> The problem comes with command-line arguments: the user hasn’t yet had a
> chance to call ‘setlocale’, yet they most likely have to be converted
> from locale encoding. ...

I would recommend having setlocale(...) happen *before* the command-line
arguments are parsed, not *after*, for two reasons:
  1) The parsing of command-line arguments can provoke errors, and errors
     should be displayed in the user's language, that is, depend on $LANG,
     $LC_MESSAGES, $LC_ALL.
  2) As you noticed, if setlocale(...) happens too late, you want to
     simulate the effects "as if" setlocale(LC_ALL, "") had been called.
     But you have thought only about the locale encoding (part of the
     LC_CTYPE category of the locale), not about LC_MESSAGES which is needed
     when you print an error message.

You wrote:
> > Unfortunately, I don't see a way for the user to call setlocale before a
> > Guile script converts the command-line arguments to Scheme strings, at
> > least not without providing their own `main' function in C.
>
> Hmm, very good point.

That is precisely the point. Only in C, C++, Objective-C, PHP, and Guile
is it the user's responsibility to set the locale. Look at the many
internationalization samples ("hello world" samples) in GNU gettext:
in all other languages (and even in many GUI toolkits based on C, C++,
or Objective-C) the setlocale call is implicit.

The user should *not* have to worry about conversion of strings from/to
locale encoding, because
  1) This is what people expect from a scripting language nowadays.
  2) In Guile strings are sequences of Unicode characters [1][2].

The fact that in C and C++ the default locale inside a program (that is,
the locale in effect when the program is started) is *not* the locale
specified by the user is only due to backward compatibility:
  - In C, because C started as a system programming language and the
    locale facilities were not there in the beginning,
  - In C++, because C++ has strong backward compatibility links with C.

So my suggestion is to do (setlocale LC_ALL "") as part of the Guile
initialization, very early. Yes, this might lead to some complexity
in the Guile implementation if you have the concept of locale also at
the Guile level and need to make sure that the locale at the C level and
the locale at the Guile level are consistent as soon as the latter is
defined. But this is manageable.
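
As a rough sketch of the ordering I mean, in C: the function names
below (init_locale_early, parse_args, program_main) are hypothetical and
are not Guile's actual initialization code. The only point is that
setlocale(LC_ALL, "") runs before any argument is looked at, so that
both the argument encoding (LC_CTYPE) and the language of error messages
(LC_MESSAGES) are already settled.

```c
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Adopt the locale named by the environment ($LC_ALL, $LC_CTYPE,
   $LC_MESSAGES, $LANG, ...).  If that locale is not installed,
   setlocale() returns NULL and leaves the locale unchanged, so we
   query and return the locale that is still in effect. */
static const char *init_locale_early(void)
{
    const char *name = setlocale(LC_ALL, "");
    if (name == NULL)
        name = setlocale(LC_ALL, NULL);
    return name;
}

/* Hypothetical argument parser: accepts "--help" and non-option
   arguments, rejects any other option.  Returns 0 on success, -1 on
   error.  A real program would translate the message with gettext,
   which depends on LC_MESSAGES being set by now. */
static int parse_args(int argc, char **argv)
{
    for (int i = 1; i < argc; i++) {
        if (argv[i][0] == '-' && strcmp(argv[i], "--help") != 0) {
            fprintf(stderr, "%s: unknown option: %s\n", argv[0], argv[i]);
            return -1;
        }
    }
    return 0;
}

/* Hypothetical entry point; a real main() would just forward to it. */
static int program_main(int argc, char **argv)
{
    init_locale_early();                        /* 1. locale first    */
    return parse_args(argc, argv) == 0 ? 0 : 1; /* 2. then arguments  */
}
```

With the two steps swapped, an error raised in parse_args would be
reported in the "C" locale regardless of the user's settings.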

Bruno

[1] http://www.gnu.org/software/guile/manual/html_node/Strings.html
[2] http://www.gnu.org/software/guile/manual/html_node/Characters.html
-- 
In memoriam Kurt Gerron <http://en.wikipedia.org/wiki/Kurt_Gerron>


