unofficial mirror of bug-guile@gnu.org 
 help / color / mirror / Atom feed
From: Zefram <zefram@fysh.org>
To: 20822@debbugs.gnu.org
Subject: bug#20822: environment mangled by locale
Date: Fri, 4 Mar 2016 23:22:30 +0000	[thread overview]
Message-ID: <20160304232230.GA13009@fysh.org> (raw)
In-Reply-To: <20150616041736.GA2718@fysh.org>

I wrote:
>There's an obvious parallel with reading data from an input port.
>If setlocale is called, then input is by default decoded according
>to locale, including the very lossy ASCII decode for C/POSIX.  But if
>setlocale has not been called, then input is by default decoded according
>to ISO-8859-1, preserving the actual octets.  It would probably be most
>sensible that, if setlocale hasn't been called, getenv should likewise
>decode according to ISO-8859-1.  It might also be sensible to offer
>some explicit control over the encoding to be used with the environment,
>just as I/O ports have a concept of per-port selected encoding.

In the light of what I've learned recently about Guile's locale handling,
this needs some revision.  What I thought was a well-defined "setlocale
not called" state is a mirage.  The encoding of ports is not reliably
fixed at ISO-8859-1; per bug#22910 it can be affected by ostensibly
read-only calls to setlocale, and seems to be only accidentally
ISO-8859-1 until that's done.  So that's not a good model.  Due to the
GUILE_INSTALL_LOCALE mechanism, a program wanting no locale selected
can't just never call setlocale in write mode.  So setlocale not having
been called is not really available as a way to control anything.

So it would seem to be necessary to use some explicit control of character
encoding for environment access.  (This must be control of encoding
per se, not merely of which locale to use for environment access,
because, as I noted in the original report, there's no guarantee of a
locale with a suitable encoding.)  This could be an optional parameter
to the environment access functions, or a settable variable that takes
precedence over locale to determine encoding for all environment access.
The latter would match the encoding model used by ports.

-zefram





  parent reply	other threads:[~2016-03-04 23:22 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-16  4:17 bug#20822: environment mangled by locale Zefram
2015-06-16  6:26 ` John Darrington
2015-06-16 20:03   ` Andreas Rottmann
2015-06-16 20:50     ` John Darrington
2016-03-04 23:22 ` Zefram [this message]
2016-06-24  5:57 ` Andy Wingo
2016-06-26  1:10   ` Mark H Weaver
2016-06-26 10:33     ` Zefram

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/guile/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160304232230.GA13009@fysh.org \
    --to=zefram@fysh.org \
    --cc=20822@debbugs.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).