From mboxrd@z Thu Jan 1 00:00:00 1970
From: Zefram
Newsgroups: gmane.lisp.guile.bugs
Subject: bug#20822: environment mangled by locale
Date: Fri, 4 Mar 2016 23:22:30 +0000
Message-ID: <20160304232230.GA13009@fysh.org>
References: <20150616041736.GA2718@fysh.org>
In-Reply-To: <20150616041736.GA2718@fysh.org>
To: 20822@debbugs.gnu.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

I wrote:
>There's an obvious parallel with reading data from an input port.
>If setlocale is called, then input is by default decoded according
>to locale, including the very lossy ASCII decode for C/POSIX.  But if
>setlocale has not been called, then input is by default decoded according
>to ISO-8859-1, preserving the actual octets.
>It would probably be most sensible that, if setlocale hasn't been
>called, getenv should likewise decode according to ISO-8859-1.  It might
>also be sensible to offer some explicit control over the encoding to be
>used with the environment, just as I/O ports have a concept of per-port
>selected encoding.

In the light of what I've learned recently about Guile's locale handling,
this needs some revision.  What I thought was a well-defined "setlocale
not called" state is a mirage.  The encoding of ports is not reliably
fixed at ISO-8859-1; per bug#22910 it can be affected by ostensibly
read-only calls to setlocale, and seems to be only accidentally ISO-8859-1
until that's done.  So that's not a good model.  Due to the
GUILE_INSTALL_LOCALE mechanism, a program wanting no locale selected
can't just never call setlocale in write mode.  So setlocale not having
been called is not really available as a way to control anything.

So it would seem to be necessary to use some explicit control of character
encoding for environment access.  (This must be control of encoding per
se, not merely of which locale to use for environment access, because,
as I noted in the original report, there's no guarantee of a locale with
a suitable encoding.)  This could be an optional parameter to the
environment access functions, or a settable variable that takes precedence
over locale to determine encoding for all environment access.  The latter
would match the encoding model used by ports.

-zefram
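
A rough sketch of the two options, written in Python rather than Guile
purely for illustration.  Nothing here is an existing Guile interface:
getenv_decoded, environment_encoding and fake_env are hypothetical names,
and errors="replace" merely stands in for whatever lossy behaviour a
real ASCII decode would have.  It shows that an ISO-8859-1 decode
preserves the octets, that an ASCII decode does not, and how an explicit
parameter or a settable variable could take precedence over the locale.

```python
import locale

# Hypothetical settable override; None means "fall back to the locale".
environment_encoding = None


def getenv_decoded(name, environ, encoding=None):
    """Look up `name` (bytes) in `environ` (bytes -> bytes) and decode it.

    Precedence: explicit `encoding` argument, then the module-level
    `environment_encoding`, then the locale's preferred encoding.
    Returns None if the variable is not set.
    """
    raw = environ.get(name)
    if raw is None:
        return None
    chosen = encoding or environment_encoding or locale.getpreferredencoding(False)
    # errors="replace" is an illustrative stand-in for a lossy decode.
    return raw.decode(chosen, errors="replace")


# Stand-in for the process environment, holding non-ASCII octets.
fake_env = {b"FOO": b"caf\xe9 \xff"}

# ISO-8859-1 maps every octet to one character, so the value round-trips.
as_latin1 = getenv_decoded(b"FOO", fake_env, encoding="iso-8859-1")
assert as_latin1.encode("iso-8859-1") == fake_env[b"FOO"]

# An ASCII decode cannot represent the high octets; they are lost.
as_ascii = getenv_decoded(b"FOO", fake_env, encoding="ascii")
assert as_ascii != as_latin1
```

Setting environment_encoding once would correspond to the settable-variable
option; passing encoding= on each call corresponds to the optional-parameter
option.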