From: Chris Vine
Newsgroups: gmane.lisp.guile.user
Subject: Re: gh_repl
Date: Tue, 10 Jan 2012 20:35:09 +0000
Message-ID: <20120110203509.162c9340@laptop.homenet>
In-Reply-To: <87obuca837.fsf@netris.org>
To: Mark H Weaver
Cc: Guile User

On Mon, 09 Jan 2012 16:18:04 -0500
Mark H Weaver wrote:
> Mike Gran writes:
> >    scm_from_locale_symbol ("scheme"));
>
> Note that it's good practice to always use `scm_from_utf8_symbol' or
> `scm_from_latin1_symbol' when the argument is a C string literal.  The
> choice of which (`utf8' or `latin1') depends on the encoding of your C
> source file.

Unless guile does something clever, I think it would depend on the
encoding of the narrow character execution character set, which may not
be the same as the source character set (§5.2.1/1 and 5.2.1.2/1 of
C11).  The execution character set (the encoding appearing in the
binary) is implementation defined according to C99/11.
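
As a rough illustration of the point about the execution character set,
the sketch below looks at the bytes the compiler actually placed in the
binary for a literal containing U+00E9.  The names `sample',
`classify_e_acute' and `literal_encoding' are made up for this example
and are not part of Guile or of any library.  The literal is written
with a universal character name so that the source file stays pure
ASCII and the input encoding cannot influence the result.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* U+00E9 (e with acute accent) written as a universal character name.
   The compiler converts it to the narrow execution character set, so
   the bytes stored here show which encoding ended up in the binary. */
static const char sample[] = "\u00e9";

/* Hypothetical helper: classify a byte sequence that is meant to encode
   U+00E9.  Only the two encodings discussed in this thread are
   recognised; anything else is reported as "other". */
const char *classify_e_acute(const unsigned char *bytes, size_t len)
{
    assert(bytes != NULL);

    if (len == 2 && bytes[0] == 0xC3 && bytes[1] == 0xA9)
        return "utf8";      /* UTF-8 encodes U+00E9 as C3 A9 */
    if (len == 1 && bytes[0] == 0xE9)
        return "latin1";    /* ISO-8859-1 encodes U+00E9 as E9 */
    return "other";
}

/* Report the encoding the compiler used for the literal above. */
const char *literal_encoding(void)
{
    return classify_e_acute((const unsigned char *)sample, strlen(sample));
}
```

Printing the result of `literal_encoding()' from a small main() and
rebuilding with a different -fexec-charset value should show the
reported encoding following the compiler flag rather than the locale
the program is later run in.  Under that assumption, the result is also
a hint as to which of `scm_from_utf8_symbol' or `scm_from_latin1_symbol'
would match a non-ASCII literal in that particular build.
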

If using gcc, http://gcc.gnu.org/onlinedocs/cpp/Character-sets.html
suggests you should be OK in assuming UTF-8 as the default for the
encoding of the narrow character execution character set, provided that
-finput-charset is set to the correct input file encoding.  You can use
the -fexec-charset compiler flag to put something else in the binary
though.

The C standard refers to narrow and wide source character sets and
narrow and wide execution character sets.  gcc takes it a bit further
and first converts the encoding of the input files passed to it into
its own notion of the source character set.  One curiosity is that if
the input charset is not specified via -finput-charset, gcc appears to
try to obtain the locale character set to perform this conversion:

  "-finput-charset=charset: Set the input character set, used for
  translation from the character set of the input file to the source
  character set used by GCC.  If the locale does not specify, or GCC
  cannot get this information from the locale, the default is UTF-8.
  This can be overridden by either the locale or this command line
  option.  Currently the command line option takes precedence if
  there's a conflict.  charset can be any encoding supported by the
  system's iconv library routine."

This means that, with gcc, source code may not be portable unless
-finput-charset is passed to the compiler.  I avoid this by always
using ASCII (i.e. English) for string literals in source files and
obtaining translated text from gettext(), which deals with the
conversion programmatically and therefore portably.

The overarching point is that, as you say, it would be wrong to assume
the execution character set bears any relation to the locale encoding
of a particular user on a particular machine.

C++ works similarly (§2.2/5 of C++11).  We are not concerned with
Windows here, but if we were, I believe Visual Studio uses Windows ANSI
as the narrow character execution character set in C and C++.

Chris