From: ludo@gnu.org (Ludovic Courtès)
Newsgroups: gmane.lisp.guile.devel
Subject: Re: need: scm_from_{utf8,latin1}_{string,symbol,keyword}
Date: Mon, 06 Sep 2010 19:02:00 +0200
Message-ID: <877hiy3iwn.fsf@gnu.org>
To: guile-devel@gnu.org

Hello,

Andy Wingo writes:

> However, when we have literals in C source code, I think this strategy
> is incorrect. I write my C source code in UTF-8 or in ISO-8859-1, but if
> the user is running in another locale, they will not load my
> strings/symbols/keywords correctly.

Actually locale encodings are typically ASCII-compatible (info
"(libunistring) Locale encodings"), so it’s rarely (never?) a problem in
practice.

> The solution is to use functions that specify the locale. We don't have
> those yet, but we do have the capability to write them
> now.
> Specifically:
>
>   scm_from_utf8_string
>   scm_from_utf8_symbol
>   scm_from_utf8_keyword
>
>   scm_from_latin1_string
>   scm_from_latin1_symbol
>   scm_from_latin1_keyword

The ‘latin1’ family should be easy to implement and that’s what we’d use
in our C code.

[...]

> For example, most GLib-based libraries expect utf-8 strings, but
> Guile-GNOME ignorantly passes them the result of calling
> scm_to_locale_string. Though this will work in UTF-8 locales, it's only
> by accident.

When using (system foreign), one can use:

  (bytevector->pointer (string->utf8 "foo"))

or similar.

Besides, there’s the undocumented ‘scm_from_stringn’ and the internal
‘scm_to_stringn’, which can convert from/to any encoding. I think they
were initially kept internal because we weren’t quite sure about the
API. Mike?

Perhaps it’d be enough to make these two functions public and
documented, and add the ‘latin1’ family?

Thanks,
Ludo’.