From: Thien-Thi Nguyen
Newsgroups: gmane.lisp.guile.devel
Subject: Re: Internal visibility
Date: Wed, 11 Jun 2008 09:49:26 +0200
Message-ID: <87r6b42yyh.fsf@ambire.localdomain>
In-Reply-To: <878wxdze2q.fsf@gnu.org>
To: ludo@gnu.org (Ludovic Courtès)
Cc: guile-devel@gnu.org

() ludo@gnu.org (Ludovic Courtès)
() Tue, 10 Jun 2008 14:09:33 +0200

   Currently, Guile only supports `scm_to_locale_string ()', which
   means the returned C string is encoded in the current locale's
   encoding.  Eventually, new functions may be added:
   `scm_to_utf8_string ()', etc.  This was Marius' original plan [0],
   and I think it remains valid.

Most plans are "valid" but not all plans are easy to live with.

I think the encoding of a string (or buffer or "character" array (or
subsequence thereof)) needs to be explicit; the encoding is not
purely "internal" and to treat it as such will require hoop-jumping
on both sides of the API.  (How encoding support is implemented, on
the other hand, is indeed an internal affair.)

This is from observation of how Emacs attained multibyte-ness.
Note: not just "how Emacs does it" but "how Emacs used to not do it
and through time eventually came to do it".

In PostgreSQL's multibyte support, the i/o can be tempered by
setting the "client encoding".  This can be changed cheaply (per
request).  Basing encoding on locale only is not fine-grained
enough; setting the locale can be expensive and cause unrelated
changes.

See also GNU libc support (info "(libc) Character Set Handling"),
which applies similar principles at a lower (library) level.

All these programs chose not to expose many conversion functions in
the programming interface.  Instead, they expose few functions, each
with an encoding parameter.  That is surely a cleaner design.

thi
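
To make the "few functions, each with an encoding parameter" shape concrete, here is a minimal sketch in C built on the libc `iconv' interface mentioned above.  The name `convert_string' and its signature are hypothetical (nothing like it exists in Guile or libc); it only illustrates how one entry point taking encoding names could stand in for a family of `scm_to_ENCODING_string ()' functions.  It assumes a NUL-terminated input and does a single-shot conversion with a generous output bound.

```c
#include <assert.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of Guile or libc.
   Convert the NUL-terminated string SRC from encoding FROM to
   encoding TO.  Return a malloc'd buffer (caller frees) with a
   trailing NUL byte, storing its length in bytes in *OUT_LEN when
   OUT_LEN is non-NULL.  Return NULL on any failure.  */
char *
convert_string (const char *src, const char *from, const char *to,
                size_t *out_len)
{
  assert (src != NULL && from != NULL && to != NULL);

  /* Note the argument order: target encoding first.  */
  iconv_t cd = iconv_open (to, from);
  if (cd == (iconv_t) -1)
    return NULL;                /* unsupported encoding pair */

  size_t in_left = strlen (src);
  /* Assumed bound: at most 4 output bytes per input byte, plus
     room for a byte-order mark.  */
  size_t cap = 4 * in_left + 4;
  char *out = malloc (cap + 1);
  if (out == NULL)
    {
      iconv_close (cd);
      return NULL;
    }

  /* iconv takes a non-const char **, but does not modify the input.  */
  char *in_ptr = (char *) src;
  char *out_ptr = out;
  size_t out_left = cap;

  size_t rc = iconv (cd, &in_ptr, &in_left, &out_ptr, &out_left);
  if (rc != (size_t) -1)
    /* Flush any pending shift state for stateful encodings.  */
    rc = iconv (cd, NULL, NULL, &out_ptr, &out_left);
  iconv_close (cd);

  if (rc == (size_t) -1)
    {
      free (out);               /* invalid input or buffer too small */
      return NULL;
    }

  *out_ptr = '\0';
  if (out_len != NULL)
    *out_len = (size_t) (out_ptr - out);
  return out;
}
```

A caller would then write, say, `convert_string (s, "ISO-8859-1", "UTF-8", &n)' and pick the encoding per call, with no need to touch the process locale.  For wide encodings such as UTF-16 the single trailing NUL byte is not a real terminator, so the length in *OUT_LEN is what counts there.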