From: ludo@gnu.org (Ludovic Courtès)
To: guile-devel@gnu.org
Newsgroups: gmane.lisp.guile.devel
Subject: Re: Unicode, ports and encoding
Date: Tue, 17 Feb 2009 22:54:36 +0100
Message-ID: <87ocx0hgpv.fsf@gnu.org>

Hello!

Mike Gran writes:

> 1. To move to a Unicode-enabled guile, text information needs to be
>    converted to an internal representation when read and converted
>    back to the locale when written.  Most reading and writing for
>    ports passes through scm_getc (input) and scm_lfwrite (output).
>    Conversion between locale strings and internal strings should
>    happen there.

One strategy could be to have a new C port API, e.g., roughly based on
R6RS's, with transcoders and all, and somehow arrange to have the
current port "API" mapped to that new shiny API.  It might be a bit
ambitious, though.

>    This implies that a source code file should have syntax to
>    indicate its own encoding, if it is not ASCII.  Something akin to
>    the line in HTML files.

One could imagine special treatment of, say, the first 10 lines of a
file, with the ability to recognize Emacs file variables like
"-*- coding: utf-8 -*-" and to change the current port transcoder
accordingly, something like that.

By default, the encoding used by `read' would be determined by the
input port's encoder.

> 3. The text encoding of a port needs to be associated with the port.
>    R6RS has the idea of transcoders for ports that require
>    conversion.  It is daunting, but, having played with some ideas
>    for a few weeks, it seems that at least a subset of the transcoder
>    functionality needs to be implemented for this to make any sense.

Yes.

> I sent in my copyright assignment last week, so you should have it
> now.

Cool!

IIRC, the first step you suggested was the implementation of wide
string/char types.  Did you also work on this?

Thanks,
Ludo'.
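
For illustration only, the idea of scanning the first 10 lines for an
Emacs-style "-*- coding: ... -*-" declaration and then picking the
decoder accordingly could be sketched roughly as below.  This is Python
rather than Guile's C, and it is not Guile's port API: the names
detect_coding and open_source, the regular expression, and the "ascii"
fallback are all assumptions made for this sketch.

```python
import io
import re

# Emacs-style file variable, e.g. ";; -*- coding: utf-8 -*-".
# The pattern is an assumption for this sketch, not Emacs' exact grammar.
CODING_RE = re.compile(rb"-\*-.*?coding:\s*([A-Za-z0-9._-]+).*?-\*-")


def detect_coding(raw: bytes, max_lines: int = 10, default: str = "ascii") -> str:
    """Return the encoding declared in the first `max_lines` lines of `raw`.

    Falls back to `default` (a hypothetical stand-in for whatever the
    input port would otherwise use) when no declaration is found.
    """
    for line in raw.splitlines()[:max_lines]:
        match = CODING_RE.search(line)
        if match:
            return match.group(1).decode("ascii")
    return default


def open_source(raw: bytes) -> io.TextIOWrapper:
    """Wrap the bytes in a text stream that decodes with the detected encoding."""
    return io.TextIOWrapper(io.BytesIO(raw), encoding=detect_coding(raw))
```

Here the declaration is read from the raw bytes before any decoding
takes place, which is why the scan works on ASCII-compatible encodings
only; the text stream plays the role that a port transcoder would play.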