From: Mike Gran
Newsgroups: gmane.lisp.guile.devel
Subject: Re: Unicode, ports and encoding
Date: Tue, 17 Feb 2009 15:45:32 -0800 (PST)
Message-ID: <559772.471.qm@web37903.mail.mud.yahoo.com>
References: <550226.89448.qm@web37908.mail.mud.yahoo.com> <87ocx0hgpv.fsf@gnu.org>
To: guile-devel@gnu.org

> From: Ludovic Courtès
>> Mike Gran writes:

> > This implies that a source code file should have syntax to
> > indicate its own encoding, if it is not ASCII.  Something akin to
> > the line in HTML files.
>
> One could imagine special treatment of, say, the first 10 lines of a
> file, with the ability to recognize Emacs file variables like
> "-*- coding: utf-8 -*-" and to change the current port transcoder
> accordingly, something like that.

Yeah.  Something like that.

> IIRC, the first step you suggested was the implementation of wide
> string/char types.  Did you also work on this?

Sort of.

I thought I could start there, but it isn't easy.  A lot could be
broken by modifying string processing, so I tried writing some tests
first to check my work as I go along.  But the tests have to be
non-ASCII, so they need to be converted when they are read in.
It gets a little circular to use scm_from_locale_string to convert
the non-ASCII strings in the test source and then have the test check
the behavior of scm_from_locale_string.

So now I think a better route is to make some kind of simplified
transcoding system available to ports, so that non-ASCII tests are
read in correctly.  From there, one can work up toward wide strings
and chars while checking the work along the way.

Thanks,

Mike Gran
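
For illustration only, here is a rough sketch of the coding-declaration idea discussed above: look at the first 10 lines of a source file for an Emacs-style "-*- coding: NAME -*-" variable, then decode the file with that encoding. It is written in Python rather than against Guile's port code, and the names detect_source_encoding and read_source are hypothetical, as is the choice to fall back to ASCII when no declaration is found or the encoding name is unknown.

```python
import codecs
import re

# Matches an Emacs-style file variable such as "-*- coding: utf-8 -*-",
# including forms like "-*- mode: scheme; coding: utf-8 -*-".
CODING_RE = re.compile(rb"-\*-.*?\bcoding:\s*([-\w.]+).*?-\*-")


def detect_source_encoding(data: bytes, max_lines: int = 10,
                           default: str = "ascii") -> str:
    """Return the encoding declared in the first `max_lines` lines.

    Hypothetical helper: falls back to `default` when there is no
    declaration or when the declared name is not a known codec.
    """
    for line in data.splitlines()[:max_lines]:
        match = CODING_RE.search(line)
        if match:
            name = match.group(1).decode("ascii")
            try:
                # Normalize the name and verify that the codec exists.
                return codecs.lookup(name).name
            except LookupError:
                return default
    return default


def read_source(data: bytes) -> str:
    """Decode raw source bytes using the declared (or default) encoding."""
    return data.decode(detect_source_encoding(data))
```

A test file that starts with ";; -*- coding: utf-8 -*-" would then be decoded as UTF-8 before any string comparison runs, which sidesteps the circularity of relying on the locale conversion being tested.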