From: Mark H Weaver
Newsgroups: gmane.lisp.guile.devel
Subject: Adding new information to scm_t_port (was Re: always O_BINARY?)
Date: Wed, 27 Feb 2013 22:24:19 -0500
Message-ID: <877gltxgrg.fsf_-_@tines.lan>
References: <87vc9ij5z0.fsf@pobox.com> <87fw0l2yyk.fsf@gnu.org>
To: ludo@gnu.org (Ludovic Courtès)
Cc: guile-devel@gnu.org

ludo@gnu.org (Ludovic Courtès) writes:

> Andy Wingo skribis:
>
>> The (newline) function can write CRLF
>> The ~% format directive should DTRT
>> read-line should DTRT
>
> IMO the correct abstraction here is transcoders à la R6RS.

Agreed.

> The problem is that scm_t_port doesn’t have any slot to specify the
> EOL style, but it would need one.

I think it's important that we find a way to add new information to
scm_t_port in 2.0.  We also need this to properly fix the BOM issue.

Here's a proposal: let's slightly redefine the meaning of 'input_cd' and
'output_cd'.  Users are already unable to use these, because in the
common case (UTF-8) they are both -1.

Instead of having 'input_cd' and 'output_cd' point directly to the
platform's iconv_t structures, let's have them point to our own internal
structure(s) that hold the needed transcoder state.  This could include
things like the state for internally implemented encoding(s) (e.g. UTF-8
BOM handling), EOL style, and iconv_t pointer(s) if appropriate.

What do you think?

     Mark
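
To make the proposal above concrete, here is a minimal sketch in C of what such an internal structure might look like.  Every name in it (sketch_transcoder, sketch_port, sketch_eol_style, and the helper functions) is hypothetical and does not exist in libguile; sketch_port merely stands in for the two scm_t_port fields under discussion, on the assumption that they can hold an arbitrary pointer.

```c
#include <assert.h>
#include <iconv.h>
#include <stdlib.h>

/* All names below are hypothetical; none of them exist in libguile. */

typedef enum { SKETCH_EOL_LF, SKETCH_EOL_CRLF } sketch_eol_style;

/* Internal transcoder state that 'input_cd' / 'output_cd' would point to,
   instead of pointing directly at an iconv_t.  */
typedef struct
{
  iconv_t cd;            /* (iconv_t) -1 when iconv is not used, e.g. UTF-8 */
  sketch_eol_style eol;  /* end-of-line convention for this direction */
  int at_start;          /* nonzero until the first byte is transferred;
                            this is where BOM handling would hook in */
} sketch_transcoder;

/* Stand-in for the two scm_t_port fields (assumed to be plain pointers). */
typedef struct
{
  void *input_cd;
  void *output_cd;
} sketch_port;

/* Allocate transcoder state with no iconv descriptor attached.
   Returns NULL if allocation fails.  */
static sketch_transcoder *
sketch_make_transcoder (sketch_eol_style eol)
{
  sketch_transcoder *t = malloc (sizeof *t);
  if (t == NULL)
    return NULL;
  t->cd = (iconv_t) -1;
  t->eol = eol;
  t->at_start = 1;
  return t;
}

/* Bytes that a newline operation would emit on PORT's output side.  */
static const char *
sketch_newline_bytes (const sketch_port *port)
{
  const sketch_transcoder *t = port->output_cd;
  assert (t != NULL);
  return t->eol == SKETCH_EOL_CRLF ? "\r\n" : "\n";
}
```

With something of this shape, code that writes a newline would consult the port's output transcoder rather than hard-coding "\n", and the common UTF-8 case would still carry EOL and BOM state even though no iconv descriptor is open.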