unofficial mirror of guile-devel@gnu.org 
From: ludo@gnu.org (Ludovic Courtès)
To: guile-devel@gnu.org
Subject: Re: Unicode, ports and encoding
Date: Tue, 17 Feb 2009 22:54:36 +0100
Message-ID: <87ocx0hgpv.fsf@gnu.org>
In-Reply-To: 550226.89448.qm@web37908.mail.mud.yahoo.com

Hello!

Mike Gran <spk121@yahoo.com> writes:

> 1.  To move to a Unicode-enabled guile, text information needs to be
>     converted to an internal representation when read and converted
>     back to the locale when written.  Most reading and writing for
>     ports passes through scm_getc (input) and scm_lfwrite (output).
>     Conversion between locale strings and internal strings should
>     happen there.

One strategy could be to have a new C port API, e.g., roughly based on
R6RS', with transcoders and all, and somehow arrange to have the current
port "API" mapped to that new shiny API.  It might be a bit ambitious,
though.
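
To make the idea a bit more concrete, here is a rough sketch of a port
that carries a transcoder and a getc-style entry point layered on top of
it.  Every name in it (`struct port', `decode_fn', `port_getc',
`PORT_EOF', `PORT_ERROR') is made up for illustration and is not part of
Guile's actual port API.  The UTF-8 decoder only checks lead and
continuation bytes, so it does not reject overlong forms or surrogates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: none of these names exist in Guile.  */

/* A transcoder's decoder reads one character from BYTES (LEN bytes
   available), stores its code point in *CP, and returns the number of
   bytes consumed, or 0 on malformed or truncated input.  */
typedef size_t (*decode_fn) (const unsigned char *bytes, size_t len,
                             uint32_t *cp);

static size_t
decode_latin1 (const unsigned char *bytes, size_t len, uint32_t *cp)
{
  if (len < 1)
    return 0;
  *cp = bytes[0];               /* Latin-1 maps bytes to code points 1:1.  */
  return 1;
}

static size_t
decode_utf8 (const unsigned char *bytes, size_t len, uint32_t *cp)
{
  if (len < 1)
    return 0;

  unsigned char b = bytes[0];
  size_t need;
  uint32_t value;

  /* The lead byte determines the sequence length.  */
  if (b < 0x80)                { need = 1; value = b; }
  else if ((b & 0xE0) == 0xC0) { need = 2; value = b & 0x1F; }
  else if ((b & 0xF0) == 0xE0) { need = 3; value = b & 0x0F; }
  else if ((b & 0xF8) == 0xF0) { need = 4; value = b & 0x07; }
  else
    return 0;

  if (len < need)
    return 0;

  /* Each continuation byte contributes six bits.  */
  for (size_t i = 1; i < need; i++)
    {
      if ((bytes[i] & 0xC0) != 0x80)
        return 0;
      value = (value << 6) | (bytes[i] & 0x3F);
    }

  *cp = value;
  return need;
}

/* A port over an in-memory byte buffer, with its decoder attached.  */
struct port
{
  const unsigned char *buf;
  size_t len;
  size_t pos;
  decode_fn decode;
};

#define PORT_EOF   (-1L)
#define PORT_ERROR (-2L)

/* Return the next character of PORT as a code point, PORT_EOF at end of
   input, or PORT_ERROR if the bytes cannot be decoded.  */
static long
port_getc (struct port *port)
{
  assert (port != NULL && port->decode != NULL);

  if (port->pos >= port->len)
    return PORT_EOF;

  uint32_t cp;
  size_t used = port->decode (port->buf + port->pos,
                              port->len - port->pos, &cp);
  if (used == 0)
    return PORT_ERROR;

  port->pos += used;
  return (long) cp;
}
```

With something along these lines, callers of the existing getc-style
entry point would not need to know which encoding a port uses: the same
bytes 0xC3 0xA9 read as one character (U+00E9) through a UTF-8 port and
as two characters through a Latin-1 port.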

>     This implies that a source code file should have syntax to
>     indicate its own encoding, if it is not ASCII.  Something akin to
>     the <?xml encoding="utf-8"?> line in HTML files.

One could imagine special treatment of, say, the first 10 lines of a
file, with the ability to recognize Emacs file variables like
"-*- coding: utf-8 -*-" and to change the current port transcoder
accordingly, something like that.
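
As a minimal sketch of that scan, assuming the source text is already in
a NUL-terminated buffer: the helper names `find_in' and
`find_coding_declaration' and the `MAX_SCAN_LINES' limit are
hypothetical, not existing Guile functions, and only the one-line
"-*- ... -*-" form is handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SCAN_LINES 10       /* Assumed limit, per the "first 10 lines".  */

/* Return the first occurrence of NEEDLE within [START, END), or NULL.  */
static const char *
find_in (const char *start, const char *end, const char *needle)
{
  size_t n = strlen (needle);
  for (const char *p = start; (size_t) (end - p) >= n; p++)
    if (memcmp (p, needle, n) == 0)
      return p;
  return NULL;
}

/* Hypothetical helper: scan the first MAX_SCAN_LINES lines of TEXT for
   an Emacs-style "-*- coding: NAME -*-" declaration.  On success, copy
   NAME into OUT (of size OUT_SIZE) and return 1; otherwise return 0.  */
static int
find_coding_declaration (const char *text, char *out, size_t out_size)
{
  assert (text != NULL && out != NULL && out_size > 0);

  const char *line = text;
  for (int n = 0; n < MAX_SCAN_LINES && *line != '\0'; n++)
    {
      const char *eol = strchr (line, '\n');
      const char *end = eol ? eol : line + strlen (line);

      /* Both "-*-" markers must appear on the same line.  */
      const char *open = find_in (line, end, "-*-");
      const char *close = open ? find_in (open + 3, end, "-*-") : NULL;
      const char *key = close ? find_in (open + 3, close, "coding:") : NULL;

      if (key != NULL)
        {
          const char *p = key + strlen ("coding:");
          while (p < close && (*p == ' ' || *p == '\t'))
            p++;

          /* The name runs up to whitespace, ';', or the closing marker.  */
          size_t len = 0;
          while (p + len < close
                 && p[len] != ' ' && p[len] != '\t' && p[len] != ';')
            len++;

          if (len > 0 && len < out_size)
            {
              memcpy (out, p, len);
              out[len] = '\0';
              return 1;
            }
        }

      if (eol == NULL)
        break;
      line = eol + 1;
    }
  return 0;
}
```

The reader would call this once before parsing and, if a name is found,
switch the port's transcoder; how that switch happens depends on the
port API, so it is left out here.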

By default, the encoding used by `read' would be determined by the
input port's transcoder.

> 3.  The text encoding of a port needs to be associated with the port.
>     R6RS has the idea of transcoders for ports that require
>     conversion.  It is daunting, but, having played with some ideas for a
>     few weeks, it seems that at least a subset of the transcoder
>     functionality needs to be implemented for this to make any sense.

Yes.

> I sent in my copyright assignment last week, so you should have it
> now.

Cool!

IIRC, the first step you suggested was the implementation of wide
string/char types.  Did you also work on this?

Thanks,
Ludo'.




