(r6rs io ports)

* (r6rs io ports)
@ 2010-04-04 19:24 Mike Gran
  2010-04-05 15:16 ` Andy Wingo
  2010-04-10 11:06 ` Ludovic Courtès
  0 siblings, 2 replies; 11+ messages in thread
From: Mike Gran @ 2010-04-04 19:24 UTC (permalink / raw)
  To: guile-devel

Hi-

I've been thinking about the (r6rs io ports) module.  It is a fairly big
undertaking, but, the challenges are similar to the Unicode update to
the current port system.

First, let me note the ways that r6rs ports and Guile legacy ports are
different:

1.  Encoding is a property of the port vs a property of a transcoder

    R6RS has transcoder objects that describe a conversion as well
    as the state of the conversion.  These can be attached to ports
    or removed from ports.

2.  Pushback vs lookahead

    Guile legacy ports implement an ungetc operation.  You can
    always push a character back onto a port, even if the underlying
    port doesn't support it.  It does this by implementing a
    pushback buffer for each port.  Characters that are 'ungotten'
    go into the pushback buffer, and the next getc checks the
    pushback buffer first.  The pushback code is rather complex.

    This behaviour is vital to operation of the legacy Guile parser,
    which uses 'ungetc' repeatedly.

    (I suppose this would allow Guile to parse code from a pipe,
    but, I've never tried that.)

    R6RS ports instead have the concept of 'lookahead' but no
    'ungetc'.  Characters cannot be pushed back onto a port, but
    one can look ahead to see what bytes or characters are next.
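
A minimal sketch of the two styles, assuming Guile's existing
'unread-char' and 'peek-char' on a string port.  The procedure names
'demo-pushback' and 'demo-lookahead' are made up for illustration and
are not part of any proposed API:

```scheme
;; Pushback style: consume a character, then push it back onto the port.
(define (demo-pushback str)
  (call-with-input-string str
    (lambda (port)
      (let ((c (read-char port)))
        (unread-char c port)            ; goes into the pushback buffer
        (list c (read-char port))))))   ; the same character is read again

;; Lookahead style: inspect the next character without consuming it.
(define (demo-lookahead str)
  (call-with-input-string str
    (lambda (port)
      (let* ((peeked (peek-char port))
             (c1 (read-char port))
             (c2 (read-char port)))
        (list peeked c1 c2)))))
```

In both cases the first character is seen twice; the difference is
whether the port has to accept a character being handed back to it.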

Anyway.  On to implementation...

There are 4 types of R6RS ports: file, bytevector, string, and custom.
Each port is either binary or textual, and not both at the same time.

So one possibility is the following....

1. Expose scheme functions that are a set of low level file operations
that bypass Guile legacy ports and scm_getc.

2. Build the binary R6RS port system in Scheme using these functions,
and the bytevector and string functions that already exist.

(There will be no pushback buffer or intermediate storage of port data.
If a file supports random access or rewind, the lookahead operation will
succeed.  Otherwise, it will fail.)

3. Create a transcoder object in C that holds encoding and conversion
strategies, similar to those that we have attached to Guile legacy
ports.  Add methods to the transcoder object to do conversion between
bytevectors and strings.

4. Build textual R6RS ports in Scheme on top of the binary R6RS port 
system and the transcoder object.

5. And, ultimately, create a 'pushback buffer' object that supports
an ungetc operation.  Rebuild Guile legacy ports as a combination 
of R6RS ports and a pushback buffer object.
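
As a rough illustration of step 5, a pushback buffer can be layered
over a port that only offers reading and lookahead.  This is a sketch
under assumptions, not the proposed design: 'make-pushback-reader' and
its message names ('get', 'unget', 'peek') are hypothetical, and it
wraps an ordinary Guile character port for simplicity:

```scheme
;; Returns a closure that dispatches on an operation symbol.
(define (make-pushback-reader port)
  (let ((pushed '()))                   ; stack of 'ungotten' characters
    (lambda (op . args)
      (case op
        ((get)
         ;; Check the pushback buffer first, then the underlying port.
         (if (null? pushed)
             (read-char port)
             (let ((c (car pushed)))
               (set! pushed (cdr pushed))
               c)))
        ((unget)
         (set! pushed (cons (car args) pushed)))
        ((peek)
         (if (null? pushed)
             (peek-char port)
             (car pushed)))
        (else (error "unknown operation" op))))))
```

Characters come back in last-in, first-out order, so the most recently
ungotten character is the next one returned by 'get'.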

-Mike

* Re: (r6rs io ports)
  2010-04-04 19:24 (r6rs io ports) Mike Gran
@ 2010-04-05 15:16 ` Andy Wingo
  2010-04-10 11:06 ` Ludovic Courtès
  1 sibling, 0 replies; 11+ messages in thread
From: Andy Wingo @ 2010-04-05 15:16 UTC (permalink / raw)
  To: Mike Gran; +Cc: guile-devel

On Sun 04 Apr 2010 21:24, Mike Gran <spk121@yahoo.com> writes:

> I've been thinking about the (r6rs io ports) module.  It is a fairly big
> undertaking, but, the challenges are similar to the Unicode update to
> the current port system.

Thanks for the nice writeup, it was enlightening. The implementation
plan also sounds sensible to me, though I am too ignorant to criticise
specifics.

Andy
-- 
http://wingolog.org/




* Re: (r6rs io ports)
  2010-04-04 19:24 (r6rs io ports) Mike Gran
  2010-04-05 15:16 ` Andy Wingo
@ 2010-04-10 11:06 ` Ludovic Courtès
  2010-04-10 16:49   ` Mike Gran
  1 sibling, 1 reply; 11+ messages in thread
From: Ludovic Courtès @ 2010-04-10 11:06 UTC (permalink / raw)
  To: guile-devel

Hi Mike,

Mike Gran <spk121@yahoo.com> writes:

> First, let me note the ways that r6rs ports and Guile legacy ports are
> different:
>
> 1.  Encoding is a property of the port vs a property of a transcoder
>
>     R6RS has transcoder objects that describe a conversion as well
>     as the state of the conversion.

I wonder what the implications of having the conversion state in the
transcoder are.  For instance, what if the transcoder is shared among
several ports?  What if it’s passed from one port to another in the
middle of a conversion?

Section 8.2.4 of r6rs-lib says that transcoders are “possibly stateful”,
but it also says that they are immutable.  So I guess a possible
implementation is to have transcoders stateless and immutable, e.g.,

  (define (latin-1-codec)
    "ISO-8859-1")

  (define* (make-transcoder codec #:optional eol-style handling)
    (list codec eol-style handling))

If that is the case, implementing them on top of Guile’s ports should be
easier.  (The only thing is that there’s no distinction between binary
and textual Guile ports.)

>     These can be attached to ports or removed from ports.
>
> 2.  Pushback vs lookahead
>
>     Guile legacy ports implement an ungetc operation.  You can
>     always push a character back onto a port, even if the underlying
>     port doesn't support it.  It does this by implementing a
>     pushback buffer for each port.  Characters that are 'ungotten'
>     go into the pushback buffer, and the next getc checks the
>     pushback buffer first.  The pushback code is rather complex.
>
>     This behaviour is vital to operation of the legacy Guile parser,
>     which uses 'ungetc' repeatedly.
>
>     (I suppose this would allow Guile to parse code from a pipe,
>     but, I've never tried that.)
>
>     R6RS ports instead have the concept of 'lookahead' but no
>     'ungetc'.  Characters cannot be pushed back onto a port, but
>     one can look ahead to see what bytes or characters are next.

Guile has ‘scm_peek_char ()’ and ‘lookahead-u8’ is implemented in terms
of it in libguile/r6rs-ports.c.
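
For illustration, here is how binary lookahead looks from Scheme,
assuming the (rnrs io ports) and (rnrs bytevectors) modules are
available.  The procedure name 'demo-lookahead-u8' is made up for this
sketch:

```scheme
(use-modules (rnrs io ports) (rnrs bytevectors))

;; BYTES is a list of octets, e.g. '(1 2).
(define (demo-lookahead-u8 bytes)
  (let* ((port (open-bytevector-input-port (u8-list->bytevector bytes)))
         (peeked (lookahead-u8 port))   ; does not consume the octet
         (b1 (get-u8 port))             ; consumes the same octet
         (b2 (get-u8 port)))            ; consumes the following octet
    (list peeked b1 b2)))
```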

> Anyway.  On to implementation...
>
> There are 4 types of R6RS ports: file, bytevector, string, and custom.
> Each port is either binary or textual, and not both at the same time.
>
> So one possibility is the following....
>
> 1. Expose scheme functions that are a set of low level file operations
> that bypass Guile legacy ports and scm_getc.
>
> 2. Build the binary R6RS port system in Scheme using these functions,
> and the bytevector and string functions that already exist.
>
> (There will be no pushback buffer or intermediate storage of port data.
> If a file supports random access or rewind, the lookahead operation will
> succeed.  Otherwise, it will fail.)
>
> 3. Create a transcoder object in C that holds encoding and conversion
> strategies, similar to those that we have attached to Guile legacy
> ports.  Add methods to the transcoder object to do conversion between
> bytevectors and strings.
>
> 4. Build textual R6RS ports in Scheme on top of the binary R6RS port 
> system and the transcoder object.
>
> 5. And, ultimately, create a 'pushback buffer' object that supports
> an ungetc operation.  Rebuild Guile legacy ports as a combination 
> of R6RS ports and a pushback buffer object.

I may well be missing something, but how about this hopefully simpler
strategy:

  1. Transcoders are (roughly) as simple as suggested above.

  2. In r6rs-ports.c, when a transcoder is passed, just
     scm_set_port_encoding_x, etc. the port.

  3. Implement EOL handling in Guile ports.

  4. See whether/how binary and textual ports can be differentiated in
     Guile ports.

  5. Have fun making sure the functions raise the right R6RS error
     conditions instead of ‘system-error’ et al.
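
Steps 1 and 2 might look roughly like this at the Scheme level.  This
is only a sketch: 'make-simple-transcoder' and 'apply-transcoder!' are
hypothetical names, the mapping from R6RS error-handling modes to Guile
conversion strategies is an assumption, and EOL handling (step 3) is
not covered:

```scheme
;; A transcoder is just an immutable list: (codec eol-style handling).
(define (make-simple-transcoder codec eol-style handling)
  (list codec eol-style handling))

;; Configure an existing Guile port from a transcoder.
(define (apply-transcoder! port transcoder)
  (let ((codec (car transcoder))
        (handling (caddr transcoder)))
    (set-port-encoding! port codec)
    (case handling
      ;; Assumed mapping; other modes keep the port's current strategy.
      ((raise)   (set-port-conversion-strategy! port 'error))
      ((replace) (set-port-conversion-strategy! port 'substitute))
      (else #t))
    port))
```

Since the transcoder carries no conversion state, the same object can
be applied to several ports without them interfering with each other.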

How much sense does it make?  :-)

Thanks,
Ludo’.





* Re: (r6rs io ports)
  2010-04-10 11:06 ` Ludovic Courtès
@ 2010-04-10 16:49   ` Mike Gran
  2010-04-10 18:14     ` Ludovic Courtès
  0 siblings, 1 reply; 11+ messages in thread
From: Mike Gran @ 2010-04-10 16:49 UTC (permalink / raw)
  To: Ludovic Courtès, guile-devel

Hi-

> From: Ludovic Courtès ludo@gnu.org

> Section 8.2.4 of r6rs-lib says that transcoders are “possibly stateful”,
> but it also says that they are immutable.  So I guess a possible
> implementation is to have transcoders stateless and immutable, e.g.,

>  (define (latin-1-codec)
>    "ISO-8859-1")

>  (define* (make-transcoder codec #:optional eol-style handling)
>    (list codec eol-style handling))

> If that is the case, implementing them on top of Guile’s ports should be
> easier.  (The only thing is that there’s no distinction between binary
> and textual Guile ports.)

It would be easier.  When thinking about this, I was remembering or
mis-remembering that, back in 2009, you'd said something along the
lines of ultimately standardizing on the R6RS ports codebase, and that
I was to consider the work on Guile legacy ports as interim.

So, I suppose, all along, I've been thinking that ultimately we'd end
up with something like I suggested, with the code in r6rs-ports.c
being the source of the major port functionality.

If these days we like how the Guile legacy ports are performing and
want to build R6RS ports on them, that's comparatively easy.
In which case...

> I may well be missing something, but how about this hopefully simpler
> strategy:

>   1. Transcoders are (roughly) as simple as suggested above.

>   2. In r6rs-ports.c, when a transcoder is passed, just
>     scm_set_port_encoding_x, etc. the port.

>   3. Implement EOL handling in Guile ports.

>   4. See whether/how binary and textual ports can be differentiated in
>    Guile ports.

>   5. Have fun making sure the functions raise the right R6RS error
>    conditions instead of ‘system-error’ et al.

Works for me.  Some questions that will have to be answered.
Is there a C API for raising R6RS error conditions?  
Do we need to raise Guile legacy errors when accessing ports through
the legacy API and R6RS errors when accessing ports through the 
R6RS API?
What about R6RS buffering modes?

Thanks,

Mike

* Re: (r6rs io ports)
  2010-04-10 16:49   ` Mike Gran
@ 2010-04-10 18:14     ` Ludovic Courtès
  2010-04-10 19:03       ` Mike Gran
  2010-04-11  3:34       ` Julian Graham
  0 siblings, 2 replies; 11+ messages in thread
From: Ludovic Courtès @ 2010-04-10 18:14 UTC (permalink / raw)
  To: Mike Gran; +Cc: guile-devel

Hi Mike,

Mike Gran <spk121@yahoo.com> writes:

> It would be easier.  When thinking about this, I was remembering or
> mis-remembering that, back in 2009, you'd said something along the
> lines of ultimately standardizing on the R6RS ports codebase, and that
> I was to consider the work on Guile legacy ports as interim.
>
> So, I suppose, all along, I've been thinking that ultimately we'd end
> up with something like I suggested, with the code in r6rs-ports.c
> being the source of the major port functionality.
>
> If these days we like how the Guile legacy ports are performing and
> want to build R6RS ports on them, that's comparatively easy.
> In which case...

Heh, good point.  I don’t like the current port API: it’s low-level,
it’s C, it’s undocumented, it forces users to access Guile internals,
etc.  But it’s widely used, in Guile and outside.  If (rnrs io ports)
were to be included in 2.0 (though I don’t think it should be a
showstopper), it would seem safer to choose a solution that is simple
and mostly orthogonal to the rest of Guile core.

Perhaps the move to a new port API (probably based on that of R6RS) can
be left for 2.2?  Hopefully, we’ll be much less relying on C by then,
which should make things easier.

What do you think?

>> I may well be missing something, but how about this hopefully simpler
>> strategy:
>
>>   1. Transcoders are (roughly) as simple as suggested above.
>
>>   2. In r6rs-ports.c, when a transcoder is passed, just
>>     scm_set_port_encoding_x, etc. the port.
>
>>   3. Implement EOL handling in Guile ports.
>
>>   4. See whether/how binary and textual ports can be differentiated in
>>    Guile ports.
>
>>   5. Have fun making sure the functions raise the right R6RS error
>>    conditions instead of ‘system-error’ et al.
>
> Works for me.  Some questions that will have to be answered.
> Is there a C API for raising R6RS error conditions? 

No, not yet.  Actually, Julian’s work on R6RS libraries isn’t merged
yet.

> Do we need to raise Guile legacy errors when accessing ports through
> the legacy API and R6RS errors when accessing ports through the 
> R6RS API?

Ideally, yes, but it may be hard or impossible.  Needs to be
investigated.

> What about R6RS buffering modes?

Something along these lines:

  (define-syntax buffer-mode
    (syntax-rules (none line block)
      ((_ none)  _IONBF)
      ((_ line)  _IOLBF)
      ((_ block) _IOFBF)))
      
  (define* (open-file-input-port filename
                                 #:optional options buffer-mode
                                            maybe-transcoder)
    (let ((f (open-input-file filename)))
      (setvbuf f buffer-mode)
      f))

(With disjoint types, exception conversion, etc. :-))

Thanks,
Ludo’.




* Re: (r6rs io ports)
  2010-04-10 18:14     ` Ludovic Courtès
@ 2010-04-10 19:03       ` Mike Gran
  2010-04-10 19:45         ` Ludovic Courtès
  2010-04-11 21:38         ` Andy Wingo
  2010-04-11  3:34       ` Julian Graham
  1 sibling, 2 replies; 11+ messages in thread
From: Mike Gran @ 2010-04-10 19:03 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

Hi



> From: Ludovic Courtès ludo@gnu.org

>> It would be easier.  When thinking about this, I was remembering or
>> mis-remembering that, back in 2009, you'd said something along the
>> lines of ultimately standardizing on the R6RS ports codebase, and that
>> I was to consider the work on Guile legacy ports as interim.

[...]

> Heh, good point.  I don’t like the current port API: it’s low-level,
> it’s C, it’s undocumented, it forces users to access Guile internals,
> etc.  But it’s widely used, in Guile and outside.  If (rnrs io ports)
> were to be included in 2.0 (though I don’t think it should be a
> showstopper), it would seem safer to choose a solution that is simple
> and mostly orthogonal to the rest of Guile core.

> Perhaps the move to a new port API (probably based on that of R6RS) can
> be left for 2.2?  Hopefully, we’ll be much less relying on C by then,
> which should make things easier.

> What do you think?

I think that if you want to move to a new port codebase, there is no
need to add new features to the old one.

Personally, I have no pressing need for the R6RS ports in 2.0.  So I'd 
say it is better to push on with Doing The Right Thing, even if it 
(as we say in my business) "pushes the schedule to the right".

However, a counter argument might be that not having R6RS IO could be
a problem when marketing 2.0.

Thanks,

Mike




* Re: (r6rs io ports)
  2010-04-10 19:03       ` Mike Gran
@ 2010-04-10 19:45         ` Ludovic Courtès
  2010-04-11 23:01           ` Mike Gran
  2010-04-11 21:38         ` Andy Wingo
  1 sibling, 1 reply; 11+ messages in thread
From: Ludovic Courtès @ 2010-04-10 19:45 UTC (permalink / raw)
  To: Mike Gran; +Cc: guile-devel

Hi,

Mike Gran <spk121@yahoo.com> writes:

>> From: Ludovic Courtès ludo@gnu.org
>
>>> It would be easier.  When thinking about this, I was remembering or
>>> mis-remembering that, back in 2009, you'd said something along the
>>> lines of ultimately standardizing on the R6RS ports codebase, and that
>>> I was to consider the work on Guile legacy ports as interim.
>
> [...]
>
>> Heh, good point.  I don’t like the current port API: it’s low-level,
>> it’s C, it’s undocumented, it forces users to access Guile internals,
>> etc.  But it’s widely used, in Guile and outside.  If (rnrs io ports)
>> were to be included in 2.0 (though I don’t think it should be a
>> showstopper), it would seem safer to choose a solution that is simple
>> and mostly orthogonal to the rest of Guile core.
>
>> Perhaps the move to a new port API (probably based on that of R6RS) can
>> be left for 2.2?  Hopefully, we’ll be much less relying on C by then,
>> which should make things easier.
>
>> What do you think?
>
> I think that if you want to move to a new port codebase, there is no
> need to add new features to the old one.

I think that if we can provide (rnrs io ports) with a reasonably small
effort, then it’s probably worth it (the only new feature would be EOL
handling, AFAICS).

And it’d be nice to have better coverage of this API, anyway, since it
already provides features not available in the native port API, such as
binary I/O.

Thanks,
Ludo’.




* Re: (r6rs io ports)
  2010-04-10 18:14     ` Ludovic Courtès
  2010-04-10 19:03       ` Mike Gran
@ 2010-04-11  3:34       ` Julian Graham
  2010-04-11 21:40         ` Andy Wingo
  1 sibling, 1 reply; 11+ messages in thread
From: Julian Graham @ 2010-04-11  3:34 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

Hi Mike and Ludo,

>> Works for me.  Some questions that will have to be answered.
>> Is there a C API for raising R6RS error conditions?
>
> No, not yet.  Actually, Julian’s work on R6RS libraries isn’t merged
> yet.

FWIW, the way this works right now on the wip-r6rs-libraries branch is
that R6RS `raise' is implemented in terms of Guile's `throw' -- my
implementation just uses a wrapper record type that encapsulates the
original exception object for handling by R6RS exception handlers and
optionally stores a continuation in order to support
`raise-continuable'.  (I suppose I could move some of that code to C
if people thought it made sense.)
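
A much simplified sketch of that idea, not the code on the branch: the
record type, the throw key 'sketch-r6rs-exception' and the procedure
names are all hypothetical, and the handler here runs after unwinding,
which glosses over the exact R6RS handler semantics:

```scheme
(use-modules (srfi srfi-9))

;; Wrapper record carrying the raised object and, optionally, a
;; continuation (unused here; it would support raise-continuable).
(define-record-type <raise-wrapper>
  (make-raise-wrapper obj continuation)
  raise-wrapper?
  (obj raise-wrapper-obj)
  (continuation raise-wrapper-continuation))

;; 'raise' expressed with Guile's 'throw'.
(define (sketch-raise obj)
  (throw 'sketch-r6rs-exception (make-raise-wrapper obj #f)))

;; Install HANDLER for the dynamic extent of THUNK using 'catch'.
(define (sketch-with-exception-handler handler thunk)
  (catch 'sketch-r6rs-exception
    thunk
    (lambda (key wrapper)
      (handler (raise-wrapper-obj wrapper)))))
```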

Regards,
Julian

* Re: (r6rs io ports)
  2010-04-10 19:03       ` Mike Gran
  2010-04-10 19:45         ` Ludovic Courtès
@ 2010-04-11 21:38         ` Andy Wingo
  1 sibling, 0 replies; 11+ messages in thread
From: Andy Wingo @ 2010-04-11 21:38 UTC (permalink / raw)
  To: Mike Gran; +Cc: Ludovic Courtès, guile-devel

Hi!

On Sat 10 Apr 2010 21:03, Mike Gran <spk121@yahoo.com> writes:

> not having R6RS IO could be a problem when marketing 2.0.

I wouldn't worry about it. If it actually were a problem, 2.2 should
happen within a year.

Andy
-- 
http://wingolog.org/




* Re: (r6rs io ports)
  2010-04-11  3:34       ` Julian Graham
@ 2010-04-11 21:40         ` Andy Wingo
  0 siblings, 0 replies; 11+ messages in thread
From: Andy Wingo @ 2010-04-11 21:40 UTC (permalink / raw)
  To: Julian Graham; +Cc: Ludovic Courtès, guile-devel

Hi Julian,

On Sun 11 Apr 2010 05:34, Julian Graham <joolean@gmail.com> writes:

> R6RS `raise' is implemented in terms of Guile's `throw' -- my
> implementation just uses a wrapper record type that encapsulates the
> original exception object for handling by R6RS exception handlers and
> optionally stores a continuation in order to support
> `raise-continuable'. (I suppose I could move some of that code to C if
> people thought it made sense.)

No, that wouldn't make sense. You should be implementing raise and
raise-continuable in terms of call-with-prompt, abort-to-prompt, and
fluids, just as catch and throw are implemented. See boot-9.scm, or
"Prompts" in the manual, or
http://wingolog.org/archives/2010/02/14/sidelong-glimpses, or ask me if
you have questions :-)
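
To make the suggestion concrete, a minimal sketch using those
primitives.  All names below ('sketch-guard', 'sketch-raise',
'sketch-raise-continuable', '%sketch-handler') are hypothetical, and
the handler-stack bookkeeping that R6RS requires is left out:

```scheme
(use-modules (ice-9 control))

(define sketch-tag (make-prompt-tag "sketch-raise"))

;; Non-continuable raise: unwind to the nearest prompt with this tag.
(define (sketch-raise obj)
  (abort-to-prompt sketch-tag obj))

(define (sketch-guard handler thunk)
  (call-with-prompt sketch-tag
    thunk
    (lambda (k obj)                     ; K is the captured continuation
      (handler obj))))

;; Continuable raise: the current handler lives in a fluid and is
;; called in the dynamic context of the raise, without unwinding.
(define %sketch-handler (make-fluid))
(fluid-set! %sketch-handler
            (lambda (obj) (error "unhandled condition" obj)))

(define (sketch-with-exception-handler handler thunk)
  (with-fluids ((%sketch-handler handler))
    (thunk)))

(define (sketch-raise-continuable obj)
  ((fluid-ref %sketch-handler) obj))
```

The value returned by the handler becomes the value of the
'sketch-raise-continuable' call, so the computation carries on from the
point of the raise.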

Andy
-- 
http://wingolog.org/

* Re: (r6rs io ports)
  2010-04-10 19:45         ` Ludovic Courtès
@ 2010-04-11 23:01           ` Mike Gran
  0 siblings, 0 replies; 11+ messages in thread
From: Mike Gran @ 2010-04-11 23:01 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

Hi-

> From: Ludovic Courtès ludo@gnu.org

> I think that if we can provide (rnrs io ports) 
> with a reasonably small effort, then it’s probably
> worth it (the only new feature would be EOL
> handling, AFAICS).

I happen to have some R6RS EOL handling code 
lying around.  I'll try to push it this week.

- Mike




