unofficial mirror of bug-guile@gnu.org 
* Please clarify docs for open-file procedure (in trunk)
@ 2011-08-10 17:27 b3timmons
  2011-08-18  9:14 ` Andy Wingo
  0 siblings, 1 reply; 2+ messages in thread
From: b3timmons @ 2011-08-10 17:27 UTC
  To: bug-guile

Hi,

I think the trunk documentation for the open-file procedure (in
doc/ref/api-io.texi) needs clarification, especially for newcomers to
encoding issues such as myself.

In particular, consider the description for the binary flag b:

----------------------------------------------------------------------
@item b
Use binary mode.  On DOS systems the default text mode converts CR+LF
in the file to newline for the program, whereas binary mode reads and
writes all bytes unchanged.  On Unix-like systems there is no such
distinction, text files already contain just newlines and no
conversion is ever made.  The @code{b} flag is accepted on all
systems, but has no effect on Unix-like systems.

(For reference, Guile leaves text versus binary up to the C library,
@code{b} here just adds @code{O_BINARY} to the underlying @code{open}
call, when that flag is available.)

Also, open the file using the 8-bit character encoding "ISO-8859-1",
ignoring any coding declaration or port encoding.
...
----------------------------------------------------------------------

I stopped reading at "has no effect on Unix-like systems", thinking that
the b flag would not affect reading my binary data.  Yet, as the final
paragraph explains, it does affect the encoding used to open the file.
How about something like:

----------------------------------------------------------------------
@item b
Use binary mode.  In general this might affect handling of line endings
and file encodings.

Regarding line endings, on DOS systems the default text mode converts
CR+LF in the file to newline for the program, whereas binary mode reads
and writes all bytes unchanged.  On Unix-like systems there is no such
distinction, text files already contain just newlines and no conversion
is ever made.  The @code{b} flag is accepted on all systems, but has no
effect on Unix-like systems.

(For reference, Guile leaves text versus binary up to the C library,
@code{b} here just adds @code{O_BINARY} to the underlying @code{open}
call, when that flag is available.)

Regarding file encodings, a file opened in binary mode uses the 8-bit
character encoding "ISO-8859-1", ignoring any coding declaration or port
encoding.
----------------------------------------------------------------------

A bit of redundancy like this might help newbies such as myself avoid a
misunderstanding here.
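
To make the encoding effect concrete, here is a minimal Guile sketch (it
is not part of the proposed patch).  It assumes Guile 2.0 or later, the
(ice-9 binary-ports) module, and a hypothetical scratch file named
/tmp/open-file-demo.bin.  If the documented behavior holds, the port
opened with the b flag should report "ISO-8859-1", and each byte should
come back as one character with the same code point, even on a Unix-like
system:

```scheme
(use-modules (ice-9 binary-ports)   ; put-bytevector
             (rnrs bytevectors))    ; u8-list->bytevector

;; Hypothetical scratch file, used only for this illustration.
(define file-name "/tmp/open-file-demo.bin")

;; Write three raw bytes; 233 is outside the ASCII range.
(let ((out (open-file file-name "wb")))
  (put-bytevector out (u8-list->bytevector '(72 105 233)))
  (close-port out))

;; Reopen with the b flag and record the encoding of the new port.
(define in (open-file file-name "rb"))
(define encoding (port-encoding in))

;; Read characters until end of file and collect their code points.
(define char-codes
  (let loop ((c (read-char in)) (acc '()))
    (if (eof-object? c)
        (reverse acc)
        (loop (read-char in) (cons (char->integer c) acc)))))
(close-port in)

(display encoding) (newline)
(write char-codes) (newline)
```

The point of the sketch is that the b flag matters for the port's
encoding regardless of how the platform treats line endings.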

I should also point out a grammatical mistake further on:

----------------------------------------------------------------------
When the file is opened, this procedure will scan for a coding
declaration (@pxref{Character Encoding of Source Files}). If present
will use that encoding for interpreting the file.  Otherwise, the
port's encoding will be used.  To suppress this behavior, open
the file in binary mode and then set the port encoding explicitly
using @code{set-port-encoding!}.
----------------------------------------------------------------------

The middle of the paragraph contains the following sentence, which is
missing its subject:
"If present will use that encoding for interpreting the file."

How about: "If it is found, the corresponding encoding will be used to
interpret the file." ?
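
For readers unsure what the last sentence of that paragraph means in
practice, here is a small Guile sketch of opening a file in binary mode
and then setting the port encoding explicitly.  It assumes Guile 2.0 or
later and the (ice-9 binary-ports) module; the scratch file name and the
helper read-code-points are hypothetical names chosen for this
illustration.  The file holds the two bytes that encode U+00E9 in UTF-8:

```scheme
(use-modules (ice-9 binary-ports)   ; put-bytevector
             (rnrs bytevectors))    ; u8-list->bytevector

;; Hypothetical scratch file, used only for this illustration.
(define file-name "/tmp/open-file-utf8-demo.txt")

;; Bytes 195 169 are the UTF-8 encoding of U+00E9.
(let ((out (open-file file-name "wb")))
  (put-bytevector out (u8-list->bytevector '(195 169)))
  (close-port out))

;; Hypothetical helper: read every character from PORT, close it,
;; and return the list of code points.
(define (read-code-points port)
  (let loop ((c (read-char port)) (acc '()))
    (if (eof-object? c)
        (begin (close-port port) (reverse acc))
        (loop (read-char port) (cons (char->integer c) acc)))))

;; Binary mode alone: each byte is read as its own character.
(define as-binary
  (read-code-points (open-file file-name "rb")))

;; Binary mode, then an explicit encoding set before any read.
(define as-utf-8
  (let ((in (open-file file-name "rb")))
    (set-port-encoding! in "UTF-8")
    (read-code-points in)))

(write as-binary) (newline)
(write as-utf-8) (newline)
```

With binary mode alone the two bytes should come back as two
characters; after set-port-encoding! they should decode to the single
code point 233.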

Thanks,
Bake


