From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andy Wingo
Newsgroups: gmane.lisp.guile.bugs
Subject: Re: Please clarify docs for open-file procedure (in trunk)
Date: Thu, 18 Aug 2011 11:14:48 +0200
Message-ID: <877h6bhyif.fsf@pobox.com>
References: <87bovxjhx8.fsf@goof.bjgalaxy>
In-Reply-To: <87bovxjhx8.fsf@goof.bjgalaxy> (b3timmons@speedymail.org's message of "Wed, 10 Aug 2011 13:27:31 -0400")
To: b3timmons@speedymail.org
Cc: bug-guile@gnu.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.3 (gnu/linux)
List-Id: "Bug reports for GUILE, GNU's Ubiquitous Extension Language"
Xref: news.gmane.org gmane.lisp.guile.bugs:5784

Hi Bake,

On Wed 10 Aug 2011 19:27, b3timmons@speedymail.org writes:

> I think the documentation (in trunk) for the open-file procedure (in
> file doc/ref/api-io.texi) needs
> clarification, especially for newbies to
> encoding issues such as myself.

Thanks for the report. I have rewritten it a bit, following your
suggestions. The changeset is below.

Cheers,

Andy

commit 5261e74281b1150e3b2594c92e571d8887a4900d
Author: Andy Wingo
Date:   Thu Aug 18 11:13:34 2011 +0200

    reword open-file docs

    * doc/ref/api-io.texi (File Ports): Refactor open-file docs. Thanks to
      Bake Timmons for the report.

diff --git a/doc/ref/api-io.texi b/doc/ref/api-io.texi
index 19c0665..afcde57 100644
--- a/doc/ref/api-io.texi
+++ b/doc/ref/api-io.texi
@@ -838,34 +838,34 @@ setvbuf}
 Add line-buffering to the port. The port output buffer will be
 automatically flushed whenever a newline character is written.
 @item b
-Use binary mode. On DOS systems the default text mode converts CR+LF
-in the file to newline for the program, whereas binary mode reads and
-writes all bytes unchanged. On Unix-like systems there is no such
-distinction, text files already contain just newlines and no
-conversion is ever made. The @code{b} flag is accepted on all
-systems, but has no effect on Unix-like systems.
-
-(For reference, Guile leaves text versus binary up to the C library,
-@code{b} here just adds @code{O_BINARY} to the underlying @code{open}
-call, when that flag is available.)
-
-Also, open the file using the 8-bit character encoding "ISO-8859-1",
-ignoring any coding declaration or port encoding.
-
-Note that, when reading or writing binary data with ports, the
-bytevector ports in the @code{(rnrs io ports)} module are preferred,
-as they return vectors, and not strings (@pxref{R6RS I/O Ports}).
+Use binary mode, ensuring that each byte in the file will be read as one
+Scheme character.
+
+To provide this property, the file will be opened with the 8-bit
+character encoding "ISO-8859-1", ignoring any coding declaration or port
+encoding. @xref{Ports}, for more information on port encodings.
+
+Note that while it is possible to read and write binary data as
+characters or strings, it is usually better to treat bytes as octets,
+and byte sequences as bytevectors. @xref{R6RS Binary Input}, and
+@ref{R6RS Binary Output}, for more.
+
+This option had another historical meaning, for DOS compatibility: in
+the default (textual) mode, DOS reads a CR-LF sequence as one LF byte.
+The @code{b} flag prevents this from happening, adding @code{O_BINARY}
+to the underlying @code{open} call. Still, the flag is generally useful
+because of its port encoding ramifications.
 @end table
 
 If a file cannot be opened with the access requested,
 @code{open-file} throws an exception.
 
 When the file is opened, this procedure will scan for a coding
-declaration (@pxref{Character Encoding of Source Files}). If present
-will use that encoding for interpreting the file. Otherwise, the
-port's encoding will be used. To suppress this behavior, open
-the file in binary mode and then set the port encoding explicitly
-using @code{set-port-encoding!}.
+declaration (@pxref{Character Encoding of Source Files}). If a coding
+declaration is found, it will be used to interpret the file. Otherwise,
+the port's encoding will be used. To suppress this behavior, open the
+file in binary mode and then set the port encoding explicitly using
+@code{set-port-encoding!}.
 
 In theory we could create read/write ports which were buffered in
 one direction only. However this isn't included in the
-- 
http://wingolog.org/