* bug#20200: GUILE 2.0.11: open-bytevector-input-port fails to open in binary mode
From: David Kastrup @ 2015-03-25 14:31 UTC (permalink / raw)
To: 20200
Run the following code in a UTF-8-capable locale:
[-- Attachment #2: bad.scm --]
(setlocale LC_ALL "")
(use-modules (rnrs io ports) (rnrs bytevectors) (ice-9 format))
(let ((p (open-bytevector-input-port
          (u8-list->bytevector '(#xc3 #x9f #xc3 #X9f)))))
  (format #t "~a ~a\n" (port-encoding p) (binary-port? p))
  (format #t "#x~x\n" (char->integer (read-char p)))
  (format #t "~a ~a\n" (port-encoding p) (binary-port? p))
  (set-port-encoding! p "ISO-8859-1")
  (format #t "~a ~a\n" (port-encoding p) (binary-port? p))
  (format #t "#x~x\n" (char->integer (read-char p)))
  (format #t "~a ~a\n" (port-encoding p) (binary-port? p)))
This results in the output
#f #t
#xdf
#f #t
ISO-8859-1 #f
#xc3
ISO-8859-1 #f
The manual, however, states:
-- Scheme Procedure: port-encoding port
-- C Function: scm_port_encoding (port)
Returns, as a string, the character encoding that PORT uses to
interpret its input and output. The value ‘#f’ is equivalent to
‘"ISO-8859-1"’.
That would appear to be false since the value #f here is treated as
equivalent to "UTF-8" rather than "ISO-8859-1".
In addition, the manual states
-- Scheme Procedure: binary-port? port
Return ‘#t’ if PORT is a "binary port", suitable for binary data
input/output.
Note that internally Guile does not differentiate between binary
and textual ports, unlike the R6RS. Thus, this procedure returns
true when PORT does not have an associated encoding—i.e., when
‘(port-encoding PORT)’ is ‘#f’ (*note port-encoding: Ports.). This
is the case for ports returned by R6RS procedures such as
‘open-bytevector-input-port’ and ‘make-custom-binary-output-port’.
However, Guile currently does not prevent use of textual I/O
procedures such as ‘display’ or ‘read-char’ with binary ports.
Doing so “upgrades” the port from binary to textual, under the
ISO-8859-1 encoding. Likewise, Guile does not prevent use of
‘set-port-encoding!’ on a binary port, which also turns it into a
“textual” port.
But it would appear that the only way to actually get binary-encoded
read-char behavior is to switch the port to textual. While the port is
in "binary" mode, it decodes its input as UTF-8 rather than delivering
binary data. Nor does it automatically switch itself away from the
nominal #f encoding, which is not the encoding actually in effect.
Putting (with-fluids ((%default-port-encoding #f)) ...) around the
open-bytevector-input-port call results in the output
#f #t
#xc3
ISO-8859-1 #f
ISO-8859-1 #f
#x9f
ISO-8859-1 #f
which actually corresponds to the documentation.
--
David Kastrup
From: Mark H Weaver @ 2015-03-26 22:57 UTC (permalink / raw)
To: David Kastrup; +Cc: 20200
David Kastrup <dak@gnu.org> writes:
> Run the following code in a UTF-8-capable locale:
>
> (setlocale LC_ALL "")
> (use-modules (rnrs io ports) (rnrs bytevectors) (ice-9 format))
> (let ((p (open-bytevector-input-port
>           (u8-list->bytevector '(#xc3 #x9f #xc3 #X9f)))))
>   (format #t "~a ~a\n" (port-encoding p) (binary-port? p))
>   (format #t "#x~x\n" (char->integer (read-char p)))
>   (format #t "~a ~a\n" (port-encoding p) (binary-port? p))
>   (set-port-encoding! p "ISO-8859-1")
>   (format #t "~a ~a\n" (port-encoding p) (binary-port? p))
>   (format #t "#x~x\n" (char->integer (read-char p)))
>   (format #t "~a ~a\n" (port-encoding p) (binary-port? p)))
>
> This results in the output
> #f #t
> #xdf
> #f #t
> ISO-8859-1 #f
> #xc3
> ISO-8859-1 #f
>
> The manual, however, states:
>
> -- Scheme Procedure: port-encoding port
> -- C Function: scm_port_encoding (port)
> Returns, as a string, the character encoding that PORT uses to
> interpret its input and output. The value ‘#f’ is equivalent to
> ‘"ISO-8859-1"’.
>
> That would appear to be false since the value #f here is treated as
> equivalent to "UTF-8" rather than "ISO-8859-1".
This is indeed a bug, introduced in Guile 2.0.9. The workaround is to
explicitly set the encoding to "ISO-8859-1".
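
A minimal sketch of that workaround, assuming Guile 2.0.x with the
(rnrs io ports) and (rnrs bytevectors) modules. The helper name
open-latin1-bytevector-port is hypothetical and not part of Guile; it
only wraps the two calls already used in the report:

```scheme
(use-modules (rnrs io ports) (rnrs bytevectors))

;; Hypothetical helper (not a Guile procedure): open a bytevector input
;; port and pin its encoding explicitly, so that read-char maps each
;; byte to one character instead of decoding UTF-8 sequences.
(define (open-latin1-bytevector-port bv)
  (let ((p (open-bytevector-input-port bv)))
    (set-port-encoding! p "ISO-8859-1")
    p))

(define p (open-latin1-bytevector-port
           (u8-list->bytevector '(#xc3 #x9f #xc3 #x9f))))

;; With the encoding pinned, the first character is the first byte.
(define first-char-code (char->integer (read-char p)))

;; get-u8 from (rnrs io ports) reads the next raw byte and does not
;; depend on the port encoding at all.
(define next-byte (get-u8 p))
```

Here first-char-code should be #xc3 rather than #xdf, matching the
behavior the report shows once the encoding is set. Code that only needs
bytes can skip read-char entirely and use get-u8.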
Mark
From: Mark H Weaver @ 2015-03-28 20:13 UTC (permalink / raw)
To: David Kastrup; +Cc: 20200-done
Fixed in d574d96f879c147c6c14df43f2e4ff9e8a6876b9, which will be in
Guile 2.0.12. I'm closing this bug now.
Thanks,
Mark