From: ludo@gnu.org (Ludovic Courtès)
Newsgroups: gmane.lisp.guile.bugs
Subject: bug#30066: 'get-bytevector-some' returns only 1 byte from unbuffered ports
Date: Thu, 11 Jan 2018 15:34:17 +0100
Message-ID: <87o9m08nx2.fsf@gnu.org>
References: <87zi5lrc3x.fsf@gnu.org> <87tvvtr9ge.fsf@gnu.org> <87fu7dptdn.fsf@igalia.com>
In-Reply-To: <87fu7dptdn.fsf@igalia.com> (Andy Wingo's message of "Wed, 10 Jan 2018 17:32:04 +0100")
To: Andy Wingo
Cc: 30066@debbugs.gnu.org

Hello,

Andy Wingo wrote:

> There are tabs in your code; would you mind doing only spaces?
>
> A port being unbuffered doesn't mean that it has no bytes in its
> buffer. In particular, scm_unget_bytes may put bytes back into the
> buffer. Or, peek-u8 might fill this buffer with one byte.
>
> Also, the port may have buffered write bytes (could be the port has
> write buffering but no read buffering). In that case (pt->rw_random)
> you need to scm_flush().
>
> I suggest taking the buffered bytes from the read buffer, if any. Then
> if the port is unbuffered, make a bytevector and call scm_i_read_bytes;
> otherwise do the scm_fill_input path that's there already.
>
> One more thing, if the port goes EOF, you need to
> scm_port_buffer_set_has_eof_p.

I think the attached patch addresses these issues. WDYT?

Thanks for the review!

Ludo’.

[0001-get-bytevector-some-reads-as-much-as-possible-withou.patch (text/x-patch, inline): the patch]

From d3a60bac6c6aae62ced6eec21b3865caaab83bb8 Mon Sep 17 00:00:00 2001
From: Ludovic Courtès
Date: Thu, 11 Jan 2018 15:29:55 +0100
Subject: [PATCH] 'get-bytevector-some' reads as much as possible without blocking.

Fixes .

* libguile/ports.c (scm_i_read_bytes): Remove 'static' keyword.
* libguile/ports.h (SCM_UNBUFFEREDP): New macro.
(scm_i_read_bytes): New declaration.
* libguile/r6rs-ports.c (scm_get_bytevector_some): When PORT is
unbuffered, invoke 'scm_i_read_bytes' to read as much as we can.
* test-suite/tests/r6rs-ports.test ("8.2.8 Binary Input")
["get-bytevector-some [unbuffered port]"]
["get-bytevector-some [unbuffered port, lookahead-u8]"]
["get-bytevector-some [unbuffered port, unget-bytevector]"]: New tests.
---
 libguile/ports.c                 |  6 ++++--
 libguile/ports.h                 |  7 ++++++-
 libguile/r6rs-ports.c            | 34 ++++++++++++++++++++++++++++------
 test-suite/tests/r6rs-ports.test | 36 ++++++++++++++++++++++++++++++++++--
 4 files changed, 72 insertions(+), 11 deletions(-)

diff --git a/libguile/ports.c b/libguile/ports.c
index 72bb73a01..002dd1433 100644
--- a/libguile/ports.c
+++ b/libguile/ports.c
@@ -1,4 +1,4 @@
-/* Copyright (C) 1995-2001, 2003-2004, 2006-2017
+/* Copyright (C) 1995-2001, 2003-2004, 2006-2018
  * Free Software Foundation, Inc.
  *
  * This library is free software; you can redistribute it and/or
@@ -1543,7 +1543,9 @@ scm_peek_byte_or_eof (SCM port)
   return peek_byte_or_eof (port, &buf, &cur);
 }
 
-static size_t
+/* Like read(2), read *up to* COUNT bytes from PORT into DST, starting
+   at OFFSET.  Return 0 upon EOF.  */
+size_t
 scm_i_read_bytes (SCM port, SCM dst, size_t start, size_t count)
 {
   size_t filled;
diff --git a/libguile/ports.h b/libguile/ports.h
index d131db5be..3fe64c27d 100644
--- a/libguile/ports.h
+++ b/libguile/ports.h
@@ -4,7 +4,7 @@
 #define SCM_PORTS_H
 
 /* Copyright (C) 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2003, 2004,
- * 2006, 2008, 2009, 2010, 2011, 2012, 2013, 2014 Free Software Foundation, Inc.
+ * 2006, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2018 Free Software Foundation, Inc.
  *
  * This library is free software; you can redistribute it and/or
  * modify it under the terms of the GNU Lesser General Public License
@@ -71,7 +71,10 @@ SCM_INTERNAL SCM scm_i_port_weak_set;
 #define SCM_CLOSEDP(x) (!SCM_OPENP (x))
 #define SCM_CLR_PORT_OPEN_FLAG(p) \
   SCM_SET_CELL_WORD_0 ((p), SCM_CELL_WORD_0 (p) & ~SCM_OPN)
+
 #ifdef BUILDING_LIBGUILE
+#define SCM_UNBUFFEREDP(x) \
+  (SCM_PORTP (x) && (SCM_CELL_WORD_0 (x) & SCM_BUF0))
 #define SCM_PORT_FINALIZING_P(x) \
   (SCM_CELL_WORD_0 (x) & SCM_F_PORT_FINALIZING)
 #define SCM_SET_PORT_FINALIZING(p) \
@@ -185,6 +188,8 @@ SCM_API int scm_get_byte_or_eof (SCM port);
 SCM_API int scm_peek_byte_or_eof (SCM port);
 SCM_API size_t scm_c_read (SCM port, void *buffer, size_t size);
 SCM_API size_t scm_c_read_bytes (SCM port, SCM dst, size_t start, size_t count);
+SCM_INTERNAL size_t scm_i_read_bytes (SCM port, SCM dst, size_t start,
+                                      size_t count);
 SCM_API scm_t_wchar scm_getc (SCM port);
 SCM_API SCM scm_read_char (SCM port);
 
diff --git a/libguile/r6rs-ports.c b/libguile/r6rs-ports.c
index e944c7aab..a3d638ca0 100644
--- a/libguile/r6rs-ports.c
+++ b/libguile/r6rs-ports.c
@@ -1,4 +1,4 @@
-/* Copyright (C) 2009, 2010, 2011, 2013-2015 Free Software Foundation, Inc.
+/* Copyright (C) 2009, 2010, 2011, 2013-2015, 2018 Free Software Foundation, Inc.
  *
  * This library is free software; you can redistribute it and/or
  * modify it under the terms of the GNU Lesser General Public License
@@ -481,9 +481,9 @@ SCM_DEFINE (scm_get_bytevector_some, "get-bytevector-some", 1, 0, 0,
             "position to point just past these bytes.")
 #define FUNC_NAME s_scm_get_bytevector_some
 {
-  SCM buf;
+  SCM buf, bv;
   size_t cur, avail;
-  SCM bv;
+  const size_t max_buffer_size = 4096;
 
   SCM_VALIDATE_BINARY_INPUT_PORT (1, port);
 
@@ -494,9 +494,31 @@ SCM_DEFINE (scm_get_bytevector_some, "get-bytevector-some", 1, 0, 0,
       return SCM_EOF_VAL;
     }
 
-  bv = scm_c_make_bytevector (avail);
-  scm_port_buffer_take (buf, (scm_t_uint8 *) SCM_BYTEVECTOR_CONTENTS (bv),
-                        avail, cur, avail);
+  if (SCM_UNBUFFEREDP (port) && (avail < max_buffer_size))
+    {
+      /* PORT is unbuffered.  Read as much as possible from PORT.  */
+      size_t read;
+
+      bv = scm_c_make_bytevector (max_buffer_size);
+      scm_port_buffer_take (buf, (scm_t_uint8 *) SCM_BYTEVECTOR_CONTENTS (bv),
+                            avail, cur, avail);
+
+      read = scm_i_read_bytes (port, bv, avail,
+                               SCM_BYTEVECTOR_LENGTH (bv) - avail);
+
+      if (read == 0)
+        scm_port_buffer_set_has_eof_p (buf, SCM_BOOL_F);
+
+      if (read + avail < SCM_BYTEVECTOR_LENGTH (bv))
+        bv = scm_c_shrink_bytevector (bv, read + avail);
+    }
+  else
+    {
+      /* Return what's already buffered.  */
+      bv = scm_c_make_bytevector (avail);
+      scm_port_buffer_take (buf, (scm_t_uint8 *) SCM_BYTEVECTOR_CONTENTS (bv),
+                            avail, cur, avail);
+    }
 
   return bv;
 }
diff --git a/test-suite/tests/r6rs-ports.test b/test-suite/tests/r6rs-ports.test
index ba3131f2e..d5476f20e 100644
--- a/test-suite/tests/r6rs-ports.test
+++ b/test-suite/tests/r6rs-ports.test
@@ -1,6 +1,6 @@
 ;;;; r6rs-ports.test --- R6RS I/O port tests. -*- coding: utf-8; -*-
 ;;;;
-;;;; Copyright (C) 2009-2012, 2013-2015 Free Software Foundation, Inc.
+;;;; Copyright (C) 2009-2012, 2013-2015, 2018 Free Software Foundation, Inc.
 ;;;; Ludovic Courtès
 ;;;;
 ;;;; This library is free software; you can redistribute it and/or
@@ -26,7 +26,8 @@
   #:use-module (rnrs io ports)
   #:use-module (rnrs io simple)
   #:use-module (rnrs exceptions)
-  #:use-module (rnrs bytevectors))
+  #:use-module (rnrs bytevectors)
+  #:use-module ((ice-9 binary-ports) #:select (unget-bytevector)))
 
 (define-syntax pass-if-condition
   (syntax-rules ()
@@ -183,6 +184,37 @@
           (equal? (bytevector->u8-list bv)
                   (map char->integer (string->list str))))))
 
+  (pass-if-equal "get-bytevector-some [unbuffered port]"
+      (string->utf8 "Hello, world!")
+    ;; 'get-bytevector-some' used to return a single byte, see
+    ;; .
+    (call-with-input-string "Hello, world!"
+      (lambda (port)
+        (setvbuf port _IONBF)
+        (get-bytevector-some port))))
+
+  (pass-if-equal "get-bytevector-some [unbuffered port, lookahead-u8]"
+      (string->utf8 "Hello, world!")
+    (call-with-input-string "Hello, world!"
+      (lambda (port)
+        (setvbuf port _IONBF)
+
+        ;; 'lookahead-u8' fills in PORT's 1-byte buffer.  Yet,
+        ;; 'get-bytevector-some' should return the whole thing.
+        (and (eqv? (lookahead-u8 port) (char->integer #\H))
+             (get-bytevector-some port)))))
+
+  (pass-if-equal "get-bytevector-some [unbuffered port, unget-bytevector]"
+      (string->utf8 "Hello")
+    (call-with-input-string "Hello, world!"
+      (lambda (port)
+        (setvbuf port _IONBF)
+        ;; 'unget-bytevector' fills the putback buffer, and
+        ;; 'get-bytevector-some' should get data from there.
+        (unget-bytevector port (get-bytevector-all port)
+                          0 5)
+        (get-bytevector-some port))))
+
   (pass-if "get-bytevector-all"
     (let* ((str "GNU Guile")
            (index 0)
-- 
2.15.1
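
Illustrative note: the read strategy discussed in this message (take whatever
is already in the read buffer, then, for an unbuffered port, read more from
the underlying source and mark EOF when nothing comes back) can be modelled
outside libguile. The sketch below is only a toy model under stated
assumptions: `struct toy_port`, `toy_read`, `toy_peek` and `toy_get_some` are
hypothetical names invented for illustration and are not part of Guile's API;
the in-memory source stands in for a real file descriptor.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a port: a small read buffer in front of an
   in-memory byte source.  None of these names exist in libguile.  */
struct toy_port
{
  const unsigned char *src;     /* underlying data source */
  size_t src_len;               /* total number of bytes in SRC */
  size_t src_pos;               /* index of the next unread byte in SRC */
  unsigned char buf[16];        /* read buffer (peeked bytes live here) */
  size_t buf_len;               /* number of valid bytes in BUF */
  int unbuffered;               /* non-zero for an unbuffered port */
  int has_eof;                  /* set once the source reported EOF */
};

/* Like read(2): copy *up to* COUNT bytes from the source into DST and
   return the number of bytes copied, 0 upon EOF.  */
size_t
toy_read (struct toy_port *port, unsigned char *dst, size_t count)
{
  size_t left = port->src_len - port->src_pos;
  size_t n = count < left ? count : left;

  memcpy (dst, port->src + port->src_pos, n);
  port->src_pos += n;
  return n;
}

/* Peek at the next byte without consuming it.  On an empty buffer this
   pulls one byte into BUF, which is how an "unbuffered" port can still
   hold a buffered byte.  Return the byte, or -1 upon EOF.  */
int
toy_peek (struct toy_port *port)
{
  if (port->buf_len == 0)
    {
      if (toy_read (port, port->buf, 1) == 0)
        {
          port->has_eof = 1;
          return -1;
        }
      port->buf_len = 1;
    }
  return port->buf[0];
}

/* Model of the strategy: first take the bytes already buffered; then,
   if the port is unbuffered (or nothing was buffered), read as much as
   the source gives us, up to MAX bytes in total.  Return the number of
   bytes stored in OUT, 0 upon EOF.  */
size_t
toy_get_some (struct toy_port *port, unsigned char *out, size_t max)
{
  size_t total = port->buf_len;

  assert (max >= total);        /* OUT must hold the buffered bytes */

  memcpy (out, port->buf, total);
  port->buf_len = 0;

  if ((port->unbuffered || total == 0) && total < max)
    {
      size_t n = toy_read (port, out + total, max - total);

      if (n == 0)
        port->has_eof = 1;      /* remember that the source hit EOF */
      total += n;
    }

  return total;
}
```

In this model, peeking one byte from an unbuffered port over "Hello, world!"
and then calling `toy_get_some` with a 64-byte output area yields all 13
bytes, mirroring the "unbuffered port, lookahead-u8" test in the patch; a
second call returns 0 and sets `has_eof`. The fixed 16-byte buffer and the
single `max` cap are simplifications, not a description of how libguile sizes
its buffers.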