unofficial mirror of bug-guile@gnu.org 
* bug#30066: 'get-bytevector-some' returns only 1 byte from unbuffered ports
@ 2018-01-10 15:02 Ludovic Courtès
  2018-01-10 15:59 ` Ludovic Courtès
  0 siblings, 1 reply; 15+ messages in thread
From: Ludovic Courtès @ 2018-01-10 15:02 UTC (permalink / raw)
  To: 30066; +Cc: Andy Wingo

As discussed on IRC, ‘get-bytevector-some’ returns only 1 byte from
unbuffered ports:

--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> (call-with-input-string "foo"
		       (lambda (port)
			 (setvbuf port _IONBF)
			 (get-bytevector-some port)))
$11 = #vu8(102)
scheme@(guile-user)> (version)
$12 = "2.2.3"
--8<---------------cut here---------------end--------------->8---

Strictly speaking it’s valid, but in practice it’s not very useful.

AFAICS, we lack a way to do the equivalent of:

  read (fd, buf, sizeof buf);

‘get-bytevector-n!’ is different because it blocks until it has read
COUNT bytes or EOF is reached.  So ‘get-bytevector-some’ could play this
role, but it doesn’t.
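
For illustration, here is a minimal C sketch of the two behaviors, using
plain POSIX read(2) rather than Guile's port API.  The helper names
'read_some' and 'read_fully' are made up for this example; they are not
libguile functions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* One read(2): return as soon as *some* bytes are available (possibly
   fewer than COUNT), 0 upon EOF, -1 upon error.  Hypothetical helper.  */
ssize_t
read_some (int fd, void *buf, size_t count)
{
  ssize_t n;

  assert (buf != NULL);
  do
    n = read (fd, buf, count);
  while (n < 0 && errno == EINTR);   /* retry if interrupted by a signal */

  return n;
}

/* Keep reading until COUNT bytes have been read or EOF is reached.  This
   is the blocking behavior described above for 'get-bytevector-n!'.  */
ssize_t
read_fully (int fd, void *buf, size_t count)
{
  size_t total = 0;

  assert (buf != NULL);
  while (total < count)
    {
      ssize_t n = read_some (fd, (char *) buf + total, count - total);

      if (n < 0)
        return -1;
      if (n == 0)
        break;                       /* EOF */
      total += (size_t) n;
    }

  return (ssize_t) total;
}
```

On a pipe that currently holds 3 bytes, 'read_some' with a 64-byte buffer
returns 3 right away, whereas 'read_fully' keeps blocking until it has 64
bytes or the writer closes its end.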

Thoughts?

Ludo’.






* bug#30066: 'get-bytevector-some' returns only 1 byte from unbuffered ports
  2018-01-10 15:02 bug#30066: 'get-bytevector-some' returns only 1 byte from unbuffered ports Ludovic Courtès
@ 2018-01-10 15:59 ` Ludovic Courtès
  2018-01-10 16:32   ` Andy Wingo
  0 siblings, 1 reply; 15+ messages in thread
From: Ludovic Courtès @ 2018-01-10 15:59 UTC (permalink / raw)
  To: 30066; +Cc: Andy Wingo

[-- Attachment #1: Type: text/plain, Size: 190 bytes --]

ludo@gnu.org (Ludovic Courtès) skribis:

> As discussed on IRC, ‘get-bytevector-some’ returns only 1 byte from
> unbuffered ports:

Here’s a tentative fix.  WDYT?

Ludo’.


[-- Attachment #2: Type: text/x-patch, Size: 4923 bytes --]

diff --git a/libguile/ports.c b/libguile/ports.c
index 72bb73a01..002dd1433 100644
--- a/libguile/ports.c
+++ b/libguile/ports.c
@@ -1,4 +1,4 @@
-/* Copyright (C) 1995-2001, 2003-2004, 2006-2017
+/* Copyright (C) 1995-2001, 2003-2004, 2006-2018
  * Free Software Foundation, Inc.
  *
  * This library is free software; you can redistribute it and/or
@@ -1543,7 +1543,9 @@ scm_peek_byte_or_eof (SCM port)
   return peek_byte_or_eof (port, &buf, &cur);
 }
 
-static size_t
+/* Like read(2), read *up to* COUNT bytes from PORT into DST, starting
+   at START.  Return 0 upon EOF.  */
+size_t
 scm_i_read_bytes (SCM port, SCM dst, size_t start, size_t count)
 {
   size_t filled;
diff --git a/libguile/ports.h b/libguile/ports.h
index d131db5be..7aeacc8f9 100644
--- a/libguile/ports.h
+++ b/libguile/ports.h
@@ -4,7 +4,7 @@
 #define SCM_PORTS_H
 
 /* Copyright (C) 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2003, 2004,
- *   2006, 2008, 2009, 2010, 2011, 2012, 2013, 2014 Free Software Foundation, Inc.
+ *   2006, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2018 Free Software Foundation, Inc.
  *
  * This library is free software; you can redistribute it and/or
  * modify it under the terms of the GNU Lesser General Public License
@@ -69,6 +69,7 @@ SCM_INTERNAL SCM scm_i_port_weak_set;
 #define SCM_OPOUTPORTP(x) (SCM_OPPORTP (x) && SCM_OUTPUT_PORT_P (x))
 #define SCM_OPENP(x) (SCM_OPPORTP (x))
 #define SCM_CLOSEDP(x) (!SCM_OPENP (x))
+#define SCM_UNBUFFEREDP(x) (SCM_PORTP (x) && (SCM_CELL_WORD_0 (x) & SCM_BUF0))
 #define SCM_CLR_PORT_OPEN_FLAG(p) \
   SCM_SET_CELL_WORD_0 ((p), SCM_CELL_WORD_0 (p) & ~SCM_OPN)
 #ifdef BUILDING_LIBGUILE
@@ -185,6 +186,8 @@ SCM_API int scm_get_byte_or_eof (SCM port);
 SCM_API int scm_peek_byte_or_eof (SCM port);
 SCM_API size_t scm_c_read (SCM port, void *buffer, size_t size);
 SCM_API size_t scm_c_read_bytes (SCM port, SCM dst, size_t start, size_t count);
+SCM_INTERNAL size_t scm_i_read_bytes (SCM port, SCM dst, size_t start,
+				      size_t count);
 SCM_API scm_t_wchar scm_getc (SCM port);
 SCM_API SCM scm_read_char (SCM port);
 
diff --git a/libguile/r6rs-ports.c b/libguile/r6rs-ports.c
index e944c7aab..a3a67f3ca 100644
--- a/libguile/r6rs-ports.c
+++ b/libguile/r6rs-ports.c
@@ -1,4 +1,4 @@
-/* Copyright (C) 2009, 2010, 2011, 2013-2015 Free Software Foundation, Inc.
+/* Copyright (C) 2009, 2010, 2011, 2013-2015, 2018 Free Software Foundation, Inc.
  *
  * This library is free software; you can redistribute it and/or
  * modify it under the terms of the GNU Lesser General Public License
@@ -487,16 +487,33 @@ SCM_DEFINE (scm_get_bytevector_some, "get-bytevector-some", 1, 0, 0,
 
   SCM_VALIDATE_BINARY_INPUT_PORT (1, port);
 
-  buf = scm_fill_input (port, 0, &cur, &avail);
-  if (avail == 0)
+  if (SCM_UNBUFFEREDP (port))
     {
-      scm_port_buffer_set_has_eof_p (buf, SCM_BOOL_F);
-      return SCM_EOF_VAL;
+      size_t read;
+
+      bv = scm_c_make_bytevector (4096);
+      read = scm_i_read_bytes (port, bv, 0, SCM_BYTEVECTOR_LENGTH (bv));
+
+      if (read == 0)
+	return SCM_EOF_VAL;
+      else if (read < SCM_BYTEVECTOR_LENGTH (bv))
+	return scm_c_shrink_bytevector (bv, read);
+      else
+	return bv;
     }
+  else
+    {
+      buf = scm_fill_input (port, 0, &cur, &avail);
+      if (avail == 0)
+	{
+	  scm_port_buffer_set_has_eof_p (buf, SCM_BOOL_F);
+	  return SCM_EOF_VAL;
+	}
 
-  bv = scm_c_make_bytevector (avail);
-  scm_port_buffer_take (buf, (scm_t_uint8 *) SCM_BYTEVECTOR_CONTENTS (bv),
-                        avail, cur, avail);
+      bv = scm_c_make_bytevector (avail);
+      scm_port_buffer_take (buf, (scm_t_uint8 *) SCM_BYTEVECTOR_CONTENTS (bv),
+			    avail, cur, avail);
+    }
 
   return bv;
 }
diff --git a/test-suite/tests/r6rs-ports.test b/test-suite/tests/r6rs-ports.test
index ba3131f2e..7450b7217 100644
--- a/test-suite/tests/r6rs-ports.test
+++ b/test-suite/tests/r6rs-ports.test
@@ -1,6 +1,6 @@
 ;;;; r6rs-ports.test --- R6RS I/O port tests.   -*- coding: utf-8; -*-
 ;;;;
-;;;; Copyright (C) 2009-2012, 2013-2015 Free Software Foundation, Inc.
+;;;; Copyright (C) 2009-2012, 2013-2015, 2018 Free Software Foundation, Inc.
 ;;;; Ludovic Courtès
 ;;;;
 ;;;; This library is free software; you can redistribute it and/or
@@ -183,6 +183,15 @@
            (equal? (bytevector->u8-list bv)
                    (map char->integer (string->list str))))))
 
+  (pass-if-equal "get-bytevector-some [unbuffered port]"
+      (string->utf8 "Hello, world!")
+    ;; 'get-bytevector-some' used to return a single byte, see
+    ;; <https://bugs.gnu.org/30066>.
+    (call-with-input-string "Hello, world!"
+      (lambda (port)
+        (setvbuf port _IONBF)
+        (get-bytevector-some port))))
+
   (pass-if "get-bytevector-all"
     (let* ((str   "GNU Guile")
            (index 0)


* bug#30066: 'get-bytevector-some' returns only 1 byte from unbuffered ports
  2018-01-10 15:59 ` Ludovic Courtès
@ 2018-01-10 16:32   ` Andy Wingo
  2018-01-10 16:58     ` Nala Ginrut
  2018-01-11 14:34     ` Ludovic Courtès
  0 siblings, 2 replies; 15+ messages in thread
From: Andy Wingo @ 2018-01-10 16:32 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 30066

On Wed 10 Jan 2018 16:59, ludo@gnu.org (Ludovic Courtès) writes:

> ludo@gnu.org (Ludovic Courtès) skribis:
>
>> As discussed on IRC, ‘get-bytevector-some’ returns only 1 byte from
>> unbuffered ports:
>
> Here’s a tentative fix.  WDYT?

Thanks!  Needs a little work though :)  Comments inline.

> --- a/libguile/ports.h
> +++ b/libguile/ports.h
> @@ -69,6 +69,7 @@ SCM_INTERNAL SCM scm_i_port_weak_set;
>  #define SCM_OPOUTPORTP(x) (SCM_OPPORTP (x) && SCM_OUTPUT_PORT_P (x))
>  #define SCM_OPENP(x) (SCM_OPPORTP (x))
>  #define SCM_CLOSEDP(x) (!SCM_OPENP (x))
> +#define SCM_UNBUFFEREDP(x) (SCM_PORTP (x) && (SCM_CELL_WORD_0 (x) & SCM_BUF0))
>  #define SCM_CLR_PORT_OPEN_FLAG(p) \
>    SCM_SET_CELL_WORD_0 ((p), SCM_CELL_WORD_0 (p) & ~SCM_OPN)
>  #ifdef BUILDING_LIBGUILE

Please guard this under #ifdef BUILDING_LIBGUILE.

> @@ -487,16 +487,33 @@ SCM_DEFINE (scm_get_bytevector_some, "get-bytevector-some", 1, 0, 0,
>  
>    SCM_VALIDATE_BINARY_INPUT_PORT (1, port);
>  
> -  buf = scm_fill_input (port, 0, &cur, &avail);
> -  if (avail == 0)
> +  if (SCM_UNBUFFEREDP (port))
>      {
> -      scm_port_buffer_set_has_eof_p (buf, SCM_BOOL_F);
> -      return SCM_EOF_VAL;
> +      size_t read;
> +
> +      bv = scm_c_make_bytevector (4096);
> +      read = scm_i_read_bytes (port, bv, 0, SCM_BYTEVECTOR_LENGTH (bv));
> +
> +      if (read == 0)
> +	return SCM_EOF_VAL;
> +      else if (read < SCM_BYTEVECTOR_LENGTH (bv))
> +	return scm_c_shrink_bytevector (bv, read);
> +      else
> +	return bv;
>      }
> +  else
> +    {
> +      buf = scm_fill_input (port, 0, &cur, &avail);
> +      if (avail == 0)
> +	{
> +	  scm_port_buffer_set_has_eof_p (buf, SCM_BOOL_F);
> +	  return SCM_EOF_VAL;
> +	}
>  
> -  bv = scm_c_make_bytevector (avail);
> -  scm_port_buffer_take (buf, (scm_t_uint8 *) SCM_BYTEVECTOR_CONTENTS (bv),
> -                        avail, cur, avail);
> +      bv = scm_c_make_bytevector (avail);
> +      scm_port_buffer_take (buf, (scm_t_uint8 *) SCM_BYTEVECTOR_CONTENTS (bv),
> +			    avail, cur, avail);
> +    }
>  
>    return bv;
>  }

There are tabs in your code; would you mind doing only spaces?

A port being unbuffered doesn't mean that it has no bytes in its
buffer.  In particular, scm_unget_bytes may put bytes back into the
buffer.  Or, peek-u8 might fill this buffer with one byte.

Also, the port may have buffered write bytes (could be the port has
write buffering but no read buffering).  In that case (pt->rw_random)
you need to scm_flush().

I suggest taking the buffered bytes from the read buffer, if any.  Then
if the port is unbuffered, make a bytevector and call scm_i_read_bytes;
otherwise do the scm_fill_input path that's there already.

One more thing, if the port goes EOF, you need to
scm_port_buffer_set_has_eof_p.
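
To illustrate the first point, here is a toy model in plain C.  It is not
Guile's actual port structure; 'toy_port', 'toy_peek' and 'toy_get_some'
are hypothetical names.  It models an unbuffered port as a 1-byte buffer
in front of an in-memory device, and drains that buffer before reading
from the device, so that a byte fetched by an earlier peek is not lost:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model: SRC, SRC_LEN and SRC_POS stand for the underlying device,
   BUF for the 1-byte read buffer of an unbuffered port.  */
struct toy_port
{
  const char *src;
  size_t src_len, src_pos;
  unsigned char buf;
  int buf_full;
};

/* Like read(2) on the device: copy up to COUNT bytes into DST.  */
static size_t
toy_raw_read (struct toy_port *p, unsigned char *dst, size_t count)
{
  size_t left = p->src_len - p->src_pos;
  size_t n = count < left ? count : left;

  memcpy (dst, p->src + p->src_pos, n);
  p->src_pos += n;
  return n;
}

/* Like 'peek-u8': fill the 1-byte buffer if it is empty and return the
   byte without consuming it, or -1 upon EOF.  */
int
toy_peek (struct toy_port *p)
{
  if (!p->buf_full)
    {
      if (toy_raw_read (p, &p->buf, 1) == 0)
        return -1;
      p->buf_full = 1;
    }
  return p->buf;
}

/* Read up to COUNT bytes: take the buffered byte first, if any, and only
   then go to the device.  Return the number of bytes stored in DST.  */
size_t
toy_get_some (struct toy_port *p, unsigned char *dst, size_t count)
{
  size_t n = 0;

  assert (dst != NULL);
  if (count > 0 && p->buf_full)
    {
      dst[n++] = p->buf;
      p->buf_full = 0;
    }
  return n + toy_raw_read (p, dst + n, count - n);
}
```

The in-memory device never blocks, so this sketch says nothing about
flushing write buffers or about when a further read is allowed to block.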

Regards,

Andy






* bug#30066: 'get-bytevector-some' returns only 1 byte from unbuffered ports
  2018-01-10 16:32   ` Andy Wingo
@ 2018-01-10 16:58     ` Nala Ginrut
  2018-01-10 17:26       ` Andy Wingo
  2018-01-11 14:34     ` Ludovic Courtès
  1 sibling, 1 reply; 15+ messages in thread
From: Nala Ginrut @ 2018-01-10 16:58 UTC (permalink / raw)
  To: Andy Wingo; +Cc: Ludovic Courtès, 30066

hi Andy and Ludo!

What if developers enable suspendable ports and set the port to non-blocking?
For example, in a non-blocking asynchronous server, I register
read/write waiters for suspendable ports; they save a delimited
continuation and then yield the current task.
In this situation, get-bytevector-n! may yield several times through
the registered waiters while it reads, but from the caller's
perspective it eventually returns n bytes, no matter how many times
it yielded.
But what about get-bytevector-some? Should it block just once and
return the m bytes obtained by that first read?

Thanks!


>






* bug#30066: 'get-bytevector-some' returns only 1 byte from unbuffered ports
  2018-01-10 16:58     ` Nala Ginrut
@ 2018-01-10 17:26       ` Andy Wingo
  2018-01-10 17:43         ` Nala Ginrut
  0 siblings, 1 reply; 15+ messages in thread
From: Andy Wingo @ 2018-01-10 17:26 UTC (permalink / raw)
  To: Nala Ginrut; +Cc: Ludovic Courtès, 30066

On Wed 10 Jan 2018 17:58, Nala Ginrut <nalaginrut@gmail.com> writes:

> hi Andy and Ludo!
>
> What if developers enable suspendable ports and set the port to non-blocking?
> For example, in a non-blocking asynchronous server, I register
> read/write waiters for suspendable ports; they save a delimited
> continuation and then yield the current task.
> In this situation, get-bytevector-n! may yield several times through
> the registered waiters while it reads, but from the caller's
> perspective it eventually returns n bytes, no matter how many times
> it yielded.
> But what about get-bytevector-some? Should it block just once and
> return the m bytes obtained by that first read?

I think this is right.  At most one block.  FWIW we'd need to add
support for get-bytevector-some to (ice-9 suspendable-ports) to get this
to work.

Andy






* bug#30066: 'get-bytevector-some' returns only 1 byte from unbuffered ports
  2018-01-10 17:26       ` Andy Wingo
@ 2018-01-10 17:43         ` Nala Ginrut
  0 siblings, 0 replies; 15+ messages in thread
From: Nala Ginrut @ 2018-01-10 17:43 UTC (permalink / raw)
  To: Andy Wingo; +Cc: Ludovic Courtès, 30066

Ah, thanks for that work!







* bug#30066: 'get-bytevector-some' returns only 1 byte from unbuffered ports
  2018-01-10 16:32   ` Andy Wingo
  2018-01-10 16:58     ` Nala Ginrut
@ 2018-01-11 14:34     ` Ludovic Courtès
  2018-01-11 19:55       ` Mark H Weaver
  1 sibling, 1 reply; 15+ messages in thread
From: Ludovic Courtès @ 2018-01-11 14:34 UTC (permalink / raw)
  To: Andy Wingo; +Cc: 30066

[-- Attachment #1: Type: text/plain, Size: 904 bytes --]

Hello,

Andy Wingo <wingo@igalia.com> skribis:

> There are tabs in your code; would you mind doing only spaces?
>
> A port being unbuffered doesn't mean that it has no bytes in its
> buffer.  In particular, scm_unget_bytes may put bytes back into the
> buffer.  Or, peek-u8 might fill this buffer with one byte.
>
> Also, the port may have buffered write bytes (could be the port has
> write buffering but no read buffering).  In that case (pt->rw_random)
> you need to scm_flush().
>
> I suggest taking the buffered bytes from the read buffer, if any.  Then
> if the port is unbuffered, make a bytevector and call scm_i_read_bytes;
> otherwise do the scm_fill_input path that's there already.
>
> One more thing, if the port goes EOF, you need to
> scm_port_buffer_set_has_eof_p.

I think the attached patch addresses these issues.  WDYT?

Thanks for the review!

Ludo’.


[-- Attachment #2: the patch --]
[-- Type: text/x-patch, Size: 7736 bytes --]

From d3a60bac6c6aae62ced6eec21b3865caaab83bb8 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= <ludo@gnu.org>
Date: Thu, 11 Jan 2018 15:29:55 +0100
Subject: [PATCH] 'get-bytevector-some' reads as much as possible without
 blocking.

Fixes <https://bugs.gnu.org/30066>.

* libguile/ports.c (scm_i_read_bytes): Remove 'static' keyword.
* libguile/ports.h (SCM_UNBUFFEREDP): New macro.
(scm_i_read_bytes): New declaration.
* libguile/r6rs-ports.c (scm_get_bytevector_some): When PORT is
unbuffered, invoke 'scm_i_read_bytes' to read as much as we can.
* test-suite/tests/r6rs-ports.test ("8.2.8 Binary Input")
["get-bytevector-some [unbuffered port]"]
["get-bytevector-some [unbuffered port, lookahead-u8]"]
["get-bytevector-some [unbuffered port, unget-bytevector]"]: New tests.
---
 libguile/ports.c                 |  6 ++++--
 libguile/ports.h                 |  7 ++++++-
 libguile/r6rs-ports.c            | 34 ++++++++++++++++++++++++++++------
 test-suite/tests/r6rs-ports.test | 36 ++++++++++++++++++++++++++++++++++--
 4 files changed, 72 insertions(+), 11 deletions(-)

diff --git a/libguile/ports.c b/libguile/ports.c
index 72bb73a01..002dd1433 100644
--- a/libguile/ports.c
+++ b/libguile/ports.c
@@ -1,4 +1,4 @@
-/* Copyright (C) 1995-2001, 2003-2004, 2006-2017
+/* Copyright (C) 1995-2001, 2003-2004, 2006-2018
  * Free Software Foundation, Inc.
  *
  * This library is free software; you can redistribute it and/or
@@ -1543,7 +1543,9 @@ scm_peek_byte_or_eof (SCM port)
   return peek_byte_or_eof (port, &buf, &cur);
 }
 
-static size_t
+/* Like read(2), read *up to* COUNT bytes from PORT into DST, starting
+   at START.  Return 0 upon EOF.  */
+size_t
 scm_i_read_bytes (SCM port, SCM dst, size_t start, size_t count)
 {
   size_t filled;
diff --git a/libguile/ports.h b/libguile/ports.h
index d131db5be..3fe64c27d 100644
--- a/libguile/ports.h
+++ b/libguile/ports.h
@@ -4,7 +4,7 @@
 #define SCM_PORTS_H
 
 /* Copyright (C) 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2003, 2004,
- *   2006, 2008, 2009, 2010, 2011, 2012, 2013, 2014 Free Software Foundation, Inc.
+ *   2006, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2018 Free Software Foundation, Inc.
  *
  * This library is free software; you can redistribute it and/or
  * modify it under the terms of the GNU Lesser General Public License
@@ -71,7 +71,10 @@ SCM_INTERNAL SCM scm_i_port_weak_set;
 #define SCM_CLOSEDP(x) (!SCM_OPENP (x))
 #define SCM_CLR_PORT_OPEN_FLAG(p) \
   SCM_SET_CELL_WORD_0 ((p), SCM_CELL_WORD_0 (p) & ~SCM_OPN)
+
 #ifdef BUILDING_LIBGUILE
+#define SCM_UNBUFFEREDP(x)				\
+  (SCM_PORTP (x) && (SCM_CELL_WORD_0 (x) & SCM_BUF0))
 #define SCM_PORT_FINALIZING_P(x) \
   (SCM_CELL_WORD_0 (x) & SCM_F_PORT_FINALIZING)
 #define SCM_SET_PORT_FINALIZING(p) \
@@ -185,6 +188,8 @@ SCM_API int scm_get_byte_or_eof (SCM port);
 SCM_API int scm_peek_byte_or_eof (SCM port);
 SCM_API size_t scm_c_read (SCM port, void *buffer, size_t size);
 SCM_API size_t scm_c_read_bytes (SCM port, SCM dst, size_t start, size_t count);
+SCM_INTERNAL size_t scm_i_read_bytes (SCM port, SCM dst, size_t start,
+				      size_t count);
 SCM_API scm_t_wchar scm_getc (SCM port);
 SCM_API SCM scm_read_char (SCM port);
 
diff --git a/libguile/r6rs-ports.c b/libguile/r6rs-ports.c
index e944c7aab..a3d638ca0 100644
--- a/libguile/r6rs-ports.c
+++ b/libguile/r6rs-ports.c
@@ -1,4 +1,4 @@
-/* Copyright (C) 2009, 2010, 2011, 2013-2015 Free Software Foundation, Inc.
+/* Copyright (C) 2009, 2010, 2011, 2013-2015, 2018 Free Software Foundation, Inc.
  *
  * This library is free software; you can redistribute it and/or
  * modify it under the terms of the GNU Lesser General Public License
@@ -481,9 +481,9 @@ SCM_DEFINE (scm_get_bytevector_some, "get-bytevector-some", 1, 0, 0,
             "position to point just past these bytes.")
 #define FUNC_NAME s_scm_get_bytevector_some
 {
-  SCM buf;
+  SCM buf, bv;
   size_t cur, avail;
-  SCM bv;
+  const size_t max_buffer_size = 4096;
 
   SCM_VALIDATE_BINARY_INPUT_PORT (1, port);
 
@@ -494,9 +494,31 @@ SCM_DEFINE (scm_get_bytevector_some, "get-bytevector-some", 1, 0, 0,
       return SCM_EOF_VAL;
     }
 
-  bv = scm_c_make_bytevector (avail);
-  scm_port_buffer_take (buf, (scm_t_uint8 *) SCM_BYTEVECTOR_CONTENTS (bv),
-                        avail, cur, avail);
+  if (SCM_UNBUFFEREDP (port) && (avail < max_buffer_size))
+    {
+      /* PORT is unbuffered.  Read as much as possible from PORT.  */
+      size_t read;
+
+      bv = scm_c_make_bytevector (max_buffer_size);
+      scm_port_buffer_take (buf, (scm_t_uint8 *) SCM_BYTEVECTOR_CONTENTS (bv),
+                            avail, cur, avail);
+
+      read = scm_i_read_bytes (port, bv, avail,
+                               SCM_BYTEVECTOR_LENGTH (bv) - avail);
+
+      if (read == 0)
+        scm_port_buffer_set_has_eof_p (buf, SCM_BOOL_F);
+
+      if (read + avail < SCM_BYTEVECTOR_LENGTH (bv))
+        bv = scm_c_shrink_bytevector (bv, read + avail);
+    }
+  else
+    {
+      /* Return what's already buffered.  */
+      bv = scm_c_make_bytevector (avail);
+      scm_port_buffer_take (buf, (scm_t_uint8 *) SCM_BYTEVECTOR_CONTENTS (bv),
+                            avail, cur, avail);
+    }
 
   return bv;
 }
diff --git a/test-suite/tests/r6rs-ports.test b/test-suite/tests/r6rs-ports.test
index ba3131f2e..d5476f20e 100644
--- a/test-suite/tests/r6rs-ports.test
+++ b/test-suite/tests/r6rs-ports.test
@@ -1,6 +1,6 @@
 ;;;; r6rs-ports.test --- R6RS I/O port tests.   -*- coding: utf-8; -*-
 ;;;;
-;;;; Copyright (C) 2009-2012, 2013-2015 Free Software Foundation, Inc.
+;;;; Copyright (C) 2009-2012, 2013-2015, 2018 Free Software Foundation, Inc.
 ;;;; Ludovic Courtès
 ;;;;
 ;;;; This library is free software; you can redistribute it and/or
@@ -26,7 +26,8 @@
   #:use-module (rnrs io ports)
   #:use-module (rnrs io simple)
   #:use-module (rnrs exceptions)
-  #:use-module (rnrs bytevectors))
+  #:use-module (rnrs bytevectors)
+  #:use-module ((ice-9 binary-ports) #:select (unget-bytevector)))
 
 (define-syntax pass-if-condition
   (syntax-rules ()
@@ -183,6 +184,37 @@
            (equal? (bytevector->u8-list bv)
                    (map char->integer (string->list str))))))
 
+  (pass-if-equal "get-bytevector-some [unbuffered port]"
+      (string->utf8 "Hello, world!")
+    ;; 'get-bytevector-some' used to return a single byte, see
+    ;; <https://bugs.gnu.org/30066>.
+    (call-with-input-string "Hello, world!"
+      (lambda (port)
+        (setvbuf port _IONBF)
+        (get-bytevector-some port))))
+
+  (pass-if-equal "get-bytevector-some [unbuffered port, lookahead-u8]"
+      (string->utf8 "Hello, world!")
+    (call-with-input-string "Hello, world!"
+      (lambda (port)
+        (setvbuf port _IONBF)
+
+        ;; 'lookahead-u8' fills in PORT's 1-byte buffer.  Yet,
+        ;; 'get-bytevector-some' should return the whole thing.
+        (and (eqv? (lookahead-u8 port) (char->integer #\H))
+             (get-bytevector-some port)))))
+
+  (pass-if-equal "get-bytevector-some [unbuffered port, unget-bytevector]"
+      (string->utf8 "Hello")
+    (call-with-input-string "Hello, world!"
+      (lambda (port)
+        (setvbuf port _IONBF)
+        ;; 'unget-bytevector' fills the putback buffer, and
+        ;; 'get-bytevector-some' should get data from there.
+        (unget-bytevector port (get-bytevector-all port)
+                          0 5)
+        (get-bytevector-some port))))
+
   (pass-if "get-bytevector-all"
     (let* ((str   "GNU Guile")
            (index 0)
-- 
2.15.1



* bug#30066: 'get-bytevector-some' returns only 1 byte from unbuffered ports
  2018-01-11 14:34     ` Ludovic Courtès
@ 2018-01-11 19:55       ` Mark H Weaver
  2018-01-11 21:02         ` Ludovic Courtès
  0 siblings, 1 reply; 15+ messages in thread
From: Mark H Weaver @ 2018-01-11 19:55 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: Andy Wingo, 30066

Hi Ludovic,

ludo@gnu.org (Ludovic Courtès) writes:

> Andy Wingo <wingo@igalia.com> skribis:
>
>> I suggest taking the buffered bytes from the read buffer, if any.  Then
>> if the port is unbuffered, make a bytevector and call scm_i_read_bytes;
>> otherwise do the scm_fill_input path that's there already.
>>
>> One more thing, if the port goes EOF, you need to
>> scm_port_buffer_set_has_eof_p.
>
> I think the attached patch addresses these issues.  WDYT?

[...]

> diff --git a/libguile/r6rs-ports.c b/libguile/r6rs-ports.c
> index e944c7aab..a3d638ca0 100644
> --- a/libguile/r6rs-ports.c
> +++ b/libguile/r6rs-ports.c
> @@ -1,4 +1,4 @@
> -/* Copyright (C) 2009, 2010, 2011, 2013-2015 Free Software Foundation, Inc.
> +/* Copyright (C) 2009, 2010, 2011, 2013-2015, 2018 Free Software Foundation, Inc.
>   *
>   * This library is free software; you can redistribute it and/or
>   * modify it under the terms of the GNU Lesser General Public License
> @@ -481,9 +481,9 @@ SCM_DEFINE (scm_get_bytevector_some, "get-bytevector-some", 1, 0, 0,
>              "position to point just past these bytes.")
>  #define FUNC_NAME s_scm_get_bytevector_some
>  {
> -  SCM buf;
> +  SCM buf, bv;
>    size_t cur, avail;
> -  SCM bv;
> +  const size_t max_buffer_size = 4096;
>  
>    SCM_VALIDATE_BINARY_INPUT_PORT (1, port);
>  
> @@ -494,9 +494,31 @@ SCM_DEFINE (scm_get_bytevector_some, "get-bytevector-some", 1, 0, 0,
>        return SCM_EOF_VAL;
>      }
>  
> -  bv = scm_c_make_bytevector (avail);
> -  scm_port_buffer_take (buf, (scm_t_uint8 *) SCM_BYTEVECTOR_CONTENTS (bv),
> -                        avail, cur, avail);
> +  if (SCM_UNBUFFEREDP (port) && (avail < max_buffer_size))
> +    {
> +      /* PORT is unbuffered.  Read as much as possible from PORT.  */
> +      size_t read;
> +
> +      bv = scm_c_make_bytevector (max_buffer_size);
> +      scm_port_buffer_take (buf, (scm_t_uint8 *) SCM_BYTEVECTOR_CONTENTS (bv),
> +                            avail, cur, avail);
> +
> +      read = scm_i_read_bytes (port, bv, avail,
> +                               SCM_BYTEVECTOR_LENGTH (bv) - avail);

Here's the R6RS specification for 'get-bytevector-some':

  "Reads from BINARY-INPUT-PORT, blocking as necessary, until bytes are
   available from BINARY-INPUT-PORT or until an end of file is reached.
   If bytes become available, 'get-bytevector-some' returns a freshly
   allocated bytevector containing the initial available bytes (at least
   one), and it updates BINARY-INPUT-PORT to point just past these
   bytes.  If no input bytes are seen before an end of file is reached,
   the end-of-file object is returned."

By my reading of this, we should block only if necessary to ensure that
we return at least one byte (or EOF).  In other words, if we can return
at least one byte (or EOF), then we must not block, which means that we
must not initiate another 'read'.
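
As a rough sketch of that rule in plain C (a toy model with made-up
names, not libguile code): if anything is already buffered, return it
without consulting the device; only when the buffer is empty issue one
read, which may block.  The 'reads' counter exists only to make the
number of device reads observable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy port: BUF[0..BUF_LEN) holds bytes that were read ahead or put
   back; SRC is the pending device data (NUL-terminated); READS counts
   how many times the possibly blocking device was consulted.  */
struct sketch_port
{
  unsigned char buf[16];
  size_t buf_len;
  const char *src;
  unsigned reads;
};

/* Stand-in for a blocking read(2) on the device.  */
static size_t
sketch_device_read (struct sketch_port *p, unsigned char *dst, size_t count)
{
  size_t left = strlen (p->src);
  size_t n = count < left ? count : left;

  p->reads++;
  memcpy (dst, p->src, n);
  p->src += n;
  return n;
}

/* Store up to COUNT bytes in DST and return how many were stored;
   0 means EOF.  */
size_t
sketch_get_some (struct sketch_port *p, unsigned char *dst, size_t count)
{
  assert (dst != NULL && count > 0);

  if (p->buf_len > 0)
    {
      /* Something can be returned right away: do not touch the device.  */
      size_t n = count < p->buf_len ? count : p->buf_len;

      memcpy (dst, p->buf, n);
      memmove (p->buf, p->buf + n, p->buf_len - n);
      p->buf_len -= n;
      return n;
    }

  /* Nothing buffered: exactly one device read, which may block.  */
  return sketch_device_read (p, dst, count);
}
```

With one byte buffered and four more pending on the device, the first
call returns just the buffered byte and leaves 'reads' at 0; the second
call performs the single device read.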

Out of curiosity, is there a reason why you're using an unbuffered port
in your use case?

       Mark






* bug#30066: 'get-bytevector-some' returns only 1 byte from unbuffered ports
  2018-01-11 19:55       ` Mark H Weaver
@ 2018-01-11 21:02         ` Ludovic Courtès
  2018-01-11 21:55           ` Mark H Weaver
  0 siblings, 1 reply; 15+ messages in thread
From: Ludovic Courtès @ 2018-01-11 21:02 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: Andy Wingo, 30066

Hello,

Mark H Weaver <mhw@netris.org> skribis:

> ludo@gnu.org (Ludovic Courtès) writes:

[...]

>> +  if (SCM_UNBUFFEREDP (port) && (avail < max_buffer_size))
>> +    {
>> +      /* PORT is unbuffered.  Read as much as possible from PORT.  */
>> +      size_t read;
>> +
>> +      bv = scm_c_make_bytevector (max_buffer_size);
>> +      scm_port_buffer_take (buf, (scm_t_uint8 *) SCM_BYTEVECTOR_CONTENTS (bv),
>> +                            avail, cur, avail);
>> +
>> +      read = scm_i_read_bytes (port, bv, avail,
>> +                               SCM_BYTEVECTOR_LENGTH (bv) - avail);
>
> Here's the R6RS specification for 'get-bytevector-some':
>
>   "Reads from BINARY-INPUT-PORT, blocking as necessary, until bytes are
>    available from BINARY-INPUT-PORT or until an end of file is reached.
>    If bytes become available, 'get-bytevector-some' returns a freshly
>    allocated bytevector containing the initial available bytes (at least
>    one), and it updates BINARY-INPUT-PORT to point just past these
>    bytes.  If no input bytes are seen before an end of file is reached,
>    the end-of-file object is returned."
>
> By my reading of this, we should block only if necessary to ensure that
> we return at least one byte (or EOF).  In other words, if we can return
> at least one byte (or EOF), then we must not block, which means that we
> must not initiate another 'read'.

Indeed.  So perhaps the condition above should be changed to:

  if (SCM_UNBUFFEREDP (port) && (avail == 0))

?

> Out of curiosity, is there a reason why you're using an unbuffered port
> in your use case?

It’s to implement redirect à la socat:

  https://git.savannah.gnu.org/cgit/guix.git/commit/?id=17af5d51de7c40756a4a39d336f81681de2ba447

Thanks,
Ludo’.






* bug#30066: 'get-bytevector-some' returns only 1 byte from unbuffered ports
  2018-01-11 21:02         ` Ludovic Courtès
@ 2018-01-11 21:55           ` Mark H Weaver
  2018-01-12  9:01             ` Andy Wingo
  0 siblings, 1 reply; 15+ messages in thread
From: Mark H Weaver @ 2018-01-11 21:55 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: Andy Wingo, 30066

ludo@gnu.org (Ludovic Courtès) writes:

> Mark H Weaver <mhw@netris.org> skribis:
>
>> ludo@gnu.org (Ludovic Courtès) writes:
>
> [...]
>
>>> +  if (SCM_UNBUFFEREDP (port) && (avail < max_buffer_size))
>>> +    {
>>> +      /* PORT is unbuffered.  Read as much as possible from PORT.  */
>>> +      size_t read;
>>> +
>>> +      bv = scm_c_make_bytevector (max_buffer_size);
>>> +      scm_port_buffer_take (buf, (scm_t_uint8 *) SCM_BYTEVECTOR_CONTENTS (bv),
>>> +                            avail, cur, avail);
>>> +
>>> +      read = scm_i_read_bytes (port, bv, avail,
>>> +                               SCM_BYTEVECTOR_LENGTH (bv) - avail);
>>
>> Here's the R6RS specification for 'get-bytevector-some':
>>
>>   "Reads from BINARY-INPUT-PORT, blocking as necessary, until bytes are
>>    available from BINARY-INPUT-PORT or until an end of file is reached.
>>    If bytes become available, 'get-bytevector-some' returns a freshly
>>    allocated bytevector containing the initial available bytes (at least
>>    one), and it updates BINARY-INPUT-PORT to point just past these
>>    bytes.  If no input bytes are seen before an end of file is reached,
>>    the end-of-file object is returned."
>>
>> By my reading of this, we should block only if necessary to ensure that
>> we return at least one byte (or EOF).  In other words, if we can return
>> at least one byte (or EOF), then we must not block, which means that we
>> must not initiate another 'read'.
>
> Indeed.  So perhaps the condition above should be changed to:
>
>   if (SCM_UNBUFFEREDP (port) && (avail == 0))
>
> ?

That won't work, because the earlier call to 'scm_fill_input' will have
already initiated a 'read' if the buffer was empty.  The read buffer
size will determine the maximum number of bytes read, which will be 1 in
the case of an unbuffered port.  So, at the point of this condition,
'avail == 0' will occur only if EOF was encountered, in which case you
must return EOF without attempting another 'read'.

In order to avoid unnecessary blocking, there must be only one 'read'
call, and it must be initiated only if the buffer was already empty.

So, in order to accomplish your goal here, I don't see how you can use
'scm_fill_input', unless you temporarily increase the size of the read
buffer beforehand.

Instead, I think you need to first check if the read buffer contains any
bytes.  If so, empty the buffer and return them.  If the buffer is
empty, the next thing to check is 'scm_port_buffer_has_eof_p'.  If it's
set, then clear that flag and return EOF.

Otherwise, if the buffer is empty and 'scm_port_buffer_has_eof_p' is
false, then you must do what 'scm_fill_input' would have done, except
using your larger buffer instead of the port's internal read buffer.  In
particular, you must first switch the port to "reading" mode, flushing
the write buffer if 'rw_random' is set.
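
The three cases above can be modelled in a few lines of C.  This is only
a sketch under stated assumptions: 'struct toy_port' and 'toy_get_some'
are hypothetical names, not Guile internals; the port is treated as
read-only, so the switch to "reading" mode and the write-buffer flush
are left out; and a plain read(2) stands in for the port's read method.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for a port; NOT Guile's real port structure.  */
struct toy_port
{
  int fd;                 /* underlying file descriptor */
  unsigned char buf[1];   /* 1-byte read buffer, as for an unbuffered port */
  size_t avail;           /* bytes currently held in buf (0 or 1) */
  bool has_eof;           /* an earlier read already saw EOF */
};

/* Return the number of bytes stored in OUT, 0 for EOF, or -1 on error.
   At most one read(2) is issued, and only when nothing is buffered.  */
static ssize_t
toy_get_some (struct toy_port *port, unsigned char *out, size_t out_size)
{
  assert (out_size >= sizeof port->buf);

  if (port->avail > 0)
    {
      /* Case 1: bytes are already buffered; return them without reading.  */
      size_t n = port->avail;
      memcpy (out, port->buf, n);
      port->avail = 0;
      return (ssize_t) n;
    }

  if (port->has_eof)
    {
      /* Case 2: a pending EOF; clear the flag and report it.  */
      port->has_eof = false;
      return 0;
    }

  /* Case 3: a single read, straight into the caller's larger buffer.  */
  return read (port->fd, out, out_size);
}
```

With a pipe holding "foo" and an empty toy port, one call would return
all three bytes, whereas filling the 1-byte buffer first would yield
only one.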

Also, I'd prefer to move this code to ports.c in order to avoid adding
more internal declarations to ports.h and changing more functions from
'static' to global functions.

>> Out of curiosity, is there a reason why you're using an unbuffered port
>> in your use case?
>
> It’s to implement redirect à la socat:
>
>   https://git.savannah.gnu.org/cgit/guix.git/commit/?id=17af5d51de7c40756a4a39d336f81681de2ba447

Why is an unbuffered port being used here?  Can we change it to a
buffered port?

      Mark






* bug#30066: 'get-bytevector-some' returns only 1 byte from unbuffered ports
  2018-01-11 21:55           ` Mark H Weaver
@ 2018-01-12  9:01             ` Andy Wingo
  2018-01-12 10:15               ` Ludovic Courtès
  0 siblings, 1 reply; 15+ messages in thread
From: Andy Wingo @ 2018-01-12  9:01 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: Ludovic Courtès, 30066

On Thu 11 Jan 2018 22:55, Mark H Weaver <mhw@netris.org> writes:

> ludo@gnu.org (Ludovic Courtès) writes:
>
>> Mark H Weaver <mhw@netris.org> skribis:
>>
>>> ludo@gnu.org (Ludovic Courtès) writes:
>>
>> [...]
>>
>>>> +  if (SCM_UNBUFFEREDP (port) && (avail < max_buffer_size))
>>>> +    {
>>>> +      /* PORT is unbuffered.  Read as much as possible from PORT.  */
>>>> +      size_t read;
>>>> +
>>>> +      bv = scm_c_make_bytevector (max_buffer_size);
>>>> +      scm_port_buffer_take (buf, (scm_t_uint8 *) SCM_BYTEVECTOR_CONTENTS (bv),
>>>> +                            avail, cur, avail);
>>>> +
>>>> +      read = scm_i_read_bytes (port, bv, avail,
>>>> +                               SCM_BYTEVECTOR_LENGTH (bv) - avail);
>>>
>>> Here's the R6RS specification for 'get-bytevector-some':
>>>
>>>   "Reads from BINARY-INPUT-PORT, blocking as necessary, until bytes are
>>>    available from BINARY-INPUT-PORT or until an end of file is reached.
>>>    If bytes become available, 'get-bytevector-some' returns a freshly
>>>    allocated bytevector containing the initial available bytes (at least
>>>    one), and it updates BINARY-INPUT-PORT to point just past these
>>>    bytes.  If no input bytes are seen before an end of file is reached,
>>>    the end-of-file object is returned."
>>>
>>> By my reading of this, we should block only if necessary to ensure that
>>> we return at least one byte (or EOF).  In other words, if we can return
>>> at least one byte (or EOF), then we must not block, which means that we
>>> must not initiate another 'read'.
>>
>> Indeed.  So perhaps the condition above should be changed to:
>>
>>   if (SCM_UNBUFFEREDP (port) && (avail == 0))
>>
>> ?
>
> That won't work, because the earlier call to 'scm_fill_input' will have
> already initiated a 'read' if the buffer was empty.  The read buffer
> size will determine the maximum number of bytes read, which will be 1 in
> the case of an unbuffered port.  So, at the point of this condition,
> 'avail == 0' will occur only if EOF was encountered, in which case you
> must return EOF without attempting another 'read'.
>
> In order to avoid unnecessary blocking, there must be only one 'read'
> call, and it must be initiated only if the buffer was already empty.
>
> So, in order to accomplish your goal here, I don't see how you can use
> 'scm_fill_input', unless you temporarily increase the size of the read
> buffer beforehand.
>
> Instead, I think you need to first check if the read buffer contains any
> bytes.  If so, empty the buffer and return them.  If the buffer is
> empty, the next thing to check is 'scm_port_buffer_has_eof_p'.  If it's
> set, then clear that flag and return EOF.
>
> Otherwise, if the buffer is empty and 'scm_port_buffer_has_eof_p' is
> false, then you must do what 'scm_fill_input' would have done, except
> using your larger buffer instead of the port's internal read buffer.  In
> particular, you must first switch the port to "reading" mode, flushing
> the write buffer if 'rw_random' is set.
>
> Also, I'd prefer to move this code to ports.c in order to avoid adding
> more internal declarations to ports.h and changing more functions from
> 'static' to global functions.

I agree with Mark here -- thanks for the close review.

>>> Out of curiosity, is there a reason why you're using an unbuffered port
>>> in your use case?
>>
>> It’s to implement redirect à la socat:
>>
>>   https://git.savannah.gnu.org/cgit/guix.git/commit/?id=17af5d51de7c40756a4a39d336f81681de2ba447
>
> Why is an unbuffered port being used here?  Can we change it to a
> buffered port?

This was also a question I had!  If you make it a buffered port at 4096
bytes (for example), then get-bytevector-some works exactly like you
want it to, no?

Andy






* bug#30066: 'get-bytevector-some' returns only 1 byte from unbuffered ports
  2018-01-12  9:01             ` Andy Wingo
@ 2018-01-12 10:15               ` Ludovic Courtès
  2018-01-12 10:33                 ` Andy Wingo
  0 siblings, 1 reply; 15+ messages in thread
From: Ludovic Courtès @ 2018-01-12 10:15 UTC (permalink / raw)
  To: Andy Wingo; +Cc: 30066

Andy Wingo <wingo@igalia.com> skribis:

> On Thu 11 Jan 2018 22:55, Mark H Weaver <mhw@netris.org> writes:

[...]

>>>> Out of curiosity, is there a reason why you're using an unbuffered port
>>>> in your use case?
>>>
>>> It’s to implement redirect à la socat:
>>>
>>>   https://git.savannah.gnu.org/cgit/guix.git/commit/?id=17af5d51de7c40756a4a39d336f81681de2ba447
>>
>> Why is an unbuffered port being used here?  Can we change it to a
>> buffered port?
>
> This was also a question I had!  If you make it a buffered port at 4096
> bytes (for example), then get-bytevector-some works exactly like you
> want it to, no?

It might work, but that’s more by chance no?

I mean, if we declare the port as buffered, then we give the I/O
routines the “right” to fill in that buffer.

WDYT?

Thanks,
Ludo’.






* bug#30066: 'get-bytevector-some' returns only 1 byte from unbuffered ports
  2018-01-12 10:15               ` Ludovic Courtès
@ 2018-01-12 10:33                 ` Andy Wingo
  2018-01-13 20:53                   ` Ludovic Courtès
  0 siblings, 1 reply; 15+ messages in thread
From: Andy Wingo @ 2018-01-12 10:33 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 30066

On Fri 12 Jan 2018 11:15, ludo@gnu.org (Ludovic Courtès) writes:

> Andy Wingo <wingo@igalia.com> skribis:
>
>> On Thu 11 Jan 2018 22:55, Mark H Weaver <mhw@netris.org> writes:
>
> [...]
>
>>>>> Out of curiosity, is there a reason why you're using an unbuffered port
>>>>> in your use case?
>>>>
>>>> It’s to implement redirect à la socat:
>>>>
>>>>   https://git.savannah.gnu.org/cgit/guix.git/commit/?id=17af5d51de7c40756a4a39d336f81681de2ba447
>>>
>>> Why is an unbuffered port being used here?  Can we change it to a
>>> buffered port?
>>
>> This was also a question I had!  If you make it a buffered port at 4096
>> bytes (for example), then get-bytevector-some works exactly like you
>> want it to, no?
>
> It might work, but that’s more by chance no?

No, it is reliable.  get-bytevector-some on a buffered port must either
return all the buffered bytes or perform exactly one read (up to the
buffer size) and either return those bytes or EOF.  As far as I
understand, that is exactly what you want.

Using buffered ports has two additional advantages: you get to specify
the read size, and returned bytevectors can be allocated to precisely
the right size (no need to overallocate then truncate).
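
The behaviour described above can be sketched in C.  This is a toy
model, not Guile's implementation: 'struct buf_port' and
'buf_port_get_some' are hypothetical names, read(2) stands in for the
port's read method, and the caller picks the buffer capacity (the 4096
mentioned earlier would be one choice).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical buffered port; NOT Guile's real port structure.  */
struct buf_port
{
  int fd;               /* underlying file descriptor */
  unsigned char *buf;   /* read buffer of CAPACITY bytes */
  size_t capacity;      /* read size chosen by the caller */
  size_t avail;         /* bytes currently buffered, starting at buf[0] */
};

/* Store a freshly allocated, exactly sized copy of the available bytes
   in *RESULT and return its length.  Return 0 on EOF and -1 on error,
   leaving *RESULT set to NULL in both cases.  */
static ssize_t
buf_port_get_some (struct buf_port *port, unsigned char **result)
{
  assert (port->buf != NULL && port->capacity > 0);
  *result = NULL;

  if (port->avail == 0)
    {
      /* Nothing buffered: exactly one read, bounded by the buffer size.  */
      ssize_t n = read (port->fd, port->buf, port->capacity);
      if (n <= 0)
        return n;
      port->avail = (size_t) n;
    }

  /* Return everything buffered in an allocation of exactly that size.  */
  size_t len = port->avail;
  *result = malloc (len);
  if (*result == NULL)
    return -1;
  memcpy (*result, port->buf, len);
  port->avail = 0;
  return (ssize_t) len;
}
```

Because the read lands in the port's own buffer first, the result can
be allocated at its final length, which is the second advantage listed
above.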

Andy






* bug#30066: 'get-bytevector-some' returns only 1 byte from unbuffered ports
  2018-01-12 10:33                 ` Andy Wingo
@ 2018-01-13 20:53                   ` Ludovic Courtès
  2018-02-16 13:19                     ` Ludovic Courtès
  0 siblings, 1 reply; 15+ messages in thread
From: Ludovic Courtès @ 2018-01-13 20:53 UTC (permalink / raw)
  To: Andy Wingo; +Cc: 30066

Hey,

Andy Wingo <wingo@igalia.com> skribis:

> On Fri 12 Jan 2018 11:15, ludo@gnu.org (Ludovic Courtès) writes:
>
>> Andy Wingo <wingo@igalia.com> skribis:
>>
>>> On Thu 11 Jan 2018 22:55, Mark H Weaver <mhw@netris.org> writes:
>>
>> [...]
>>
>>>>>> Out of curiosity, is there a reason why you're using an unbuffered port
>>>>>> in your use case?
>>>>>
>>>>> It’s to implement redirect à la socat:
>>>>>
>>>>>   https://git.savannah.gnu.org/cgit/guix.git/commit/?id=17af5d51de7c40756a4a39d336f81681de2ba447
>>>>
>>>> Why is an unbuffered port being used here?  Can we change it to a
>>>> buffered port?
>>>
>>> This was also a question I had!  If you make it a buffered port at 4096
>>> bytes (for example), then get-bytevector-some works exactly like you
>>> want it to, no?
>>
>> It might work, but that’s more by chance no?
>
> No, it is reliable.  get-bytevector-some on a buffered port must either
> return all the buffered bytes or perform exactly one read (up to the
> buffer size) and either return those bytes or EOF.  As far as I
> understand, that is exactly what you want.

Indeed, that works well, thanks!  So, after all, problem solved?

I think the confusion for me comes from the fact that we don’t have a
FILE*/fd distinction like in C.  It’s as if we were always using FILE*
in the sense that I’m never sure what’s going to happen or whether a
particular behavior can be relied on.

Thank you,
Ludo’.






* bug#30066: 'get-bytevector-some' returns only 1 byte from unbuffered ports
  2018-01-13 20:53                   ` Ludovic Courtès
@ 2018-02-16 13:19                     ` Ludovic Courtès
  0 siblings, 0 replies; 15+ messages in thread
From: Ludovic Courtès @ 2018-02-16 13:19 UTC (permalink / raw)
  To: Andy Wingo; +Cc: 30066

ludo@gnu.org (Ludovic Courtès) skribis:

> Andy Wingo <wingo@igalia.com> skribis:
>
>> On Fri 12 Jan 2018 11:15, ludo@gnu.org (Ludovic Courtès) writes:
>>
>>> Andy Wingo <wingo@igalia.com> skribis:
>>>
>>>> On Thu 11 Jan 2018 22:55, Mark H Weaver <mhw@netris.org> writes:
>>>
>>> [...]
>>>
>>>>>>> Out of curiosity, is there a reason why you're using an unbuffered port
>>>>>>> in your use case?
>>>>>>
>>>>>> It’s to implement redirect à la socat:
>>>>>>
>>>>>>   https://git.savannah.gnu.org/cgit/guix.git/commit/?id=17af5d51de7c40756a4a39d336f81681de2ba447
>>>>>
>>>>> Why is an unbuffered port being used here?  Can we change it to a
>>>>> buffered port?
>>>>
>>>> This was also a question I had!  If you make it a buffered port at 4096
>>>> bytes (for example), then get-bytevector-some works exactly like you
>>>> want it to, no?
>>>
>>> It might work, but that’s more by chance no?
>>
>> No, it is reliable.  get-bytevector-some on a buffered port must either
>> return all the buffered bytes or perform exactly one read (up to the
>> buffer size) and either return those bytes or EOF.  As far as I
>> understand, that is exactly what you want.
>
> Indeed, that works well, thanks!  So, after all, problem solved?

I’m closing this as not-a-bug.

Ludo’.






Thread overview: 15+ messages
2018-01-10 15:02 bug#30066: 'get-bytevector-some' returns only 1 byte from unbuffered ports Ludovic Courtès
2018-01-10 15:59 ` Ludovic Courtès
2018-01-10 16:32   ` Andy Wingo
2018-01-10 16:58     ` Nala Ginrut
2018-01-10 17:26       ` Andy Wingo
2018-01-10 17:43         ` Nala Ginrut
2018-01-11 14:34     ` Ludovic Courtès
2018-01-11 19:55       ` Mark H Weaver
2018-01-11 21:02         ` Ludovic Courtès
2018-01-11 21:55           ` Mark H Weaver
2018-01-12  9:01             ` Andy Wingo
2018-01-12 10:15               ` Ludovic Courtès
2018-01-12 10:33                 ` Andy Wingo
2018-01-13 20:53                   ` Ludovic Courtès
2018-02-16 13:19                     ` Ludovic Courtès
