* [PATCH] Make get-bytevector-all suspendable.
@ 2023-07-20 14:01 Christopher Baines
2024-06-01 9:11 ` Ludovic Courtès
0 siblings, 1 reply; 6+ messages in thread
From: Christopher Baines @ 2023-07-20 14:01 UTC (permalink / raw)
To: guile-devel
I'm looking at this since it's used in (web response)
read-response-body.
* module/ice-9/suspendable-ports.scm (get-bytevector-all): New
procedure.
(port-bindings): Add it.
---
module/ice-9/suspendable-ports.scm | 36 ++++++++++++++++++++++++++++++
1 file changed, 36 insertions(+)
diff --git a/module/ice-9/suspendable-ports.scm b/module/ice-9/suspendable-ports.scm
index 9fac1df62..e5b3de982 100644
--- a/module/ice-9/suspendable-ports.scm
+++ b/module/ice-9/suspendable-ports.scm
@@ -50,6 +50,7 @@
(define-module (ice-9 suspendable-ports)
#:use-module (rnrs bytevectors)
+ #:use-module (rnrs bytevectors gnu)
#:use-module (ice-9 ports internal)
#:use-module (ice-9 match)
#:export (current-read-waiter
@@ -342,6 +343,40 @@
(set-port-buffer-cur! buf (+ cur transfer-size))
transfer-size))))))
+(define (get-bytevector-all port)
+ (define %initial-length 4096)
+
+ (let read-loop ((total 0)
+ (result-length %initial-length)
+ (result (make-bytevector %initial-length)))
+ (call-with-values (lambda () (fill-input port 1 'binary))
+ (lambda (buf cur buffered)
+ (if (zero? buffered)
+ (begin
+ (set-port-buffer-has-eof?! buf #f)
+ (if (= total 0)
+ the-eof-object
+ (bytevector-slice result 0 total)))
+ (let* ((new-total (+ total buffered))
+ (new-result-length
+ (let loop ((new-result-length result-length))
+ (if (< new-total new-result-length)
+ new-result-length
+ (loop (* 2 new-result-length)))))
+ (new-result
+ (if (= new-result-length result-length)
+ result
+ (let ((new-result (make-bytevector new-result-length)))
+ (bytevector-copy! result 0 new-result 0 total)
+ new-result))))
+ (bytevector-copy! (port-buffer-bytevector buf) cur
+ new-result total
+ buffered)
+ (set-port-buffer-cur! buf (+ cur buffered))
+ (read-loop new-total
+ new-result-length
+ new-result)))))))
+
(define (put-u8 port byte)
(let* ((buf (port-write-buffer port))
(bv (port-buffer-bytevector buf))
@@ -754,6 +789,7 @@
((ice-9 binary-ports)
get-u8 lookahead-u8 get-bytevector-n get-bytevector-n!
get-bytevector-some get-bytevector-some!
+ get-bytevector-all
put-u8 put-bytevector)
((ice-9 textual-ports)
put-char put-string)
--
2.41.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] Make get-bytevector-all suspendable.
2023-07-20 14:01 [PATCH] Make get-bytevector-all suspendable Christopher Baines
@ 2024-06-01 9:11 ` Ludovic Courtès
2024-06-01 9:40 ` Maxime Devos
2024-06-01 9:44 ` Maxime Devos
0 siblings, 2 replies; 6+ messages in thread
From: Ludovic Courtès @ 2024-06-01 9:11 UTC (permalink / raw)
To: Christopher Baines; +Cc: guile-devel
[-- Attachment #1: Type: text/plain, Size: 718 bytes --]
Hi Chris,
Christopher Baines <mail@cbaines.net> skribis:
> I'm looking at this since it's used in (web response)
> read-response-body.
>
> * module/ice-9/suspendable-ports.scm (get-bytevector-all): New
> procedure.
> (port-bindings): Add it.
Given that ‘get-bytevector-n!’ already has a variant in
suspendable-ports.scm, my preference would be to rewrite
‘get-bytevector-all’ in Scheme (patch attached). That way, it would
naturally be suspendable. (It’s also in line with the general strategy
of moving things to Scheme.)
I don’t expect significant performance difference compared to the C
implementation since that is dominated by allocations and I/O.
Thoughts?
Ludo’.
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Type: text/x-patch, Size: 4277 bytes --]
diff --git a/libguile/r6rs-ports.c b/libguile/r6rs-ports.c
index 7c51bf617..ffa1e1b2b 100644
--- a/libguile/r6rs-ports.c
+++ b/libguile/r6rs-ports.c
@@ -1,4 +1,4 @@
-/* Copyright 2009-2011,2013-2015,2018-2019,2023
+/* Copyright 2009-2011,2013-2015,2018-2019,2023,2024
Free Software Foundation, Inc.
This file is part of Guile.
@@ -393,58 +393,23 @@ SCM_DEFINE (scm_get_bytevector_some_x, "get-bytevector-some!", 4, 0, 0,
}
#undef FUNC_NAME
-SCM_DEFINE (scm_get_bytevector_all, "get-bytevector-all", 1, 0, 0,
- (SCM port),
- "Read from @var{port}, blocking as necessary, until "
- "the end-of-file is reached. Return either "
- "a new bytevector containing the data read or the "
- "end-of-file object (if no data were available).")
-#define FUNC_NAME s_scm_get_bytevector_all
-{
- SCM result;
- size_t c_len, c_count;
- size_t c_read, c_total;
-
- SCM_VALIDATE_BINARY_INPUT_PORT (1, port);
+static SCM get_bytevector_all_var;
- c_len = c_count = 4096;
- result = scm_c_make_bytevector (c_count);
- c_total = c_read = 0;
-
- do
- {
- if (c_read > c_len - c_total)
- {
- /* Grow the bytevector. */
- SCM prev = result;
-
- if (INT_ADD_OVERFLOW (c_len, c_len))
- scm_num_overflow (FUNC_NAME);
-
- result = scm_c_make_bytevector (c_len * 2);
- memcpy (SCM_BYTEVECTOR_CONTENTS (result),
- SCM_BYTEVECTOR_CONTENTS (prev),
- c_total);
- c_count = c_len;
- c_len *= 2;
- }
-
- /* `scm_c_read ()' blocks until C_COUNT bytes are available or EOF is
- reached. */
- c_read = scm_c_read_bytes (port, result, c_total, c_count);
- c_total += c_read, c_count -= c_read;
- }
- while (c_count == 0);
-
- if (c_total == 0)
- return SCM_EOF_VAL;
+static void
+init_bytevector_io_vars (void)
+{
+ get_bytevector_all_var =
+ scm_c_public_lookup ("ice-9 binary-port", "get-bytevector-all");
+}
- if (c_len > c_total)
- return scm_c_shrink_bytevector (result, c_total);
+SCM
+scm_get_bytevector_all (SCM port)
+{
+ static scm_i_pthread_once_t once = SCM_I_PTHREAD_ONCE_INIT;
+ scm_i_pthread_once (&once, init_bytevector_io_vars);
- return result;
+ return scm_call_1 (scm_variable_ref (get_bytevector_all_var), port);
}
-#undef FUNC_NAME
\f
diff --git a/module/ice-9/binary-ports.scm b/module/ice-9/binary-ports.scm
index b7eddc93d..864d9ef9a 100644
--- a/module/ice-9/binary-ports.scm
+++ b/module/ice-9/binary-ports.scm
@@ -1,5 +1,5 @@
;;; binary-ports.scm --- Binary IO on ports
-;;; Copyright (C) 2009-2011,2013,2016,2019,2021,2023 Free Software Foundation, Inc.
+;;; Copyright (C) 2009-2011,2013,2016,2019,2021,2023,2024 Free Software Foundation, Inc.
;;;
;;; This library is free software: you can redistribute it and/or modify
;;; it under the terms of the GNU Lesser General Public License as
@@ -27,6 +27,7 @@
(define-module (ice-9 binary-ports)
#:use-module (rnrs bytevectors)
+ #:autoload (rnrs bytevectors gnu) (bytevector-slice)
#:use-module (ice-9 match)
#:use-module (ice-9 custom-ports)
#:export (eof-object
@@ -180,3 +181,29 @@ bytevector composed of the bytes written into the port is returned."
;; FIXME: Instead default to current encoding, if
;; someone reads text from this port.
#:encoding 'ISO-8859-1 #:conversion-strategy 'error))
+
+\f
+;;;
+;;; Binary input.
+;;;
+
+(define (get-bytevector-all port)
+ "Read from @var{port}, blocking as necessary, until
+the end-of-file is reached. Return either a new bytevector containing
+the data read or the end-of-file object (if no data were available)."
+ (define initial-capacity 4096)
+
+ (let loop ((bv (make-bytevector initial-capacity))
+ (capacity initial-capacity)
+ (size 0))
+ (match (get-bytevector-n! port bv size (- capacity size))
+ ((? eof-object?)
+ (bytevector-slice bv 0 size))
+ (read
+ (let ((size (+ read size)))
+ (if (= capacity size)
+ (let* ((capacity (* capacity 2))
+ (new (make-bytevector capacity)))
+ (bytevector-copy! bv 0 new 0 size)
+ (loop new capacity size))
+ (loop bv capacity size)))))))
^ permalink raw reply related [flat|nested] 6+ messages in thread
* RE: [PATCH] Make get-bytevector-all suspendable.
2024-06-01 9:11 ` Ludovic Courtès
@ 2024-06-01 9:40 ` Maxime Devos
2024-06-16 13:07 ` Ludovic Courtès
2024-06-01 9:44 ` Maxime Devos
1 sibling, 1 reply; 6+ messages in thread
From: Maxime Devos @ 2024-06-01 9:40 UTC (permalink / raw)
To: Ludovic Courtès, Christopher Baines; +Cc: guile-devel@gnu.org
[-- Attachment #1: Type: text/plain, Size: 1807 bytes --]
>Given that ‘get-bytevector-n!’ already has a variant in
suspendable-ports.scm, my preference would be to rewrite
‘get-bytevector-all’ in Scheme (patch attached). That way, it would
naturally be suspendable. (It’s also in line with the general strategy
of moving things to Scheme.)
It currently is difficult to write a correct implementation of get-bytevector-all in pure Scheme, because ‘get-bytevector-all’ needs to return a _fresh_ bytevector and could return twice (e.g. in case of system-async-mark + call-with-prompt shenanigans). I think the proposed implementation is incorrect in this way.
Could be resolved with call-with-blocked-asyncs, but given the title of the patch, that’s against the point.
However, given the sequential and stateful nature of ports, I don’t think it’s useful to implement support for these ‘return twice’ situations, so I think it would be best to simply adjust the documentation to disallow this situation.
Since it is not unique to get-bytevector-all, maybe it could be mentioned in https://www.gnu.org/software/guile/manual/html_node/Ports.html that it isn’t allowed to put any non-pure operation (where we include allocations as a side-effect, e.g. for open-input-bytevector etc, ) related to ports in such a situations?
+(define (get-bytevector-all port)
+ (define %initial-length 4096)
+
+ (let read-loop ((total 0)
+ (result-length %initial-length)
+ (result (make-bytevector %initial-length)))
I sense a lack of tests for the test suite ...
Also, to avoid avoidable allocations and copying, it would be nice if %initial-length could be overridden (optional #:initial-length keyword argument), in case the user has a better guess available.
Best regards,
Maxime Devos
[-- Attachment #2: Type: text/html, Size: 3664 bytes --]
^ permalink raw reply [flat|nested] 6+ messages in thread
* RE: [PATCH] Make get-bytevector-all suspendable.
2024-06-01 9:11 ` Ludovic Courtès
2024-06-01 9:40 ` Maxime Devos
@ 2024-06-01 9:44 ` Maxime Devos
1 sibling, 0 replies; 6+ messages in thread
From: Maxime Devos @ 2024-06-01 9:44 UTC (permalink / raw)
To: Ludovic Courtès, Christopher Baines; +Cc: guile-devel@gnu.org
[-- Attachment #1: Type: text/plain, Size: 187 bytes --]
Adding to my previous response:
Most (all?) tests for suspendable/non-suspendable are shared, and run in both cases, so probably the suspendable get-bytevector-all already has tests.
[-- Attachment #2: Type: text/html, Size: 1237 bytes --]
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] Make get-bytevector-all suspendable.
2024-06-01 9:40 ` Maxime Devos
@ 2024-06-16 13:07 ` Ludovic Courtès
2024-06-16 13:23 ` Maxime Devos
0 siblings, 1 reply; 6+ messages in thread
From: Ludovic Courtès @ 2024-06-16 13:07 UTC (permalink / raw)
To: Maxime Devos; +Cc: guile-devel
Hi Maxime,
Maxime Devos <maximedevos@telenet.be> skribis:
> It currently is difficult to write a correct implementation of get-bytevector-all in pure Scheme, because ‘get-bytevector-all’ needs to return a _fresh_ bytevector and could return twice (e.g. in case of system-async-mark + call-with-prompt shenanigans). I think the proposed implementation is incorrect in this way.
Hmm I don’t see how it could return twice. If an async runs while
‘get-bytevector-all’ is executed, it cannot cause ‘get-bytevector-all’
to abort to a prompt. I think we’re fine, no?
> +(define (get-bytevector-all port)
> + (define %initial-length 4096)
> +
> + (let read-loop ((total 0)
> + (result-length %initial-length)
> + (result (make-bytevector %initial-length)))
>
> I sense a lack of tests for the test suite ...
> Also, to avoid avoidable allocations and copying, it would be nice if %initial-length could be overridden (optional #:initial-length keyword argument), in case the user has a better guess available.
Yeah, but since this interface is specified in R6RS, and since we’re
just porting the existing C implementation to Scheme, I’d rather not add
such a keyword argument. We could provide an extension separately
though, as is done for (rnrs bytevectors gnu) for instance.
Regarding tests, you’re right: this is already covered.
I pushed this as 461ff313fa478d207a7668595e9d976a2ace9770 together with
a NEWS entry.
Thanks for your feedback!
Ludo’.
^ permalink raw reply [flat|nested] 6+ messages in thread
* RE: [PATCH] Make get-bytevector-all suspendable.
2024-06-16 13:07 ` Ludovic Courtès
@ 2024-06-16 13:23 ` Maxime Devos
0 siblings, 0 replies; 6+ messages in thread
From: Maxime Devos @ 2024-06-16 13:23 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: guile-devel@gnu.org
[-- Attachment #1: Type: text/plain, Size: 1513 bytes --]
>> It currently is difficult to write a correct implementation of get-bytevector-all in pure Scheme, because ‘get-bytevector-all’ needs to return a _fresh_ bytevector and could return twice (e.g. in case of system-async-mark + call-with-prompt shenanigans). I think the proposed implementation is incorrect in this way.
>Hmm I don’t see how it could return twice. If an async runs while
>‘get-bytevector-all’ is executed, it cannot cause ‘get-bytevector-all’
>to abort to a prompt. I think we’re fine, no?
I did not say that ‘get-bytevector-all’ aborts to a prompt. Rather, the async itself does the abort-to-prompt, and since the async is run inside the get-bytevector-all, as a result get-bytevector-all could return twice (depending on what the handler of call-with-prompt is doing with the delimited continuation).
I’m pretty sure it can do this, that’s how Fibers works (to be clear I’m referring to the abort-to-prompt from an async, not the return twice, Fibers doesn’t do return twice things) -- non-cooperative scheduling is implemented by aborting from an async.
Example (assuming bad/good timing):
Thread #1:
;; Consider the situation where the handler invokes the delimited continuation twice.
(call-with-prompt
[tag + handler things]
(lambda () (get-bytevector-all [...]))
[tag + handler things])
Thread #2:
(system-async-mark
(lambda ()
(abort-to-prompt [tag + more arguments]))
[Thread #1]))
Best regards,
Maxime Devos.
[-- Attachment #2: Type: text/html, Size: 3162 bytes --]
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2024-06-16 13:23 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-07-20 14:01 [PATCH] Make get-bytevector-all suspendable Christopher Baines
2024-06-01 9:11 ` Ludovic Courtès
2024-06-01 9:40 ` Maxime Devos
2024-06-16 13:07 ` Ludovic Courtès
2024-06-16 13:23 ` Maxime Devos
2024-06-01 9:44 ` Maxime Devos
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).