* [PATCH] Make get-bytevector-all suspendable. @ 2023-07-20 14:01 Christopher Baines 2024-06-01 9:11 ` Ludovic Courtès 0 siblings, 1 reply; 6+ messages in thread From: Christopher Baines @ 2023-07-20 14:01 UTC (permalink / raw) To: guile-devel I'm looking at this since it's used in (web response) read-response-body. * module/ice-9/suspendable-ports.scm (get-bytevector-all): New procedure. (port-bindings): Add it. --- module/ice-9/suspendable-ports.scm | 36 ++++++++++++++++++++++++++++++ 1 file changed, 36 insertions(+) diff --git a/module/ice-9/suspendable-ports.scm b/module/ice-9/suspendable-ports.scm index 9fac1df62..e5b3de982 100644 --- a/module/ice-9/suspendable-ports.scm +++ b/module/ice-9/suspendable-ports.scm @@ -50,6 +50,7 @@ (define-module (ice-9 suspendable-ports) #:use-module (rnrs bytevectors) + #:use-module (rnrs bytevectors gnu) #:use-module (ice-9 ports internal) #:use-module (ice-9 match) #:export (current-read-waiter @@ -342,6 +343,40 @@ (set-port-buffer-cur! buf (+ cur transfer-size)) transfer-size)))))) +(define (get-bytevector-all port) + (define %initial-length 4096) + + (let read-loop ((total 0) + (result-length %initial-length) + (result (make-bytevector %initial-length))) + (call-with-values (lambda () (fill-input port 1 'binary)) + (lambda (buf cur buffered) + (if (zero? buffered) + (begin + (set-port-buffer-has-eof?! buf #f) + (if (= total 0) + the-eof-object + (bytevector-slice result 0 total))) + (let* ((new-total (+ total buffered)) + (new-result-length + (let loop ((new-result-length result-length)) + (if (< new-total new-result-length) + new-result-length + (loop (* 2 new-result-length))))) + (new-result + (if (= new-result-length result-length) + result + (let ((new-result (make-bytevector new-result-length))) + (bytevector-copy! result 0 new-result 0 total) + new-result)))) + (bytevector-copy! (port-buffer-bytevector buf) cur + new-result total + buffered) + (set-port-buffer-cur! buf (+ cur buffered)) + (read-loop new-total + new-result-length + new-result))))))) + (define (put-u8 port byte) (let* ((buf (port-write-buffer port)) (bv (port-buffer-bytevector buf)) @@ -754,6 +789,7 @@ ((ice-9 binary-ports) get-u8 lookahead-u8 get-bytevector-n get-bytevector-n! get-bytevector-some get-bytevector-some! + get-bytevector-all put-u8 put-bytevector) ((ice-9 textual-ports) put-char put-string) -- 2.41.0 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] Make get-bytevector-all suspendable. 2023-07-20 14:01 [PATCH] Make get-bytevector-all suspendable Christopher Baines @ 2024-06-01 9:11 ` Ludovic Courtès 2024-06-01 9:40 ` Maxime Devos 2024-06-01 9:44 ` Maxime Devos 0 siblings, 2 replies; 6+ messages in thread From: Ludovic Courtès @ 2024-06-01 9:11 UTC (permalink / raw) To: Christopher Baines; +Cc: guile-devel [-- Attachment #1: Type: text/plain, Size: 718 bytes --] Hi Chris, Christopher Baines <mail@cbaines.net> skribis: > I'm looking at this since it's used in (web response) > read-response-body. > > * module/ice-9/suspendable-ports.scm (get-bytevector-all): New > procedure. > (port-bindings): Add it. Given that ‘get-bytevector-n!’ already has a variant in suspendable-ports.scm, my preference would be to rewrite ‘get-bytevector-all’ in Scheme (patch attached). That way, it would naturally be suspendable. (It’s also in line with the general strategy of moving things to Scheme.) I don’t expect significant performance difference compared to the C implementation since that is dominated by allocations and I/O. Thoughts? Ludo’. [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: Type: text/x-patch, Size: 4277 bytes --] diff --git a/libguile/r6rs-ports.c b/libguile/r6rs-ports.c index 7c51bf617..ffa1e1b2b 100644 --- a/libguile/r6rs-ports.c +++ b/libguile/r6rs-ports.c @@ -1,4 +1,4 @@ -/* Copyright 2009-2011,2013-2015,2018-2019,2023 +/* Copyright 2009-2011,2013-2015,2018-2019,2023,2024 Free Software Foundation, Inc. This file is part of Guile. @@ -393,58 +393,23 @@ SCM_DEFINE (scm_get_bytevector_some_x, "get-bytevector-some!", 4, 0, 0, } #undef FUNC_NAME -SCM_DEFINE (scm_get_bytevector_all, "get-bytevector-all", 1, 0, 0, - (SCM port), - "Read from @var{port}, blocking as necessary, until " - "the end-of-file is reached. Return either " - "a new bytevector containing the data read or the " - "end-of-file object (if no data were available).") -#define FUNC_NAME s_scm_get_bytevector_all -{ - SCM result; - size_t c_len, c_count; - size_t c_read, c_total; - - SCM_VALIDATE_BINARY_INPUT_PORT (1, port); +static SCM get_bytevector_all_var; - c_len = c_count = 4096; - result = scm_c_make_bytevector (c_count); - c_total = c_read = 0; - - do - { - if (c_read > c_len - c_total) - { - /* Grow the bytevector. */ - SCM prev = result; - - if (INT_ADD_OVERFLOW (c_len, c_len)) - scm_num_overflow (FUNC_NAME); - - result = scm_c_make_bytevector (c_len * 2); - memcpy (SCM_BYTEVECTOR_CONTENTS (result), - SCM_BYTEVECTOR_CONTENTS (prev), - c_total); - c_count = c_len; - c_len *= 2; - } - - /* `scm_c_read ()' blocks until C_COUNT bytes are available or EOF is - reached. */ - c_read = scm_c_read_bytes (port, result, c_total, c_count); - c_total += c_read, c_count -= c_read; - } - while (c_count == 0); - - if (c_total == 0) - return SCM_EOF_VAL; +static void +init_bytevector_io_vars (void) +{ + get_bytevector_all_var = + scm_c_public_lookup ("ice-9 binary-port", "get-bytevector-all"); +} - if (c_len > c_total) - return scm_c_shrink_bytevector (result, c_total); +SCM +scm_get_bytevector_all (SCM port) +{ + static scm_i_pthread_once_t once = SCM_I_PTHREAD_ONCE_INIT; + scm_i_pthread_once (&once, init_bytevector_io_vars); - return result; + return scm_call_1 (scm_variable_ref (get_bytevector_all_var), port); } -#undef FUNC_NAME \f diff --git a/module/ice-9/binary-ports.scm b/module/ice-9/binary-ports.scm index b7eddc93d..864d9ef9a 100644 --- a/module/ice-9/binary-ports.scm +++ b/module/ice-9/binary-ports.scm @@ -1,5 +1,5 @@ ;;; binary-ports.scm --- Binary IO on ports -;;; Copyright (C) 2009-2011,2013,2016,2019,2021,2023 Free Software Foundation, Inc. +;;; Copyright (C) 2009-2011,2013,2016,2019,2021,2023,2024 Free Software Foundation, Inc. ;;; ;;; This library is free software: you can redistribute it and/or modify ;;; it under the terms of the GNU Lesser General Public License as @@ -27,6 +27,7 @@ (define-module (ice-9 binary-ports) #:use-module (rnrs bytevectors) + #:autoload (rnrs bytevectors gnu) (bytevector-slice) #:use-module (ice-9 match) #:use-module (ice-9 custom-ports) #:export (eof-object @@ -180,3 +181,29 @@ bytevector composed of the bytes written into the port is returned." ;; FIXME: Instead default to current encoding, if ;; someone reads text from this port. #:encoding 'ISO-8859-1 #:conversion-strategy 'error)) + +\f +;;; +;;; Binary input. +;;; + +(define (get-bytevector-all port) + "Read from @var{port}, blocking as necessary, until +the end-of-file is reached. Return either a new bytevector containing +the data read or the end-of-file object (if no data were available)." + (define initial-capacity 4096) + + (let loop ((bv (make-bytevector initial-capacity)) + (capacity initial-capacity) + (size 0)) + (match (get-bytevector-n! port bv size (- capacity size)) + ((? eof-object?) + (bytevector-slice bv 0 size)) + (read + (let ((size (+ read size))) + (if (= capacity size) + (let* ((capacity (* capacity 2)) + (new (make-bytevector capacity))) + (bytevector-copy! bv 0 new 0 size) + (loop new capacity size)) + (loop bv capacity size))))))) ^ permalink raw reply related [flat|nested] 6+ messages in thread
* RE: [PATCH] Make get-bytevector-all suspendable. 2024-06-01 9:11 ` Ludovic Courtès @ 2024-06-01 9:40 ` Maxime Devos 2024-06-16 13:07 ` Ludovic Courtès 2024-06-01 9:44 ` Maxime Devos 1 sibling, 1 reply; 6+ messages in thread From: Maxime Devos @ 2024-06-01 9:40 UTC (permalink / raw) To: Ludovic Courtès, Christopher Baines; +Cc: guile-devel@gnu.org [-- Attachment #1: Type: text/plain, Size: 1807 bytes --] >Given that ‘get-bytevector-n!’ already has a variant in suspendable-ports.scm, my preference would be to rewrite ‘get-bytevector-all’ in Scheme (patch attached). That way, it would naturally be suspendable. (It’s also in line with the general strategy of moving things to Scheme.) It currently is difficult to write a correct implementation of get-bytevector-all in pure Scheme, because ‘get-bytevector-all’ needs to return a _fresh_ bytevector and could return twice (e.g. in case of system-async-mark + call-with-prompt shenanigans). I think the proposed implementation is incorrect in this way. Could be resolved with call-with-blocked-asyncs, but given the title of the patch, that’s against the point. However, given the sequential and stateful nature of ports, I don’t think it’s useful to implement support for these ‘return twice’ situations, so I think it would be best to simply adjust the documentation to disallow this situation. Since it is not unique to get-bytevector-all, maybe it could be mentioned in https://www.gnu.org/software/guile/manual/html_node/Ports.html that it isn’t allowed to put any non-pure operation (where we include allocations as a side-effect, e.g. for open-input-bytevector etc, ) related to ports in such a situations? +(define (get-bytevector-all port) + (define %initial-length 4096) + + (let read-loop ((total 0) + (result-length %initial-length) + (result (make-bytevector %initial-length))) I sense a lack of tests for the test suite ... Also, to avoid avoidable allocations and copying, it would be nice if %initial-length could be overridden (optional #:initial-length keyword argument), in case the user has a better guess available. Best regards, Maxime Devos [-- Attachment #2: Type: text/html, Size: 3664 bytes --] ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] Make get-bytevector-all suspendable. 2024-06-01 9:40 ` Maxime Devos @ 2024-06-16 13:07 ` Ludovic Courtès 2024-06-16 13:23 ` Maxime Devos 0 siblings, 1 reply; 6+ messages in thread From: Ludovic Courtès @ 2024-06-16 13:07 UTC (permalink / raw) To: Maxime Devos; +Cc: guile-devel Hi Maxime, Maxime Devos <maximedevos@telenet.be> skribis: > It currently is difficult to write a correct implementation of get-bytevector-all in pure Scheme, because ‘get-bytevector-all’ needs to return a _fresh_ bytevector and could return twice (e.g. in case of system-async-mark + call-with-prompt shenanigans). I think the proposed implementation is incorrect in this way. Hmm I don’t see how it could return twice. If an async runs while ‘get-bytevector-all’ is executed, it cannot cause ‘get-bytevector-all’ to abort to a prompt. I think we’re fine, no? > +(define (get-bytevector-all port) > + (define %initial-length 4096) > + > + (let read-loop ((total 0) > + (result-length %initial-length) > + (result (make-bytevector %initial-length))) > > I sense a lack of tests for the test suite ... > Also, to avoid avoidable allocations and copying, it would be nice if %initial-length could be overridden (optional #:initial-length keyword argument), in case the user has a better guess available. Yeah, but since this interface is specified in R6RS, and since we’re just porting the existing C implementation to Scheme, I’d rather not add such a keyword argument. We could provide an extension separately though, as is done for (rnrs bytevectors gnu) for instance. Regarding tests, you’re right: this is already covered. I pushed this as 461ff313fa478d207a7668595e9d976a2ace9770 together with a NEWS entry. Thanks for your feedback! Ludo’. ^ permalink raw reply [flat|nested] 6+ messages in thread
* RE: [PATCH] Make get-bytevector-all suspendable. 2024-06-16 13:07 ` Ludovic Courtès @ 2024-06-16 13:23 ` Maxime Devos 0 siblings, 0 replies; 6+ messages in thread From: Maxime Devos @ 2024-06-16 13:23 UTC (permalink / raw) To: Ludovic Courtès; +Cc: guile-devel@gnu.org [-- Attachment #1: Type: text/plain, Size: 1513 bytes --] >> It currently is difficult to write a correct implementation of get-bytevector-all in pure Scheme, because ‘get-bytevector-all’ needs to return a _fresh_ bytevector and could return twice (e.g. in case of system-async-mark + call-with-prompt shenanigans). I think the proposed implementation is incorrect in this way. >Hmm I don’t see how it could return twice. If an async runs while >‘get-bytevector-all’ is executed, it cannot cause ‘get-bytevector-all’ >to abort to a prompt. I think we’re fine, no? I did not say that ‘get-bytevector-all’ aborts to a prompt. Rather, the async itself does the abort-to-prompt, and since the async is run inside the get-bytevector-all, as a result get-bytevector-all could return twice (depending on what the handler of call-with-prompt is doing with the delimited continuation). I’m pretty sure it can do this, that’s how Fibers works (to be clear I’m referring to the abort-to-prompt from an async, not the return twice, Fibers doesn’t do return twice things) -- non-cooperative scheduling is implemented by aborting from an async. Example (assuming bad/good timing): Thread #1: ;; Consider the situation where the handler invokes the delimited continuation twice. (call-with-prompt [tag + handler things] (lambda () (get-bytevector-all [...])) [tag + handler things]) Thread #2: (system-async-mark (lambda () (abort-to-prompt [tag + more arguments])) [Thread #1])) Best regards, Maxime Devos. [-- Attachment #2: Type: text/html, Size: 3162 bytes --] ^ permalink raw reply [flat|nested] 6+ messages in thread
* RE: [PATCH] Make get-bytevector-all suspendable. 2024-06-01 9:11 ` Ludovic Courtès 2024-06-01 9:40 ` Maxime Devos @ 2024-06-01 9:44 ` Maxime Devos 1 sibling, 0 replies; 6+ messages in thread From: Maxime Devos @ 2024-06-01 9:44 UTC (permalink / raw) To: Ludovic Courtès, Christopher Baines; +Cc: guile-devel@gnu.org [-- Attachment #1: Type: text/plain, Size: 187 bytes --] Adding to my previous response: Most (all?) tests for suspendable/non-suspendable are shared, and run in both cases, so probably the suspendable get-bytevector-all already has tests. [-- Attachment #2: Type: text/html, Size: 1237 bytes --] ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2024-06-16 13:23 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-07-20 14:01 [PATCH] Make get-bytevector-all suspendable Christopher Baines 2024-06-01 9:11 ` Ludovic Courtès 2024-06-01 9:40 ` Maxime Devos 2024-06-16 13:07 ` Ludovic Courtès 2024-06-16 13:23 ` Maxime Devos 2024-06-01 9:44 ` Maxime Devos
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).