From: "Neil Jerram" <neiljerram@googlemail.com>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: guile-devel@gnu.org
Subject: Re: [PATCH] Add a `read' method for ports
Date: Fri, 13 Jun 2008 01:18:13 +0100 [thread overview]
Message-ID: <49dd78620806121718o3cb1ae2dta673ef86cee79382@mail.gmail.com> (raw)
In-Reply-To: <87y75bcceo.fsf@gnu.org>
[-- Attachment #1: Type: text/plain, Size: 3250 bytes --]
Hi Ludovic,
I hope you don't mind if I continue to pursue this a little...
2008/6/12 Ludovic Courtès <ludo@gnu.org>:
> Hi Neil,
>
> "Neil Jerram" <neiljerram@googlemail.com> writes:
>
>> So, one way of solving this would be for an `unbuffered' port
>> temporarily to get a buffer of the right size, and then to use the
>> fill_input logic as it currently stands. That would achieve your
>> objective of mapping onto a single N-byte read(2) call.
>
> Surely this would work, but I can think of several issues:
>
> 1. Thread safety. OK, ports are not thread-safe and it'd take more
> than this to make them thread-safe, but passing arguments
> explicitly (as the BUFFER and SIZE arguments of `read') is a step
> forward, while mutating the buffer pointer and size in the port is
> a step backward.
Agreed in theory, but I don't think it's a significant step backwards,
because ports are fundamentally not thread-safe (as you've already
said above), and already have lots of implicit arguments.
> 2. Performance. Another case where the `read' method improves on
> performance: when more data is requested than can fit in the port's
> buffer, the port buffer is completely bypassed. This allows the
> data to be read in one shot, avoiding memcpys from the port's
> buffer to the caller's buffer.
Agreed. Yesterday I was worried about possible confusion from
bypassing a port's read buffer at a time when the read buffer has some
data in it. But in fact this is not a problem, because
scm_fill_input() should be (and in practice is) only called when the
port's read buffer is empty.
> 3. Aesthetics. The `read' method looks more natural to me, since it's
> just like `read(2)'. Also, as mentioned earlier, `read' methods
> don't have to be aware of the port's internal buffer: they just
> need to fill in the caller-supplied buffer. When `scm_fill_input ()'
> is called, it just invokes `read', passing it the port's internal
> buffer and using its return value to update `pt->read_end'.
Aesthetics are always tricky. I accept your point here, but against
that we have to count
- the cost of adding a new method to libguile's port definition
- the benefit of (some) existing fill_input code being able to work
with caller-supplied buffers automatically. (I say "some", because
some fill_input implementations may make assumptions about the read
buffer size and so not automatically take advantage of a larger
caller-supplied buffer. But at least the important fport case will
work automatically.)
>> But that advantage is also (in some cases) the disadvantage of having
>> to do an additional memcpy of the data.
>
> There's no additional memcpy, as mentioned above.
Indeed, my mistake.
Based on all this, I'd like to update my suggestion so that
- scm_fill_input_buf always uses the caller's buffer
- the remaining scm_fill_input_buf logic is inlined into scm_c_read
(since that is the only place that uses scm_fill_input_buf, and
inlining it permits some further simplifications).
New patch (smaller and simpler) is attached; please let me know if
this sways your opinion at all!
Neil
[-- Attachment #2: scm_fill_input_buf.patch --]
[-- Type: application/octet-stream, Size: 3643 bytes --]
diff --git a/libguile/ports.c b/libguile/ports.c
index b97c826..f9f7a2d 100644
--- a/libguile/ports.c
+++ b/libguile/ports.c
@@ -1042,6 +1042,8 @@ scm_c_read (SCM port, void *buffer, size_t size)
{
scm_t_port *pt;
size_t n_read = 0, n_available;
+ unsigned char *saved_buf;
+ size_t saved_buf_size;
SCM_VALIDATE_OPINPORT (1, port);
@@ -1052,35 +1054,43 @@ scm_c_read (SCM port, void *buffer, size_t size)
if (pt->rw_random)
pt->rw_active = SCM_PORT_READ;
- if (SCM_READ_BUFFER_EMPTY_P (pt))
- {
- if (scm_fill_input (port) == EOF)
- return 0;
- }
-
- n_available = pt->read_end - pt->read_pos;
-
- while (n_available < size)
+ /* Take bytes first from the port's read buffer. */
+ if (pt->read_pos < pt->read_end)
{
+ n_available = min (size, pt->read_end - pt->read_pos);
memcpy (buffer, pt->read_pos, n_available);
buffer = (char *) buffer + n_available;
pt->read_pos += n_available;
n_read += n_available;
-
- if (SCM_READ_BUFFER_EMPTY_P (pt))
- {
- if (scm_fill_input (port) == EOF)
- return n_read;
- }
-
size -= n_available;
- n_available = pt->read_end - pt->read_pos;
}
- memcpy (buffer, pt->read_pos, size);
- pt->read_pos += size;
+ /* Now we will call scm_fill_input repeatedly until we have read the
+ requested number of bytes. (Note that a single scm_fill_input
+ call does not guarantee to fill the whole of the port's read
+ buffer.) For these calls, since we already have a buffer here to
+ read into, we bypass the port's own read buffer (if it has one),
+ by saving it off and modifying the port structure to point to our
+ own buffer. */
+ saved_buf = pt->read_buf;
+ saved_buf_size = pt->read_buf_size;
+ pt->read_pos = pt->read_buf = pt->read_end = buffer;
+ pt->read_buf_size = size;
+
+ /* Call scm_fill_input until we have all the bytes that we need, or
+ we hit EOF. */
+ while (pt->read_buf_size && (scm_fill_input (port) != EOF))
+ {
+ pt->read_buf_size -= (pt->read_end - pt->read_pos);
+ pt->read_pos = pt->read_buf = pt->read_end;
+ }
+ n_read += pt->read_buf - (unsigned char *) buffer;
+
+ /* Reinstate the port's normal buffer. */
+ pt->read_pos = pt->read_buf = pt->read_end = saved_buf;
+ pt->read_buf_size = saved_buf_size;
- return n_read + size;
+ return n_read;
}
#undef FUNC_NAME
diff --git a/libguile/srfi-4.c b/libguile/srfi-4.c
index 7d22f8b..a01f86e 100644
--- a/libguile/srfi-4.c
+++ b/libguile/srfi-4.c
@@ -886,38 +886,11 @@ SCM_DEFINE (scm_uniform_vector_read_x, "uniform-vector-read!", 1, 3, 0,
if (SCM_NIMP (port_or_fd))
{
- scm_t_port *pt = SCM_PTAB_ENTRY (port_or_fd);
-
- if (pt->rw_active == SCM_PORT_WRITE)
- scm_flush (port_or_fd);
-
ans = cend - cstart;
- while (remaining > 0)
- {
- if (pt->read_pos < pt->read_end)
- {
- size_t to_copy = min (pt->read_end - pt->read_pos,
- remaining);
-
- memcpy (base + off, pt->read_pos, to_copy);
- pt->read_pos += to_copy;
- remaining -= to_copy;
- off += to_copy;
- }
- else
- {
- if (scm_fill_input (port_or_fd) == EOF)
- {
- if (remaining % sz != 0)
- SCM_MISC_ERROR ("unexpected EOF", SCM_EOL);
- ans -= remaining / sz;
- break;
- }
- }
- }
-
- if (pt->rw_random)
- pt->rw_active = SCM_PORT_READ;
+ remaining -= scm_c_read (port_or_fd, base + off, remaining);
+ if (remaining % sz != 0)
+ SCM_MISC_ERROR ("unexpected EOF", SCM_EOL);
+ ans -= remaining / sz;
}
else /* file descriptor. */
{
next prev parent reply other threads:[~2008-06-13 0:18 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-06-01 18:58 [PATCH] Add a `read' method for ports Ludovic Courtès
2008-06-08 21:01 ` Ludovic Courtès
2008-06-09 17:01 ` Neil Jerram
2008-06-09 20:59 ` Ludovic Courtès
2008-06-11 21:38 ` Neil Jerram
2008-06-12 0:19 ` Neil Jerram
2008-06-12 7:58 ` Ludovic Courtès
2008-06-13 0:18 ` Neil Jerram [this message]
2008-06-24 20:50 ` Ludovic Courtès
2008-07-15 19:24 ` Ludovic Courtès
2008-07-15 20:05 ` Neil Jerram
2008-07-15 22:08 ` Neil Jerram
2008-07-16 21:41 ` Ludovic Courtès
2008-09-12 20:09 ` Ludovic Courtès
2008-09-14 23:08 ` Neil Jerram
2008-09-15 1:01 ` Neil Jerram
2008-09-15 7:47 ` Ludovic Courtès
2008-09-15 7:44 ` Ludovic Courtès
2008-09-15 19:18 ` Neil Jerram
2008-09-15 19:20 ` Ludovic Courtès
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://www.gnu.org/software/guile/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=49dd78620806121718o3cb1ae2dta673ef86cee79382@mail.gmail.com \
--to=neiljerram@googlemail.com \
--cc=guile-devel@gnu.org \
--cc=ludo@gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).