unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed
From: "Neil Jerram" <neiljerram@googlemail.com>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: guile-devel@gnu.org
Subject: Re: [PATCH] Add a `read' method for ports
Date: Fri, 13 Jun 2008 01:18:13 +0100	[thread overview]
Message-ID: <49dd78620806121718o3cb1ae2dta673ef86cee79382@mail.gmail.com> (raw)
In-Reply-To: <87y75bcceo.fsf@gnu.org>

[-- Attachment #1: Type: text/plain, Size: 3250 bytes --]

Hi Ludovic,

I hope you don't mind if I continue to pursue this a little...

2008/6/12 Ludovic Courtès <ludo@gnu.org>:
> Hi Neil,
>
> "Neil Jerram" <neiljerram@googlemail.com> writes:
>
>> So, one way of solving this would be for an `unbuffered' port
>> temporarily to get a buffer of the right size, and then to use the
>> fill_input logic as it currently stands.  That would achieve your
>> objective of mapping onto a single N-byte read(2) call.
>
> Surely this would work, but I can think of several issues:
>
>  1. Thread safety.  OK, ports are not thread-safe and it'd take more
>     than this to make them thread-safe, but passing arguments
>     explicitly (as the BUFFER and SIZE arguments of `read') is a step
>     forward, while mutating the buffer pointer and size in the port is
>     a step backward.

Agreed in theory, but I don't think it's a significant step backwards,
because ports are fundamentally not thread-safe (as you've already
said above), and already have lots of implicit arguments.

>  2. Performance.  Another case where the `read' method improves on
>     performance: when more data is requested than can fit in the port's
>     buffer, the port buffer is completely bypassed.  This allows the
>     data to be read in one shot, avoiding memcpys from the port's
>     buffer to the caller's buffer.

Agreed.  Yesterday I was worried about possible confusion from
bypassing a port's read buffer at a time when the read buffer has some
data in it.  But in fact this is not a problem, because
scm_fill_input() should be (and in practice is) only called when the
port's read buffer is empty.

>  3. Aesthetics.  The `read' method looks more natural to me, since it's
>     just like `read(2)'.  Also, as mentioned earlier, `read' methods
>     don't have to be aware of the port's internal buffer: they just
>     need to fill in the caller-supplied buffer.  When `scm_fill_input ()'
>     is called, it just invokes `read', passing it the port's internal
>     buffer and using its return value to update `pt->read_end'.

Aesthetics are always tricky.  I accept your point here, but against
that we have to count

- the cost of adding a new method to libguile's port definition

- the benefit of (some) existing fill_input code being able to work
with caller-supplied buffers automatically.  (I say "some", because
some fill_input implementations may make assumptions about the read
buffer size and so not automatically take advantage of a larger
caller-supplied buffer.  But at least the important fport case will
work automatically.)

>> But that advantage is also (in some cases) the disadvantage of having
>> to do an additional memcpy of the data.
>
> There's no additional memcpy, as mentioned above.

Indeed, my mistake.

Based on all this, I'd like to update my suggestion so that

- scm_fill_input_buf always uses the caller's buffer

- the remaining scm_fill_input_buf logic is inlined into scm_c_read
(since that is the only place that uses scm_fill_input_buf, and
inlining it permits some further simplifications).

New patch (smaller and simpler) is attached; please let me know if
this sways your opinion at all!

         Neil

[-- Attachment #2: scm_fill_input_buf.patch --]
[-- Type: application/octet-stream, Size: 3643 bytes --]

diff --git a/libguile/ports.c b/libguile/ports.c
index b97c826..f9f7a2d 100644
--- a/libguile/ports.c
+++ b/libguile/ports.c
@@ -1042,6 +1042,8 @@ scm_c_read (SCM port, void *buffer, size_t size)
 {
   scm_t_port *pt;
   size_t n_read = 0, n_available;
+  unsigned char *saved_buf;
+  size_t saved_buf_size;
 
   SCM_VALIDATE_OPINPORT (1, port);
 
@@ -1052,35 +1054,43 @@ scm_c_read (SCM port, void *buffer, size_t size)
   if (pt->rw_random)
     pt->rw_active = SCM_PORT_READ;
 
-  if (SCM_READ_BUFFER_EMPTY_P (pt))
-    {
-      if (scm_fill_input (port) == EOF)
-	return 0;
-    }
-  
-  n_available = pt->read_end - pt->read_pos;
-  
-  while (n_available < size)
+  /* Take bytes first from the port's read buffer. */
+  if (pt->read_pos < pt->read_end)
     {
+      n_available = min (size, pt->read_end - pt->read_pos);
       memcpy (buffer, pt->read_pos, n_available);
       buffer = (char *) buffer + n_available;
       pt->read_pos += n_available;
       n_read += n_available;
-      
-      if (SCM_READ_BUFFER_EMPTY_P (pt))
-	{
-	  if (scm_fill_input (port) == EOF)
-	    return n_read;
-	}
-
       size -= n_available;
-      n_available = pt->read_end - pt->read_pos;
     }
 
-  memcpy (buffer, pt->read_pos, size);
-  pt->read_pos += size;
+  /* Now we will call scm_fill_input repeatedly until we have read the
+     requested number of bytes.  (Note that a single scm_fill_input
+     call does not guarantee to fill the whole of the port's read
+     buffer.)  For these calls, since we already have a buffer here to
+     read into, we bypass the port's own read buffer (if it has one),
+     by saving it off and modifying the port structure to point to our
+     own buffer. */
+  saved_buf = pt->read_buf;
+  saved_buf_size = pt->read_buf_size;
+  pt->read_pos = pt->read_buf = pt->read_end = buffer;
+  pt->read_buf_size = size;
+
+  /* Call scm_fill_input until we have all the bytes that we need, or
+     we hit EOF. */
+  while (pt->read_buf_size && (scm_fill_input (port) != EOF))
+    {
+      pt->read_buf_size -= (pt->read_end - pt->read_pos);
+      pt->read_pos = pt->read_buf = pt->read_end;
+    }
+  n_read += pt->read_buf - (unsigned char *) buffer;
+
+  /* Reinstate the port's normal buffer. */
+  pt->read_pos = pt->read_buf = pt->read_end = saved_buf;
+  pt->read_buf_size = saved_buf_size;
 
-  return n_read + size;
+  return n_read;
 }
 #undef FUNC_NAME
 
diff --git a/libguile/srfi-4.c b/libguile/srfi-4.c
index 7d22f8b..a01f86e 100644
--- a/libguile/srfi-4.c
+++ b/libguile/srfi-4.c
@@ -886,38 +886,11 @@ SCM_DEFINE (scm_uniform_vector_read_x, "uniform-vector-read!", 1, 3, 0,
 
   if (SCM_NIMP (port_or_fd))
     {
-      scm_t_port *pt = SCM_PTAB_ENTRY (port_or_fd);
-
-      if (pt->rw_active == SCM_PORT_WRITE)
-	scm_flush (port_or_fd);
-
       ans = cend - cstart;
-      while (remaining > 0)
-	{
-	  if (pt->read_pos < pt->read_end)
-	    {
-	      size_t to_copy = min (pt->read_end - pt->read_pos,
-				    remaining);
-	      
-	      memcpy (base + off, pt->read_pos, to_copy);
-	      pt->read_pos += to_copy;
-	      remaining -= to_copy;
-	      off += to_copy;
-	    }
-	  else
-	    {
-	      if (scm_fill_input (port_or_fd) == EOF)
-		{
-		  if (remaining % sz != 0)
-		    SCM_MISC_ERROR ("unexpected EOF", SCM_EOL);
-		  ans -= remaining / sz;
-		  break;
-		}
-	    }
-	}
-      
-      if (pt->rw_random)
-	pt->rw_active = SCM_PORT_READ;
+      remaining -= scm_c_read (port_or_fd, base + off, remaining);
+      if (remaining % sz != 0)
+        SCM_MISC_ERROR ("unexpected EOF", SCM_EOL);
+      ans -= remaining / sz;
     }
   else /* file descriptor.  */
     {

  reply	other threads:[~2008-06-13  0:18 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-06-01 18:58 [PATCH] Add a `read' method for ports Ludovic Courtès
2008-06-08 21:01 ` Ludovic Courtès
2008-06-09 17:01   ` Neil Jerram
2008-06-09 20:59     ` Ludovic Courtès
2008-06-11 21:38       ` Neil Jerram
2008-06-12  0:19         ` Neil Jerram
2008-06-12  7:58         ` Ludovic Courtès
2008-06-13  0:18           ` Neil Jerram [this message]
2008-06-24 20:50             ` Ludovic Courtès
2008-07-15 19:24               ` Ludovic Courtès
2008-07-15 20:05                 ` Neil Jerram
2008-07-15 22:08               ` Neil Jerram
2008-07-16 21:41                 ` Ludovic Courtès
2008-09-12 20:09                   ` Ludovic Courtès
2008-09-14 23:08                   ` Neil Jerram
2008-09-15  1:01                     ` Neil Jerram
2008-09-15  7:47                       ` Ludovic Courtès
2008-09-15  7:44                     ` Ludovic Courtès
2008-09-15 19:18                       ` Neil Jerram
2008-09-15 19:20                         ` Ludovic Courtès

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/guile/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=49dd78620806121718o3cb1ae2dta673ef86cee79382@mail.gmail.com \
    --to=neiljerram@googlemail.com \
    --cc=guile-devel@gnu.org \
    --cc=ludo@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).