2008/6/24 Ludovic Courtès <ludo@gnu.org>:
> Hi Neil,
>
> Sorry for the delay, I was off-line last week.

No problem!

> "Neil Jerram" <neiljerram@googlemail.com> writes:
>
>> Agreed.  Yesterday I was worried about possible confusion from
>> bypassing a port's read buffer at a time when the read buffer has some
>> data in it.  But in fact this is not a problem, because
>> scm_fill_input() should be (and in practice is) only called when the
>> port's read buffer is empty.
>
> Right.  Putting an assertion in there to make sure can't hurt.

OK, done.

> These arguments make a lot of sense in 1.8.  In the longer run, though,
> I'd prefer to see `fill_input' superseded by a `read' method akin to
> what I proposed.  What do you think?

To be honest, I don't yet see a clear need for it.  If there is no
drawback to what we have now, and there would be a compatibility cost
to introducing a new read method, then why do that?

>> Based on all this, I'd like to update my suggestion so that
>>
>> - scm_fill_input_buf always uses the caller's buffer
>>
>> - the remaining scm_fill_input_buf logic is inlined into scm_c_read
>> (since that is the only place that uses scm_fill_input_buf, and
>> inlining it permits some further simplifications).
>
> OK.  The issue is that if the caller-supplied buffer is smaller than the
> port's buffer, we end up reading less data than we'd otherwise do.  I
> think this scenario should be addressed (by comparing `read_buf_size'
> and `size'), as was the case in my proposal.

True in theory, but I feel very confident that this will never matter
in practice, and hence that we don't need to have code for this.

Confidence is based on experience from my day job, which is in network
protocols - where buffered reads are a very common operation.  On the
other hand, it is possible I've missed some other application where
different patterns/rules apply, so please let me know if you have
specific counterexamples in mind.

>> +  /* Now we will call scm_fill_input repeatedly until we have read the
>> +     requested number of bytes.  (Note that a single scm_fill_input
>> +     call does not guarantee to fill the whole of the port's read
>> +     buffer.)  For these calls, since we already have a buffer here to
>> +     read into, we bypass the port's own read buffer (if it has one),
>> +     by saving it off and modifying the port structure to point to our
>> +     own buffer. */
>> +  saved_buf = pt->read_buf;
>> +  saved_buf_size = pt->read_buf_size;
>> +  pt->read_pos = pt->read_buf = pt->read_end = buffer;
>> +  pt->read_buf_size = size;
>
> `fill_input' methods can raise an exception, as is the case with soft
> parts.  Thus, this code should be protected by a `dynwind' that
> restores the port's buffer and size.

Good point.  I've added this.

>> +      remaining -= scm_c_read (port_or_fd, base + off, remaining);
>> +      if (remaining % sz != 0)
>> +        SCM_MISC_ERROR ("unexpected EOF", SCM_EOL);
>> +      ans -= remaining / sz;
>
> Likewise, `scm_c_read ()' might raise an exception, so we may need a
> `dynwind' here (my original patch for `uniform-vector-read!' did that).

This is covered by the dynwind inside scm_c_read, isn't it?

Updated patch is attached.  As well as taking a look, could you check
that it really improves performance in the cases that motivated this?
I tried running the read.bm benchmarks, but in those cases it didn't
seem to make much difference; is that expected?

Regards,
        Neil