unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
* oops? read/write vs type of length parameter
@ 2011-04-11  8:55 Jim Meyering
  2011-04-11  9:44 ` Eli Zaretskii
  0 siblings, 1 reply; 49+ messages in thread
From: Jim Meyering @ 2011-04-11  8:55 UTC (permalink / raw)
  To: Emacs development discussions; +Cc: Eli Zaretskii

Hi Eli,

In http://bzr.savannah.gnu.org/lh/emacs/trunk/revision/103883
you adjusted the signatures of emacs_read and emacs_write.

  - extern int emacs_read (int, char *, unsigned int);
  - extern int emacs_write (int, const char *, unsigned int);
  + extern ssize_t emacs_read (int, char *, ssize_t);
  + extern ssize_t emacs_write (int, const char *, ssize_t);

It's good to see that you corrected the return type to be wider
(ssize_t) and still signed, just like "int".

However, did you really intend to make the buffer length parameters signed?
I would have expected those to be of type size_t, not ssize_t.



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-11  8:55 oops? read/write vs type of length parameter Jim Meyering
@ 2011-04-11  9:44 ` Eli Zaretskii
  2011-04-11 11:08   ` Jim Meyering
  2011-04-11 11:40   ` Stephen J. Turnbull
  0 siblings, 2 replies; 49+ messages in thread
From: Eli Zaretskii @ 2011-04-11  9:44 UTC (permalink / raw)
  To: Jim Meyering; +Cc: emacs-devel

> From: Jim Meyering <jim@meyering.net>
> Date: Mon, 11 Apr 2011 10:55:35 +0200
> Cc: Eli Zaretskii <eliz@elta.co.il>
> 
> In http://bzr.savannah.gnu.org/lh/emacs/trunk/revision/103883
> you adjusted the signatures of emacs_read and emacs_write.
> 
>   - extern int emacs_read (int, char *, unsigned int);
>   - extern int emacs_write (int, const char *, unsigned int);
>   + extern ssize_t emacs_read (int, char *, ssize_t);
>   + extern ssize_t emacs_write (int, const char *, ssize_t);

Yes, that's part of my on-going effort to allow editing of files
larger than 2GB.  With that revision, I can finally visit such a large
file, modify it, and save it to disk :-)

> It's good to see that you corrected the return type to be wider
> (ssize_t) and still signed, just like "int".

That's what its callers expect: the return value can be positive or
negative.

> However, did you really intend to make the buffer length parameters signed?
> I would have expected those to be of type size_t, not ssize_t.

We call these functions with an argument of type EMACS_INT, which can
be negative.  I don't want it to wind up as a large positive value
inside these functions, I'd rather the functions fail instead.  Note
that emacs_write is careful enough to check the sign of that argument,
and if we want a similar guard in emacs_read, we can easily add that.

Is there a problem with that?



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-11  9:44 ` Eli Zaretskii
@ 2011-04-11 11:08   ` Jim Meyering
  2011-04-11 11:28     ` David Kastrup
  2011-04-11 11:52     ` Eli Zaretskii
  2011-04-11 11:40   ` Stephen J. Turnbull
  1 sibling, 2 replies; 49+ messages in thread
From: Jim Meyering @ 2011-04-11 11:08 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

Eli Zaretskii wrote:

>> From: Jim Meyering <jim@meyering.net>
>> Date: Mon, 11 Apr 2011 10:55:35 +0200
>> Cc: Eli Zaretskii <eliz@elta.co.il>
>>
>> In http://bzr.savannah.gnu.org/lh/emacs/trunk/revision/103883
>> you adjusted the signatures of emacs_read and emacs_write.
>>
>>   - extern int emacs_read (int, char *, unsigned int);
>>   - extern int emacs_write (int, const char *, unsigned int);
>>   + extern ssize_t emacs_read (int, char *, ssize_t);
>>   + extern ssize_t emacs_write (int, const char *, ssize_t);
>
> Yes, that's part of my on-going effort to allow editing of files
> larger than 2GB.  With that revision, I can finally visit such a large
> file, modify it, and save it to disk :-)
>
>> It's good to see that you corrected the return type to be wider
>> (ssize_t) and still signed, just like "int".
>
> That's what its callers expect: the return value can be positive or
> negative.

As I said, that part is welcome.
It also makes these functions more like read and write.

>> However, did you really intend to make the buffer length parameters signed?
>> I would have expected those to be of type size_t, not ssize_t.
>
> We call these functions with an argument of type EMACS_INT, which can
> be negative.

With read and write, there is already a tension there,
since the return value (length or error indicator) is signed,
while the length parameter is unsigned, and hence slightly wider.

> I don't want it to wind up as a large positive value
> inside these functions, I'd rather the functions fail instead.  Note
> that emacs_write is careful enough to check the sign of that argument,
> and if we want a similar guard in emacs_read, we can easily add that.

Currently, emacs_write silently ignores an invalid buffer length,
treating it just like a length of 0.  It'd be better not to ignore
such an error.

IMHO, an interface that takes a logically unsigned parameter
should have an unsigned type.

I guess I'm biased towards least-surprise for developers, so I
think read- and write-like functions should accept a buffer length
argument of type size_t, to be consistent with read and write.
To try to protect against bugs by changing API to a signed type may
actually cause trouble when callers end up mixing/comparing their
newly signed (to accommodate the invented API) and unsigned lengths
from standard functions.

Another thing to keep in mind: on some older systems, trying to read
more than INT_MAX bytes in a single syscall will fail.



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-11 11:08   ` Jim Meyering
@ 2011-04-11 11:28     ` David Kastrup
  2011-04-11 11:52     ` Eli Zaretskii
  1 sibling, 0 replies; 49+ messages in thread
From: David Kastrup @ 2011-04-11 11:28 UTC (permalink / raw)
  To: emacs-devel

Jim Meyering <jim@meyering.net> writes:

> Currently, emacs_write silently ignores an invalid buffer length,
> treating it just like a length of 0.  It'd be better not to ignore
> such an error.
>
> IMHO, an interface that takes a logically unsigned parameter
> should have an unsigned type.

In that case, there _are_ no invalid buffer lengths since any invalid
length is turned into a humongous valid one.

Since the starting value is from EMACS_INT...

> I guess I'm biased towards least-surprise for developers, so I think
> read- and write-like functions should accept a buffer length argument
> of type size_t, to be consistent with read and write.

I prefer surprise while reading the prototypes to surprise while
actually calling the function.  A reinterpretation of signed numbers
into the unsigned realm is not going to cause much joy here.

> To try to protect against bugs by changing API to a signed type may
> actually cause trouble when callers end up mixing/comparing their
> newly signed (to accommodate the invented API) and unsigned lengths
> from standard functions.

In C, (-1 > 10U) is true.  In practice, unsigned types cause much more
trouble and confusion than they avoid.

> Another thing to keep in mind: on some older systems, trying to read
> more than INT_MAX bytes in a single syscall will fail.

Are you trying to argue for or against your proposal?

-- 
David Kastrup




^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-11  9:44 ` Eli Zaretskii
  2011-04-11 11:08   ` Jim Meyering
@ 2011-04-11 11:40   ` Stephen J. Turnbull
  2011-04-11 13:58     ` Eli Zaretskii
  1 sibling, 1 reply; 49+ messages in thread
From: Stephen J. Turnbull @ 2011-04-11 11:40 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Jim Meyering, emacs-devel

Eli Zaretskii writes:

 > > However, did you really intend to make the buffer length parameters signed?
 > > I would have expected those to be of type size_t, not ssize_t.
 > 
 > We call these functions with an argument of type EMACS_INT,

If you're going to do that, why not declare it as an EMACS_INT?

The problem with using external standard types is that some developers
will proceed to "correct" them, and the compiler won't tell you about it.
Of course if you happen to be paying attention (which is most of the
time to be sure), the patch will get rejected, but it seems to me
using EMACS_INT will forestall that.



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-11 11:08   ` Jim Meyering
  2011-04-11 11:28     ` David Kastrup
@ 2011-04-11 11:52     ` Eli Zaretskii
  2011-04-11 12:27       ` Jim Meyering
  1 sibling, 1 reply; 49+ messages in thread
From: Eli Zaretskii @ 2011-04-11 11:52 UTC (permalink / raw)
  To: Jim Meyering; +Cc: emacs-devel

> From: Jim Meyering <jim@meyering.net>
> Cc: emacs-devel@gnu.org
> Date: Mon, 11 Apr 2011 13:08:38 +0200
> 
> Currently, emacs_write silently ignores an invalid buffer length,
> treating it just like a length of 0.  It'd be better not to ignore
> such an error.

emacs_write simply does nothing for negative sizes.  However, its
callers will not silently ignore that: emacs_write returns that same
value to the caller, and callers should (and some do) check the return
value for being non-negative.  See, for example, write-region (whose
debugging led to this change in the interface).

> IMHO, an interface that takes a logically unsigned parameter
> should have an unsigned type.

That would be a major inconvenience, and even annoyance: in Emacs, it
is a very frequent idiom to pass the result of subtracting two
EMACS_INT values, because we reference buffers and strings with such
values.  Having the argument as unsigned type would trigger warnings
and will need explicit type casts.  And with type casts, there's the
danger of interpreting a negative value as a large positive one.

So I think on balance, having a signed type there is better.  The fact
that it is slightly narrower is not a problem in this case: EMACS_INT
is already a couple of bits narrower than the size_t type, so we don't
lose anything.

> I guess I'm biased towards least-surprise for developers, so I
> think read- and write-like functions should accept a buffer length
> argument of type size_t, to be consistent with read and write.

I'm sure you agree that the situation with read and write is less than
ideal.  I don't see why we should follow that in Emacs.  A developer
who sees that a function named emacs_write is called instead of write
should assume that emacs_write is not a trivial wrapper, and should
look there to see the details.

> To try to protect against bugs by changing API to a signed type may
> actually cause trouble when callers end up mixing/comparing their
> newly signed (to accommodate the invented API) and unsigned lengths
> from standard functions.

As I said, the normal use cases in Emacs is that the data types used
to compute that argument are signed to begin with.  And the previous
API used `int', which is also a signed type.

> Another thing to keep in mind: on some older systems, trying to read
> more than INT_MAX bytes in a single syscall will fail.

On such system, emacs_write will return either -1 or a value less than
the last arg, and the caller will notice that and produce a suitable
error message.



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-11 11:52     ` Eli Zaretskii
@ 2011-04-11 12:27       ` Jim Meyering
  2011-04-11 12:31         ` David Kastrup
  2011-04-11 14:02         ` Eli Zaretskii
  0 siblings, 2 replies; 49+ messages in thread
From: Jim Meyering @ 2011-04-11 12:27 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

Eli Zaretskii wrote:

>> From: Jim Meyering <jim@meyering.net>
>> Cc: emacs-devel@gnu.org
>> Date: Mon, 11 Apr 2011 13:08:38 +0200
>>
>> Currently, emacs_write silently ignores an invalid buffer length,
>> treating it just like a length of 0.  It'd be better not to ignore
>> such an error.
>
> emacs_write simply does nothing for negative sizes.  However, its
> callers will not silently ignore that: emacs_write returns that same
> value to the caller, and callers should (and some do) check the return
> value for being non-negative.  See, for example, write-region (whose
> debugging led to this change in the interface).
>
>> IMHO, an interface that takes a logically unsigned parameter
>> should have an unsigned type.
>
> That would be a major inconvenience, and even annoyance: in Emacs, it
> is a very frequent idiom to pass the result of subtracting two
> EMACS_INT values, because we reference buffers and strings with such
> values.  Having the argument as unsigned type would trigger warnings
> and will need explicit type casts.  And with type casts, there's the
> danger of interpreting a negative value as a large positive one.

From my experience with signedness-related warnings,
it is counterproductive to enable options like -Wsign-conversion
and -Wsign-compare, unless you're prepared to cherry-pick fixes for
the few "real" problems and to ignore the others.  Adding casts to work
around them makes the "cure" worse than the disease.

> So I think on balance, having a signed type there is better.  The fact
> that it is slightly narrower is not a problem in this case: EMACS_INT
> is already a couple of bits narrower than the size_t type, so we don't
> lose anything.
>
>> I guess I'm biased towards least-surprise for developers, so I
>> think read- and write-like functions should accept a buffer length
>> argument of type size_t, to be consistent with read and write.

What about when EMACS_INT is defined to "int"?

Someone will inevitably call your write-like function
with a length of type size_t -- many existing uses do just that --
and by using a signed type, you will have converted their long
yet valid (2-4GiB), buffer length, into a negative number.



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-11 12:27       ` Jim Meyering
@ 2011-04-11 12:31         ` David Kastrup
  2011-04-11 21:54           ` Jim Meyering
  2011-04-11 14:02         ` Eli Zaretskii
  1 sibling, 1 reply; 49+ messages in thread
From: David Kastrup @ 2011-04-11 12:31 UTC (permalink / raw)
  To: emacs-devel

Jim Meyering <jim@meyering.net> writes:

> What about when EMACS_INT is defined to "int"?
>
> Someone will inevitably call your write-like function
> with a length of type size_t -- many existing uses do just that --
> and by using a signed type, you will have converted their long
> yet valid (2-4GiB), buffer length, into a negative number.

Resulting in an error or nothing happening.  In contrast, if a negative
number is turned into a long yet valid (2-4GiB) number, it is very
likely that unintended memory areas will get stomped over.

-- 
David Kastrup




^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-11 11:40   ` Stephen J. Turnbull
@ 2011-04-11 13:58     ` Eli Zaretskii
  2011-04-12  1:16       ` Paul Eggert
  0 siblings, 1 reply; 49+ messages in thread
From: Eli Zaretskii @ 2011-04-11 13:58 UTC (permalink / raw)
  To: Stephen J. Turnbull; +Cc: jim, emacs-devel

> From: "Stephen J. Turnbull" <stephen@xemacs.org>
> Cc: Jim Meyering <jim@meyering.net>,
>     emacs-devel@gnu.org
> Date: Mon, 11 Apr 2011 20:40:54 +0900
> 
>  > We call these functions with an argument of type EMACS_INT,
> 
> If you're going to do that, why not declare it as an EMACS_INT?

I actually considered that.  The reasons I eventually decided against
it are all minor: I hate to type-cast if I can avoid that, and I try
to avoid passing to library functions data types that are different
from what the header declares.  The fact that it used to be `int' was
also a minor factor.  Finally, sysdep.c doesn't use EMACS_INT, except
in one very special case, so it looked like using standard types was
its "style".

But these are all minor, so if people prefer EMACS_INT, I don't mind.

> The problem with using external standard types is that some developers
> will proceed to "correct" them

"Correct" them to what and why?



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-11 12:27       ` Jim Meyering
  2011-04-11 12:31         ` David Kastrup
@ 2011-04-11 14:02         ` Eli Zaretskii
  1 sibling, 0 replies; 49+ messages in thread
From: Eli Zaretskii @ 2011-04-11 14:02 UTC (permalink / raw)
  To: Jim Meyering; +Cc: emacs-devel

> From: Jim Meyering <jim@meyering.net>
> Cc: emacs-devel@gnu.org
> Date: Mon, 11 Apr 2011 14:27:36 +0200
> 
> > That would be a major inconvenience, and even annoyance: in Emacs, it
> > is a very frequent idiom to pass the result of subtracting two
> > EMACS_INT values, because we reference buffers and strings with such
> > values.  Having the argument as unsigned type would trigger warnings
> > and will need explicit type casts.  And with type casts, there's the
> > danger of interpreting a negative value as a large positive one.
> 
> >From my experience with signedness-related warnings,
> it is counterproductive to enable options like -Wsign-conversion
> and -Wsign-compare, unless you're prepared to cherry-pick fixes for
> the few "real" problems and to ignore the others.

I agree, but unfortunately -Wall includes warnings about signedness,
and I sometimes use -Wall to make sure I didn't miss something
potentially bad.  And there's no way of knowing what some future
version of GCC will do to save us from ourselves ;-)

> Adding casts to work around them makes the "cure" worse than the
> disease.

Right, which is why I didn't want to use them.

> What about when EMACS_INT is defined to "int"?

David answered that: the result would be less grave than the other way
around.



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-11 12:31         ` David Kastrup
@ 2011-04-11 21:54           ` Jim Meyering
  2011-04-12  4:44             ` Eli Zaretskii
                               ` (2 more replies)
  0 siblings, 3 replies; 49+ messages in thread
From: Jim Meyering @ 2011-04-11 21:54 UTC (permalink / raw)
  To: David Kastrup; +Cc: emacs-devel

David Kastrup wrote:
> Jim Meyering <jim@meyering.net> writes:
>
>> What about when EMACS_INT is defined to "int"?
>>
>> Someone will inevitably call your write-like function
>> with a length of type size_t -- many existing uses do just that --
>> and by using a signed type, you will have converted their long
>> yet valid (2-4GiB), buffer length, into a negative number.
>
> Resulting in an error or nothing happening.  In contrast, if a negative
> number is turned into a long yet valid (2-4GiB) number, it is very
> likely that unintended memory areas will get stomped over.

Since someone mentioned a goal of being able to edit a 2GiB
file, I thought people here would be sensitive to an API policy
that renders that goal unreachable on some systems.

Of course the extra bit doesn't matter when you start with 64,
but starting with only 32, using an unsigned "length" would
avoid that unnecessary limit.



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-11 13:58     ` Eli Zaretskii
@ 2011-04-12  1:16       ` Paul Eggert
  2011-04-12  3:01         ` Eli Zaretskii
  0 siblings, 1 reply; 49+ messages in thread
From: Paul Eggert @ 2011-04-12  1:16 UTC (permalink / raw)
  To: emacs-devel

emacs_read and emacs_write are a bit of a special case,
because by design Emacs always invokes them with a
small positive size (currently limited to 64 KiB, if memory
serves).  So any signature will do; it really doesn't matter.
The old signature worked; the new one works; and making
the signature compatible with ordinary 'read' and 'write'
would also work.  Of these choices, I mildly prefer using
the 'read' and 'write' signature since that's what C
programmers would expect; and second would prefer
going back the way it used to be (using the argument that
if it wasn't broken, why fix it....).

To some extent this discussion is a proxy for what the
internal coding style of Emacs should be, for integers.
Here are some suggestions:

* When dealing with system objects, such as file descriptors
  and C object sizes, use the relevant system types,
  such as 'int' and 'size_t'.

* When dealing with Emacs fixnums, which are always signed,
  use EMACS_INT.

* Use EMACS_UINT only when EMACS_INT would yield undefined
  behavior due to integer overflow.

* If overflow is possible when calculating values of a type,
  e.g., when converting from one type to another, then
  check this at runtime.

I realize these guidelines are often violated (particularly the
last one :-), but using them would help make Emacs internals
more reliable.



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-12  1:16       ` Paul Eggert
@ 2011-04-12  3:01         ` Eli Zaretskii
  2011-04-12  5:06           ` Paul Eggert
  0 siblings, 1 reply; 49+ messages in thread
From: Eli Zaretskii @ 2011-04-12  3:01 UTC (permalink / raw)
  To: Paul Eggert; +Cc: emacs-devel

> Date: Mon, 11 Apr 2011 18:16:40 -0700
> From: Paul Eggert <eggert@cs.ucla.edu>
> 
> emacs_read and emacs_write are a bit of a special case,
> because by design Emacs always invokes them with a
> small positive size (currently limited to 64 KiB, if memory
> serves).

That's not true.  When Emacs saves a buffer or some its portion,
write-region can call emacs_write (though a_write and e_write) with
the full extent of the region to be saved.  At least with a pure-ASCII
file, that's how write-region works.  Just step through the code with
a debugger, and you will see it.

That's why I changed the signature (and the local variables inside
emacs_write): without that, you simply cannot save regions that are
larger than 2GB.  The previous version would leave the saved file
empty, if invoked on large buffers.

So it is not a theoretically motivated change; it had very specific
real reason.  The old signature simply didn't work in that case.

> * When dealing with system objects, such as file descriptors
>   and C object sizes, use the relevant system types,
>   such as 'int' and 'size_t'.
> 
> * When dealing with Emacs fixnums, which are always signed,
>   use EMACS_INT.

The issue here is that emacs_write and emacs_read are on the boundary:
they accept Emacs Lisp integers, but then call a system API.  Thus the
dilemma what to use in this case, which I resolved in favor of the
system type.  sysdep.c in general uses system types.



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-11 21:54           ` Jim Meyering
@ 2011-04-12  4:44             ` Eli Zaretskii
  2011-04-12 13:24             ` Ted Zlatanov
  2011-04-14 20:57             ` oops? read/write vs type of length parameter Michael Welsh Duggan
  2 siblings, 0 replies; 49+ messages in thread
From: Eli Zaretskii @ 2011-04-12  4:44 UTC (permalink / raw)
  To: Jim Meyering; +Cc: dak, emacs-devel

> From: Jim Meyering <jim@meyering.net>
> Date: Mon, 11 Apr 2011 23:54:06 +0200
> Cc: emacs-devel@gnu.org
> 
> Of course the extra bit doesn't matter when you start with 64,
> but starting with only 32, using an unsigned "length" would
> avoid that unnecessary limit.

I'm not sure what you are suggesting (``starting with only 32, using
an unsigned "length"''), but if this means to use a 32-bit type as the
last argument to emacs_write, then that was the bug I fixed by using
ssize_t instead of an int.  The bug prevented Emacs from saving
regions larger than INT_MAX.



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-12  3:01         ` Eli Zaretskii
@ 2011-04-12  5:06           ` Paul Eggert
  2011-04-12  5:46             ` Eli Zaretskii
                               ` (2 more replies)
  0 siblings, 3 replies; 49+ messages in thread
From: Paul Eggert @ 2011-04-12  5:06 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

On 04/11/2011 08:01 PM, Eli Zaretskii wrote:
> When Emacs saves a buffer or some its portion,
> write-region can call emacs_write (though a_write and e_write) with
> the full extent of the region to be saved.

Ah, sorry, I missed that one.  In that case 'int'
clearly won't do for the size.

> The issue here is that emacs_write and emacs_read are on the boundary:
> they accept Emacs Lisp integers, but then call a system API.

No, they are regularly passed size_t values as sizes,
in other sections of the code.  I just now counted,
and the 16 calls to emacs_write use int constants
6 times, size_t 5 times, EMACS_INT 3 times, and an
int variable once.  So, judging by the caller's usages,
size_t would seem more appropriate.

Furthermore, the API for emacs_write should be designed
to let callers know what emacs_write should do.  Since emacs_write
operates on size values that fit into size_t, that would
seem to be the more appropriate type for its size argument.



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-12  5:06           ` Paul Eggert
@ 2011-04-12  5:46             ` Eli Zaretskii
  2011-04-12  8:19             ` Paul Eggert
  2011-04-12 12:32             ` Davis Herring
  2 siblings, 0 replies; 49+ messages in thread
From: Eli Zaretskii @ 2011-04-12  5:46 UTC (permalink / raw)
  To: Paul Eggert; +Cc: emacs-devel

> Date: Mon, 11 Apr 2011 22:06:21 -0700
> From: Paul Eggert <eggert@cs.ucla.edu>
> CC: emacs-devel@gnu.org
> 
> On 04/11/2011 08:01 PM, Eli Zaretskii wrote:
> > When Emacs saves a buffer or some its portion,
> > write-region can call emacs_write (though a_write and e_write) with
> > the full extent of the region to be saved.
> 
> Ah, sorry, I missed that one.  In that case 'int'
> clearly won't do for the size.

The same is true for emacs_read, btw; see insert-file-contents.

> > The issue here is that emacs_write and emacs_read are on the boundary:
> > they accept Emacs Lisp integers, but then call a system API.
> 
> No, they are regularly passed size_t values as sizes,
> in other sections of the code.  I just now counted,
> and the 16 calls to emacs_write use int constants
> 6 times, size_t 5 times, EMACS_INT 3 times, and an
> int variable once.  So, judging by the caller's usages,
> size_t would seem more appropriate.

Granted, I reviewed all those other callers when I made my change.
All those I left at their original form are passing small values in
this argument.  In particular, all the callers that use size_t get
those values by calling strlen on short strings.  And int is obviously
not a problem at all.

So this theoretical issue has no practical consequences in the case in
point.

However, if it's deemed good practice to make them 100% bulletproof,
I'm not opposed to change them all use EMACS_INT or ssize_t, although
that would require type casts that I deplore.

> Furthermore, the API for emacs_write should be designed
> to let callers know what emacs_write should do.  Since emacs_write
> operates on size values that fit into size_t, that would
> seem to be the more appropriate type for its size argument.

As was already explained here, this has downsides that are more
serious than the current use of ssize_t (as long as programming
mistakes are concerned).  So this suggestion I do oppose.



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-12  5:06           ` Paul Eggert
  2011-04-12  5:46             ` Eli Zaretskii
@ 2011-04-12  8:19             ` Paul Eggert
  2011-04-12  9:41               ` Eli Zaretskii
  2011-04-12 12:32             ` Davis Herring
  2 siblings, 1 reply; 49+ messages in thread
From: Paul Eggert @ 2011-04-12  8:19 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Jim Meyering, emacs-devel

On looking over that code again, a couple of issues
sprang out.  First, emacs_read should act like emacs_write
with respect to sizes, but the code didn't do that.

Second, no caller should ever pass a negative size value
to either function, and callers should not rely on negative
sizes causing emacs_read and emacs_write to do nothing.
I added a runtime check for this, which I don't think
will ever fail, but I've been surprised in the past.
With that check in place we might as well use size_t for the size,
with the goal of removing the runtime checks once we have
carefully checked that they aren't needed.

Here's the patch I installed for that.

* sysdep.c (emacs_read, emacs_write): Check for negative sizes
since callers should never pass a negative size.
Change the signature to match that of plain 'read' and 'write'; see
<http://lists.gnu.org/archive/html/emacs-devel/2011-04/msg00397.html>.
* lisp.h: Update prototypes of emacs_write and emacs_read.
=== modified file 'src/lisp.h'
--- src/lisp.h	2011-04-10 20:43:08 +0000
+++ src/lisp.h	2011-04-12 08:05:04 +0000
@@ -3346,8 +3346,8 @@
 extern void seed_random (long);
 extern int emacs_open (const char *, int, int);
 extern int emacs_close (int);
-extern ssize_t emacs_read (int, char *, ssize_t);
-extern ssize_t emacs_write (int, const char *, ssize_t);
+extern ssize_t emacs_read (int, char *, size_t);
+extern ssize_t emacs_write (int, const char *, size_t);
 enum { READLINK_BUFSIZE = 1024 };
 extern char *emacs_readlink (const char *, char [READLINK_BUFSIZE]);
 #ifndef HAVE_MEMSET

=== modified file 'src/sysdep.c'
--- src/sysdep.c	2011-04-10 20:43:08 +0000
+++ src/sysdep.c	2011-04-12 08:05:09 +0000
@@ -1826,10 +1826,18 @@
 }

 ssize_t
-emacs_read (int fildes, char *buf, ssize_t nbyte)
+emacs_read (int fildes, char *buf, size_t nbyte)
 {
   register ssize_t rtnval;

+  /* Defend against the possibility that a buggy caller passes a negative NBYTE
+     argument, which would be converted to a large unsigned size_t NBYTE.  This
+     defense prevents callers from doing large writes, unfortunately.  This
+     size restriction can be removed once we have carefully checked that there
+     are no such callers.  */
+  if ((ssize_t) nbyte < 0)
+    abort ();
+
   while ((rtnval = read (fildes, buf, nbyte)) == -1
 	 && (errno == EINTR))
     QUIT;
@@ -1837,13 +1845,17 @@
 }

 ssize_t
-emacs_write (int fildes, const char *buf, ssize_t nbyte)
+emacs_write (int fildes, const char *buf, size_t nbyte)
 {
   register ssize_t rtnval, bytes_written;

+  /* Defend against negative NBYTE, as in emacs_read.  */
+  if ((ssize_t) nbyte < 0)
+    abort ();
+
   bytes_written = 0;

-  while (nbyte > 0)
+  while (nbyte != 0)
     {
       rtnval = write (fildes, buf, nbyte);





^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-12  8:19             ` Paul Eggert
@ 2011-04-12  9:41               ` Eli Zaretskii
  2011-04-12 15:53                 ` Paul Eggert
  0 siblings, 1 reply; 49+ messages in thread
From: Eli Zaretskii @ 2011-04-12  9:41 UTC (permalink / raw)
  To: Paul Eggert; +Cc: jim, emacs-devel

> Date: Tue, 12 Apr 2011 01:19:10 -0700
> From: Paul Eggert <eggert@cs.ucla.edu>
> CC: emacs-devel@gnu.org, Jim Meyering <jim@meyering.net>
> 
> I added a runtime check for this, which I don't think
> will ever fail, but I've been surprised in the past.

If it ever fails, aborting is too harsh, I think.  The original code
was well defended against that possibility, see write-region.  It
would signal an IO error.

> With that check in place we might as well use size_t for the size,

Which will cause annoying compiler warnings, at least with some
optional switches.

> with the goal of removing the runtime checks once we have
> carefully checked that they aren't needed.

Which will never happen, so these aborts will stay in the code
forever.

> Here's the patch I installed for that.

I don't understand why you went ahead and installed such a change,
although it was clear that your opinion on this is being disputed, and
at least I explicitly expressed my disagreement with changing that
argument to an unsigned type.  As long as you are not the head
maintainer, I think such unilateral actions are inappropriate.

But if I'm the only one who objects to that (both the change and
disregarding the disagreement), then so be it.



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-12  5:06           ` Paul Eggert
  2011-04-12  5:46             ` Eli Zaretskii
  2011-04-12  8:19             ` Paul Eggert
@ 2011-04-12 12:32             ` Davis Herring
  2011-04-12 13:38               ` Eli Zaretskii
  2 siblings, 1 reply; 49+ messages in thread
From: Davis Herring @ 2011-04-12 12:32 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Eli Zaretskii, emacs-devel

>> When Emacs saves a buffer or some its portion,
>> write-region can call emacs_write (though a_write and e_write) with
>> the full extent of the region to be saved.
>
> Ah, sorry, I missed that one.  In that case 'int'
> clearly won't do for the size.

This has been said several times, apparently assuming that the whole
length _must_ be passed.  But other callers call these functions with
small arguments in a loop.  Can't that be used (with perhaps more like 64
MiB than 64 KiB per call) for these large files too?

Davis

-- 
This product is sold by volume, not by mass.  If it appears too dense or
too sparse, it is because mass-energy conversion has occurred during
shipping.



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-11 21:54           ` Jim Meyering
  2011-04-12  4:44             ` Eli Zaretskii
@ 2011-04-12 13:24             ` Ted Zlatanov
  2011-04-12 13:29               ` Eli Zaretskii
  2011-04-14 20:57             ` oops? read/write vs type of length parameter Michael Welsh Duggan
  2 siblings, 1 reply; 49+ messages in thread
From: Ted Zlatanov @ 2011-04-12 13:24 UTC (permalink / raw)
  To: emacs-devel

On Mon, 11 Apr 2011 23:54:06 +0200 Jim Meyering <jim@meyering.net> wrote: 

JM> Since someone mentioned a goal of being able to edit a 2GiB
JM> file, I thought people here would be sensitive to an API policy
JM> that renders that goal unreachable on some systems.

Just to add my vote: I think Emacs should be able to open and edit 2 GB
and larger files, although IIUC the API policy discussed here doesn't
remove that possibility.

Ted




^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-12 13:24             ` Ted Zlatanov
@ 2011-04-12 13:29               ` Eli Zaretskii
  2011-04-12 14:47                 ` Ted Zlatanov
  0 siblings, 1 reply; 49+ messages in thread
From: Eli Zaretskii @ 2011-04-12 13:29 UTC (permalink / raw)
  To: Ted Zlatanov; +Cc: emacs-devel

> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Tue, 12 Apr 2011 08:24:27 -0500
> 
> On Mon, 11 Apr 2011 23:54:06 +0200 Jim Meyering <jim@meyering.net> wrote: 
> 
> JM> Since someone mentioned a goal of being able to edit a 2GiB
> JM> file, I thought people here would be sensitive to an API policy
> JM> that renders that goal unreachable on some systems.
> 
> Just to add my vote: I think Emacs should be able to open and edit 2 GB
> and larger files, although IIUC the API policy discussed here doesn't
> remove that possibility.

Of course, it doesn't!  I made that change to be able to edit a 2.7GB
file, and after the change I actually was able to do that, whereas
before the change, I couldn't save the file after editing it.



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-12 12:32             ` Davis Herring
@ 2011-04-12 13:38               ` Eli Zaretskii
  2011-04-12 15:43                 ` Paul Eggert
  0 siblings, 1 reply; 49+ messages in thread
From: Eli Zaretskii @ 2011-04-12 13:38 UTC (permalink / raw)
  To: herring; +Cc: eggert, emacs-devel

> Date: Tue, 12 Apr 2011 05:32:30 -0700 (PDT)
> From: "Davis Herring" <herring@lanl.gov>
> Cc: "Eli Zaretskii" <eliz@gnu.org>, emacs-devel@gnu.org
> Reply-To: herring@lanl.gov
> 
> >> When Emacs saves a buffer or some its portion,
> >> write-region can call emacs_write (though a_write and e_write) with
> >> the full extent of the region to be saved.
> >
> > Ah, sorry, I missed that one.  In that case 'int'
> > clearly won't do for the size.
> 
> This has been said several times, apparently assuming that the whole
> length _must_ be passed.  But other callers call these functions with
> small arguments in a loop.

AFAIK, only one caller does that, to send data to a subprocess.  Such
piecemeal writes are justified in that case, because (AFAIK) pipes are
generally limited in how much stuff they can hold, and because async
communications with other processes are relatively slow anyway.

> Can't that be used (with perhaps more like 64 MiB than 64 KiB per
> call) for these large files too?

That would just slow down Emacs's I/O for no good reason.  We should
trust the OS that it knows better how to optimize disk I/O.  OTOH,
nothing is lost by using ssize_t data type to write the whole buffer,
because Emacs cannot have buffers that large anyway: we reserve a
small number of bits for Lisp tags, and therefore the largest value we
could ever put in that argument is MOST_POSITIVE_FIXNUM, which is
always less than what ssize_t can hold.



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-12 13:29               ` Eli Zaretskii
@ 2011-04-12 14:47                 ` Ted Zlatanov
  2011-04-12 17:00                   ` Large file support (was: oops? read/write vs type of length parameter) Eli Zaretskii
  0 siblings, 1 reply; 49+ messages in thread
From: Ted Zlatanov @ 2011-04-12 14:47 UTC (permalink / raw)
  To: emacs-devel

On Tue, 12 Apr 2011 09:29:29 -0400 Eli Zaretskii <eliz@gnu.org> wrote: 

>> From: Ted Zlatanov <tzz@lifelogs.com>
>> 
>> Just to add my vote: I think Emacs should be able to open and edit 2 GB
>> and larger files, although IIUC the API policy discussed here doesn't
>> remove that possibility.

EZ> Of course, it doesn't!  I made that change to be able to edit a 2.7GB
EZ> file, and after the change I actually was able to do that, whereas
EZ> before the change, I couldn't save the file after editing it.

Wait, Emacs can edit large files now?  That is great news!  I didn't see
it in the NEWS file, thanks for doing it.  Is it documented anywhere?

Ted




^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-12 13:38               ` Eli Zaretskii
@ 2011-04-12 15:43                 ` Paul Eggert
  0 siblings, 0 replies; 49+ messages in thread
From: Paul Eggert @ 2011-04-12 15:43 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

>> Can't that be used (with perhaps more like 64 MiB than 64 KiB per
>> call) for these large files too?
> That would just slow down Emacs's I/O for no good reason. 

There may be some good reason, actually.  In some cases
there are serious bugs with Linux kernels before 2.6.16,
when an application writes huge buffers.  That what was
MAX_RW_COUNT was introduced for, inside the kernel.
Now, Linux 'write' calls never return more than that value.
This just papers over some lower-level bugs, but there it is.

With this in mind, perhaps Emacs should refuse to write more than
MAX_RW_COUNT (i.e., INT_MAX - 4095) bytes at a time, regardless
of whether the kernel itself imposes the limit, as larger writes are
likely to tickle bugs in older Linux kernels.

See, for example:

https://bugzilla.redhat.com/show_bug.cgi?format=multiple&id=612839



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-12  9:41               ` Eli Zaretskii
@ 2011-04-12 15:53                 ` Paul Eggert
  2011-04-12 16:56                   ` Eli Zaretskii
                                     ` (3 more replies)
  0 siblings, 4 replies; 49+ messages in thread
From: Paul Eggert @ 2011-04-12 15:53 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: jim, emacs-devel

On 04/12/2011 02:41 AM, Eli Zaretskii wrote:
>> Date: Tue, 12 Apr 2011 01:19:10 -0700
>> From: Paul Eggert <eggert@cs.ucla.edu>
>> CC: emacs-devel@gnu.org, Jim Meyering <jim@meyering.net>
>>
>> I added a runtime check for this, which I don't think
>> will ever fail, but I've been surprised in the past.
> 
> If it ever fails, aborting is too harsh, I think.  The original code
> was well defended against that possibility, see write-region.  It
> would signal an IO error.

If the code is well-defended against passing negative sizes,
then we can remove the checks entirely; they're not needed.
They were meant only as a stopgap in case the callers were
buggy.  (Also please see below.)

>> With that check in place we might as well use size_t for the size,
> 
> Which will cause annoying compiler warnings, at least with some
> optional switches.

Any compiler that will warn about conversions to size_t will
warn about hundreds of other places in Emacs.  These warnings
are more trouble than they're worth, and we shouldn't worry
about them.

>> with the goal of removing the runtime checks once we have
>> carefully checked that they aren't needed.
> 
> Which will never happen, so these aborts will stay in the code
> forever.

I can do it, in the next day or two.  It shouldn't be a problem.

I'm sorry if there are hurt feelings about this, but size_t is
clearly the better choice for buffer sizes; it's the universal
standard in the C API.  ssize_t is never used for that.  And this
is the sysdep.c module after all.



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-12 15:53                 ` Paul Eggert
@ 2011-04-12 16:56                   ` Eli Zaretskii
  2011-04-12 23:55                   ` Juanma Barranquero
                                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 49+ messages in thread
From: Eli Zaretskii @ 2011-04-12 16:56 UTC (permalink / raw)
  To: Paul Eggert, Stefan Monnier, Chong Yidong; +Cc: emacs-devel

> Date: Tue, 12 Apr 2011 08:53:37 -0700
> From: Paul Eggert <eggert@cs.ucla.edu>
> CC: emacs-devel@gnu.org, jim@meyering.net
> 
> If the code is well-defended against passing negative sizes,
> then we can remove the checks entirely; they're not needed.

My experience is that you can never be well defended against mixing
signed with unsigned.

Stefan and Chong, please voice your opinions about this.  I'm strongly
opposed to having the last argument of emacs_read and emacs_write be
of unsigned data type, but if you are okay with that, I'll get over
it.

> I'm sorry if there are hurt feelings about this, but size_t is
> clearly the better choice for buffer sizes; it's the universal
> standard in the C API.

Those hurt feeling would have been avoided if you waited until the
discussion is over and decision is made.



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Large file support (was: oops? read/write vs type of length parameter)
  2011-04-12 14:47                 ` Ted Zlatanov
@ 2011-04-12 17:00                   ` Eli Zaretskii
  0 siblings, 0 replies; 49+ messages in thread
From: Eli Zaretskii @ 2011-04-12 17:00 UTC (permalink / raw)
  To: Ted Zlatanov; +Cc: emacs-devel

> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Tue, 12 Apr 2011 09:47:14 -0500
> 
> EZ> Of course, it doesn't!  I made that change to be able to edit a 2.7GB
> EZ> file, and after the change I actually was able to do that, whereas
> EZ> before the change, I couldn't save the file after editing it.
> 
> Wait, Emacs can edit large files now?  That is great news!  I didn't see
> it in the NEWS file, thanks for doing it.  Is it documented anywhere?

First, this is only possible on a 64-bit host.  And second, we are not
there yet, although we are close, as the most basic editing already
works.  However, several bugs still remain, and I hope to work on
squashing them soon.  When it's done, there surely will be an
announcement here.

As for documentation, there's nothing to document, because we don't
document bugs we fixed ;-)




^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-12 15:53                 ` Paul Eggert
  2011-04-12 16:56                   ` Eli Zaretskii
@ 2011-04-12 23:55                   ` Juanma Barranquero
  2011-04-13  5:14                   ` Paul Eggert
  2011-04-15  1:29                   ` Stefan Monnier
  3 siblings, 0 replies; 49+ messages in thread
From: Juanma Barranquero @ 2011-04-12 23:55 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Eli Zaretskii, jim, emacs-devel

On Tue, Apr 12, 2011 at 17:53, Paul Eggert <eggert@cs.ucla.edu> wrote:

> I'm sorry if there are hurt feelings about this,

It's not a question of hurt feelings, but about waiting for the
discussion to reach some kind of consensus.

> but size_t is
> clearly the better choice for buffer sizes;

If it were "clearly the better choice" there would be no opposition.

    Juanma



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-12 15:53                 ` Paul Eggert
  2011-04-12 16:56                   ` Eli Zaretskii
  2011-04-12 23:55                   ` Juanma Barranquero
@ 2011-04-13  5:14                   ` Paul Eggert
  2011-04-13  6:31                     ` Jim Meyering
                                       ` (3 more replies)
  2011-04-15  1:29                   ` Stefan Monnier
  3 siblings, 4 replies; 49+ messages in thread
From: Paul Eggert @ 2011-04-13  5:14 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: jim, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1852 bytes --]

On 04/12/2011 08:53 AM, Paul Eggert wrote:
> I can do it, in the next day or two.

OK, I did it, by auditing all uses of emacs_read and emacs_write to
find and fix all negative buffer sizes.  However, instead of negative
sizes I found overflows and some other issues, described below; and I
therefore now agree with Eli that emacs_write shouldn't use the same
signature as POSIX 'write'.  However, I also agree with Jim that
emacs_write's API need not artificially limit buffer sizes to SSIZE_MAX.

Here's the key idea of the attached patch.  Since
"N = emacs_write (FD, BUF, SIZE)" returns N if successful, and some
value less than N if it fails, the caller can easily determine whether
it succeeded by testing N == SIZE.  So on failure there's no need for
emacs_write to return a special value like -1.

With this in mind, emacs_write can simply return the number of bytes
written, which will always be a size_t value.  Jim, this idea is
already used by gnulib's full_write function, so there's good
precedent for departing from the POSIX signature here.  And Eli, this
avoids the problem where the size is so large that a signed return
value would overflow and become negative.

Here are some other bugs in Emacs that I found as part of this audit:

 * Several places in sound.c attempt to stuff a size into an 'int',
   which doesn't work.  This problem is independent of the above, and
   its fix is independent of the above too, so it's split into a
   separate patch in the attached bundle.

 * send_process attempts to stuff a size into an 'int' as well.  This
   can be fixed by changing one of its local variables from int to size_t.

 * emacs_gnutls_write has the same issues as emacs_write, and should
   be fixed the same way.

Attached is a bzr bundle of two proposed patches to fix the problem as
described above.  Comments are welcome.

[-- Attachment #2: patch-emacs-write.txt --]
[-- Type: text/plain, Size: 14688 bytes --]

# Bazaar merge directive format 2 (Bazaar 0.90)
# revision_id: eggert@cs.ucla.edu-20110413050254-d2prvw3s5s4hua1b
# target_branch: bzr+ssh://eggert@bzr.savannah.gnu.org/emacs/trunk
# testament_sha1: 3eeb60f5950d82650e3b759001afaed50f02f05d
# timestamp: 2011-04-12 22:02:59 -0700
# base_revision_id: lekktu@gmail.com-20110413021642-0n12owb02iz2xwiq
# 
# Begin patch
=== modified file 'src/ChangeLog'
--- src/ChangeLog	2011-04-12 10:20:32 +0000
+++ src/ChangeLog	2011-04-13 05:02:54 +0000
@@ -1,3 +1,27 @@
+2011-04-13  Paul Eggert  <eggert@cs.ucla.edu>
+
+	emacs_write: Return size_t, not ssize_t, to avoid overflow issues.
+	* gnutls.c, gnutls.h (emacs_gnutls_write): Return size_t, not ssize_t.
+	* sysdep.c, lisp.h (emacs_write): Likewise.
+	Without the above change, emacs_gnutls_write and emacs_write had
+	undefined behavior and would typically mistakenly report an error
+	when writing a buffer whose size exceeds SSIZE_MAX.
+	(emacs_read, emacs_write): Remove check for negative size, as the
+	Emacs source code has been audited now.
+	(emacs_write): Adjust to new signature, making the code look more
+	like that of emacs_gnutls_write.
+	* process.c (send_process): Adjust to the new signatures of
+	emacs_write and emacs_gnutls_write.  Do not attempt to store
+	a byte offset into an 'int'; it might overflow.
+
+	* sound.c: Don't assume sizes fit in 'int'.
+	(struct sound_device.period_size, alsa_period_size):
+	Return size_t, not int.
+	(struct sound_device.write, vox_write, alsa_write):
+	Accept size_t, not int.
+	(wav_play, au_play): Use size_t to store sizes, and ssize_t to
+	record read return values.
+
 2011-04-12  Andreas Schwab  <schwab@linux-m68k.org>
 
 	* charset.c (Fclear_charset_maps): Use xfree instead of free.

=== modified file 'src/gnutls.c'
--- src/gnutls.c	2011-04-10 14:00:13 +0000
+++ src/gnutls.c	2011-04-13 05:02:54 +0000
@@ -70,7 +70,7 @@
     }
 }
 
-ssize_t
+size_t
 emacs_gnutls_write (int fildes, struct Lisp_Process *proc, const char *buf,
                     size_t nbyte)
 {
@@ -85,7 +85,7 @@
 #ifdef EAGAIN
     errno = EAGAIN;
 #endif
-    return -1;
+    return 0;
   }
 
   bytes_written = 0;
@@ -99,7 +99,7 @@
           if (rtnval == GNUTLS_E_AGAIN || rtnval == GNUTLS_E_INTERRUPTED)
             continue;
           else
-            return (bytes_written ? bytes_written : -1);
+            break;
         }
 
       buf += rtnval;

=== modified file 'src/gnutls.h'
--- src/gnutls.h	2011-04-10 14:00:13 +0000
+++ src/gnutls.h	2011-04-13 05:02:54 +0000
@@ -50,7 +50,7 @@
 
 #define GNUTLS_LOG2(level, max, string, extra) if (level <= max) { gnutls_log_function2 (level, "(Emacs) " string, extra); }
 
-ssize_t
+size_t
 emacs_gnutls_write (int fildes, struct Lisp_Process *proc, const char *buf,
                     size_t nbyte);
 ssize_t

=== modified file 'src/lisp.h'
--- src/lisp.h	2011-04-12 08:12:01 +0000
+++ src/lisp.h	2011-04-13 05:02:54 +0000
@@ -3347,7 +3347,7 @@
 extern int emacs_open (const char *, int, int);
 extern int emacs_close (int);
 extern ssize_t emacs_read (int, char *, size_t);
-extern ssize_t emacs_write (int, const char *, size_t);
+extern size_t emacs_write (int, const char *, size_t);
 enum { READLINK_BUFSIZE = 1024 };
 extern char *emacs_readlink (const char *, char [READLINK_BUFSIZE]);
 #ifndef HAVE_MEMSET

=== modified file 'src/process.c'
--- src/process.c	2011-04-10 14:00:13 +0000
+++ src/process.c	2011-04-13 05:02:54 +0000
@@ -5367,6 +5367,7 @@
 	  /* Send this batch, using one or more write calls.  */
 	  while (this > 0)
 	    {
+	      size_t written = 0;
 	      int outfd = p->outfd;
 	      old_sigpipe = (void (*) (int)) signal (SIGPIPE, send_process_trap);
 #ifdef DATAGRAM_SOCKETS
@@ -5375,7 +5376,9 @@
 		  rv = sendto (outfd, buf, this,
 			       0, datagram_address[outfd].sa,
 			       datagram_address[outfd].len);
-		  if (rv < 0 && errno == EMSGSIZE)
+		  if (0 <= rv)
+		    written = rv;
+		  else if (errno == EMSGSIZE)
 		    {
 		      signal (SIGPIPE, old_sigpipe);
 		      report_file_error ("sending datagram",
@@ -5387,12 +5390,13 @@
 		{
 #ifdef HAVE_GNUTLS
 		  if (XPROCESS (proc)->gnutls_p)
-		    rv = emacs_gnutls_write (outfd,
-					     XPROCESS (proc),
-					     buf, this);
+		    written = emacs_gnutls_write (outfd,
+						 XPROCESS (proc),
+						 buf, this);
 		  else
 #endif
-		    rv = emacs_write (outfd, buf, this);
+		    written = emacs_write (outfd, buf, this);
+		  rv = (written == this ? 0 : -1);
 #ifdef ADAPTIVE_READ_BUFFERING
 		  if (p->read_output_delay > 0
 		      && p->adaptive_read_buffering == 1)
@@ -5419,7 +5423,7 @@
 		       that may allow the program
 		       to finish doing output and read more.  */
 		    {
-		      int offset = 0;
+		      size_t offset = 0;
 
 #ifdef BROKEN_PTY_READ_AFTER_EAGAIN
 		      /* A gross hack to work around a bug in FreeBSD.
@@ -5465,16 +5469,14 @@
 							 offset);
 		      else if (STRINGP (object))
 			buf = offset + SSDATA (object);
-
-		      rv = 0;
 		    }
 		  else
 		    /* This is a real error.  */
 		    report_file_error ("writing to process", Fcons (proc, Qnil));
 		}
-	      buf += rv;
-	      len -= rv;
-	      this -= rv;
+	      buf += written;
+	      len -= written;
+	      this -= written;
 	    }
 	}
     }

=== modified file 'src/sound.c'
--- src/sound.c	2011-04-12 10:20:32 +0000
+++ src/sound.c	2011-04-13 03:22:40 +0000
@@ -235,11 +235,11 @@
 
   /* Return a preferred data size in bytes to be sent to write (below)
      each time.  2048 is used if this is NULL.  */
-  int (* period_size) (struct sound_device *sd);
+  size_t (* period_size) (struct sound_device *sd);
 
   /* Write NYBTES bytes from BUFFER to device SD.  */
   void (* write) (struct sound_device *sd, const char *buffer,
-                  int nbytes);
+                  size_t nbytes);
 
   /* A place for devices to store additional data.  */
   void *data;
@@ -291,7 +291,7 @@
 static void vox_close (struct sound_device *sd);
 static void vox_choose_format (struct sound_device *, struct sound *);
 static int vox_init (struct sound_device *);
-static void vox_write (struct sound_device *, const char *, int);
+static void vox_write (struct sound_device *, const char *, size_t);
 static void find_sound_type (struct sound *);
 static u_int32_t le2hl (u_int32_t);
 static u_int16_t le2hs (u_int16_t);
@@ -600,9 +600,9 @@
   else
     {
       char *buffer;
-      int nbytes = 0;
-      int blksize = sd->period_size ? sd->period_size (sd) : 2048;
-      int data_left = header->data_length;
+      ssize_t nbytes = 0;
+      size_t blksize = sd->period_size ? sd->period_size (sd) : 2048;
+      size_t data_left = header->data_length;
 
       buffer = (char *) alloca (blksize);
       lseek (s->fd, sizeof *header, SEEK_SET);
@@ -690,9 +690,9 @@
 	       SBYTES (s->data) - header->data_offset);
   else
     {
-      int blksize = sd->period_size ? sd->period_size (sd) : 2048;
+      size_t blksize = sd->period_size ? sd->period_size (sd) : 2048;
       char *buffer;
-      int nbytes;
+      ssize_t nbytes;
 
       /* Seek */
       lseek (s->fd, header->data_offset, SEEK_SET);
@@ -895,10 +895,9 @@
 /* Write NBYTES bytes from BUFFER to device SD.  */
 
 static void
-vox_write (struct sound_device *sd, const char *buffer, int nbytes)
+vox_write (struct sound_device *sd, const char *buffer, size_t nbytes)
 {
-  ssize_t nwritten = emacs_write (sd->fd, buffer, nbytes);
-  if (nwritten < 0)
+  if (emacs_write (sd->fd, buffer, nbytes) != nbytes)
     sound_perror ("Error writing to sound device");
 }
 
@@ -953,7 +952,7 @@
     alsa_sound_perror (file, err);
 }
 
-static int
+static size_t
 alsa_period_size (struct sound_device *sd)
 {
   struct alsa_params *p = (struct alsa_params *) sd->data;
@@ -1156,13 +1155,13 @@
 /* Write NBYTES bytes from BUFFER to device SD.  */
 
 static void
-alsa_write (struct sound_device *sd, const char *buffer, int nbytes)
+alsa_write (struct sound_device *sd, const char *buffer, size_t nbytes)
 {
   struct alsa_params *p = (struct alsa_params *) sd->data;
 
   /* The the third parameter to snd_pcm_writei is frames, not bytes. */
   int fact = snd_pcm_format_size (sd->format, 1) * sd->channels;
-  int nwritten = 0;
+  size_t nwritten = 0;
   int err;
 
   while (nwritten < nbytes)

=== modified file 'src/sysdep.c'
--- src/sysdep.c	2011-04-12 08:12:01 +0000
+++ src/sysdep.c	2011-04-13 05:02:54 +0000
@@ -1825,41 +1825,36 @@
   return rtnval;
 }
 
+/* Read from FILEDESC to a buffer BUF with size NBYTE, retrying if interrupted.
+   Return the number of bytes read, which might be less than NBYTE.
+   On error, set errno and return -1.  */
 ssize_t
 emacs_read (int fildes, char *buf, size_t nbyte)
 {
   register ssize_t rtnval;
 
-  /* Defend against the possibility that a buggy caller passes a negative NBYTE
-     argument, which would be converted to a large unsigned size_t NBYTE.  This
-     defense prevents callers from doing large writes, unfortunately.  This
-     size restriction can be removed once we have carefully checked that there
-     are no such callers.  */
-  if ((ssize_t) nbyte < 0)
-    abort ();
-
   while ((rtnval = read (fildes, buf, nbyte)) == -1
 	 && (errno == EINTR))
     QUIT;
   return (rtnval);
 }
 
-ssize_t
+/* Write to FILEDES from a buffer BUF with size NBYTE, retrying if interrupted
+   or if a partial write occurs.  Return the number of bytes written, setting
+   errno if this is less than NBYTE.  */
+size_t
 emacs_write (int fildes, const char *buf, size_t nbyte)
 {
-  register ssize_t rtnval, bytes_written;
-
-  /* Defend against negative NBYTE, as in emacs_read.  */
-  if ((ssize_t) nbyte < 0)
-    abort ();
+  ssize_t rtnval;
+  size_t bytes_written;
 
   bytes_written = 0;
 
-  while (nbyte != 0)
+  while (nbyte > 0)
     {
       rtnval = write (fildes, buf, nbyte);
 
-      if (rtnval == -1)
+      if (rtnval < 0)
 	{
 	  if (errno == EINTR)
 	    {
@@ -1871,13 +1866,14 @@
 	      continue;
 	    }
 	  else
-	    return (bytes_written ? bytes_written : -1);
+	    break;
 	}
 
       buf += rtnval;
       nbyte -= rtnval;
       bytes_written += rtnval;
     }
+
   return (bytes_written);
 }
 

# Begin bundle
IyBCYXphYXIgcmV2aXNpb24gYnVuZGxlIHY0CiMKQlpoOTFBWSZTWVgbf/MACULfgFBxcP////+n
3vC////wYA/8AHwERVAAAAAEIAAoUAAFACgARwDCMJpiGAQDIAYRpkyYRgIaHAMIwmmIYBAMgBhG
mTJhGAhocAwjCaYhgEAyAGEaZMmEYCGglPUiTCNCMQnpNTxTZpT1NNqZDEDJkaNNqeiepocAwjCa
YhgEAyAGEaZMmEYCGgVRCAIAQEyZEaegQQxDRHqDR6j1DCZIKiEJFhAwijCEGc2bM+zkqzMy64i1
5PUzPr2z7/27em3Kxnx8Vefn8Wm83t95+6aDYMWRDVqxEsk6cTG/2LqrWqdFK10RgZJjeP310LG5
sNCKW34WGTRJHDZrprpR3bNeHDfjOGciLUCFIgqcVY1ZhXGlJx057JGcKWi425TTgmGE1Yupde4X
xONxsy3cKGt4uOIgjjuQVhEUjUbUrpZslsmZg4PgFvB7hR05mP0pMe4bk+x0GIkxk48x3dQGAHJE
IsKAqkCqpAsHZy9luML/QEcPhmDZgayknNk5bOSzmXmrhS+6+21i9F57psSjGF0TeKayS1tyTbbB
hkssymSQvSrZZNWxitluWtbb6Xspa2NG9JauWa0ckykpOSVtS5PAfCdug+W+TxSS7XtpS0uZ6mKj
q0jH9bMGTB9tkUZqK/HDH0eWb4Fm5UpVuSsS2LK6vqUXaf9FJZrZpJRYpu4VXMV1qLK3aRghm/WP
7sfdxZsDTNTfwcNrOTclt/s+bKuqiXvvb3B7xWN99G5tZJWG1VdOi9WKt9yq2Q58YuY7GasXoY5a
vRK9wN7ae/B1XqWUVymZKm1UbV+xU4FzaF7BK50Xsm3pqvY/xqqNWDbS0beb+az1YLtGMXdPUPi6
t1Nx1dba04dW2WYwA3HluVCtNT1zNmmAHJMQBvhvBqEOzsmZmICZVZ+vgwPVJvvJnNMkUEjoJq4j
v+Yz5/TMvkZ0xBf2ZgwEzFTdjRZLaUedhwU5ywfQuXsePZG1/YeMo2IeB4KIl9kWWUihEfal5RwK
GThshq2NgUMG56iP7tz2EOp+vy/ji7uuk0tandvwuwYTN9d+Flthpkve9hJEkJL4648sAFSwWE8x
eTAAvF5OwenZ2dMFKURYtO/jX3YwhWlHGUUmI+FKFIl5GItQPGxYnVccdfHYYBEzStYTiMwQWiVd
KQiFa346RaLkriV8ahK5R9DbGC4j010Z/z0RBjiQuXUX6skEm/7tNjFfg7qP313tq0NWCwjT3sCM
xOMOUOJksvcHJxWluYXLUaNWjlGptlBWWClG2lqjTlROWS8ZTRS5yVOWkWcGCIPznxOPovjQZxEG
G4hiQuMDFydLcGx2JVYb9W6qlOFOPPJ3Rk43xpHNJEZcCFzkNi56RkyjiRewNfJzatna2q6divKJ
ia9SendTbIbHKl70x5kOxqrphqtEuQinZ3sXbx7mLC9vd+NasWnPuaqu7O6lYmbcFdnDbhfktOEc
mJVh3ENFFDxlEGq9gQouRxzujnbRwYwMYtgRsq10MwrluMjG5g4xkuuZ69uTlpo1Z46rOrVaEYc8
nKHNjC+OsOMeZH+I3Rz3Xc62jGU8Zm/SVKYVwujC6kB2sRSTVQhMIook14qGTqclM6Ku67GnZezx
bJ2OiksYym5Vba1XRqyXEy6mhwMF9WDuVvbcVz0THuCedt879bquG1pjuuC7dkRjHfx3tLEYaO3G
zREFmJCykXaMrsWJjCE9zYvbtt7RjeyWtnfE7e+mKcG54xk4ZXks243oZK7mje7VGlsFfoDydUfB
1b+rbLay5W40ral+2k2qoHc4uaOLTQrFs7au/JorbpNM1EMmTLC2GAURibJnJdScbMeS1t1m54uu
rq9bDVuf4IdgfeQ06cuVZ1pjnduu4cpxXItxzzpSmS2LJW5ZSLla3d12MdWmrv525tLmGbs2Vwls
dbTS6212sWvDibHa64hwcZwbm/g48noeoj/Bhu35dc6Z87uum1d0tc+m/p02KOGTbdy2rsTzii6J
lfVozylTs2Y3Rwr24dfYkimkbG5siqzi3zcuIW13KTHNPGtOOe+rRTC11zHgxumJpOjY5s2OBk0w
yaZUVZM16zsSuSc8KNsw2qKYNyV9W7W9MacGUX8VGC7Y1btXAoqvyraLY436X2sui667Ahv2ZJpk
Zb8V7c4ENFaHBnsa6J2xz55XzOC7DCObmppvqxUasOMaWyZczJdodTVveBGxyauj7DjEm5J1JLs2
vFQUTwcW1VDkwTgpGxhFkLlSJVcFL0MoRhkl+zT0eeNKVKymvsv87Kdn1ucHWQMH3sez9Zint6zZ
uNoSsMInMTvEpDRVVWAW45fpPQBS+owTF8AFI5jzx/Z6d+0MxMAkKCknwD0evt0FCwP++QXdxXtP
bkKRjj/pQx9ox8ShlPmZ6B3v8PnO+YxxkjLccHfQpU+YfkSXXU/umxvjB5j5QcXtTDzEoj8j2PuY
MI+zAVgkYQgsfpjNEUSKJiNhv30+uA/hi/FqQ2hZ9z74/F/Bg3tsP6fyf7Nz8/9CI3N6Nj/NSNuW
+W7Y/pCM42sW5V2/0mdW91Ss7xgq4JaJgoR3coiFFTSkJd6WD9q5kSfTHWxVd5xPL8YiL49XR3uj
e+r+bveCz86xfkvf6fv81mMdy9R/BVntaaucOHc84vzXMnB/X+v/10R+r2EPoI9dkylE9z/l9MCL
lDs/Sh7hI7XUl2Xx1+KvWueN6+9TxVZL2qrwup9bb615uSa/D5KL4i/a8WHU8VOt+b/JRHyfqYvZ
mRi3QGB1I9ihgwf3gJQi8XOHuWvmW8pSM3NayXvWou+rfExK+JhW0Ku15dF7xwxvjUO/1NXk9brb
ohntIouaNzFBZD1Qdf9tjwYb3U3vY5KMzzaukMn8RvdbnH2oep7SromA1kh3Hz8Ey+3xl9aTqKKS
XKHoWVWXC5tce2i8tFmInkMDtLMFVARE8xUpGRUzBmA8J4+7jjlb2nKuXftx/bnOdprTB7kreWnk
+PpvbMGCeLQdQ8q8aFJhRq5iYRfCDc/l1w17lGrbFllmvtddGEzldGx8YCSGfT3snAhk6+2jV4tj
vdN7sfN3RKkJUiJRT9IxXL41rxfLxo+XP4qngfI+IgiPp8FCgJTTynAaUi8+gmDoK9JzEG4hEd0J
et63WmFntzYGf2x84LcuUKOhBZTFsUiKPmUfJwb4j0LNhimXQS3iUR6zm3MEHblG9L5NpumhMUeL
ybHZ6/LlLywe/BQheveLF6JzZPdewpxe2kYPJDIhMRvQ3xHCPoHQdQ3pdp20h7Us4uqtEYovUiyX
UlMIk7SDwfvi6A2x90Semx6Uomk/bBn8ju7VEUKxWntIYqqv9UiqryjYohXnc/YWhKXGPELOwQxX
RZPV+0ij0fEfK7g74/Y6IJYuEffJyQ98dReR4P93rjb5CNDOPjCKuq0fYs4x+p9uInV9LxZsIN6I
63jG6M5UVRGULOTeo8DKIHRi/wrCMHJaPnGxERGknM6Mz6SNY734KIi5ZKXqhFHnGbuj8C3wicIv
GxEYKC2UB5mDCCM+6PKnW4P3er4LPf8J2/Ne2Kt5g8TviC0LclEvc7R9b6nzYIR8Y0eyGw0HW7Ex
K6HsaqMl69Ci9VjWtKIqlL9KxCrBP3lD8IqUQyGp8KRGUQ2Q0hEjBKizBJSIXC7fG9RrF5HuIYmM
LkggIJChFTLEBQ1iRSdjrQUU94kaENL5iMIPMpWD5uw+EbVy8ZsUR0VMF0LMtiusIvT2TMvuXQ+i
Ko+Tght5vdERZ2vW0bTeymZ+MZRzOyGkMX7lxGXrSJImyH7ino5uD0VNHPy5WYvp907HEMm1k5kO
+Nr2F6kJUmKTETCOt0GColsbU3PT6vSM0QuYJj8pb16iGponSSOxEfZMBaGS0oecdY9fkyqtF4vm
MEwG2XbrbMYKxiFkWc8J+xkqqmPY6llYUWpMozrHnEpiL4D8A94X1iGc3SSkSiVaQvdG5+UTvL3l
DDe/4bmDLplKfhUmUdQqXoij7kqkwlIbYk8orpT5zNJiSUgkENRxMCwdHS1CMsKDnEMVoizue08C
q9sS3wjA5DaV0s1pvPVOyjLnR97rfHCNSWG5cUUVRkm9PreAwOKzJcoonkSmUJJlOMZEtYCw/AKr
0XRHhMRKYQsxomzriItgvQ1cg8Gxm+2yI7ZKtjfuol1m25ZlEFzA2JQ3QKREjktUi9hCLeqKlCPx
jF7UR4IiJR9kfeRc6iLo3Rk7Qr84DcLN/VM/FCsdHg5r1YR8fZXJZsDv/qVMYQ+uL40VdT4ELzuW
RQ7RelESvbJYhcuJPuDo9ysbXm455nZGajCOhCjlHEj2wiskdiUUcZ+cbqRae4h6jqjybDa7SJiD
cRtjreo7Sh6y+KOs/J0eEM4RexNsTVvKwM4lBHmXmmP/F3JFOFCQWBt/8w==

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-13  5:14                   ` Paul Eggert
@ 2011-04-13  6:31                     ` Jim Meyering
  2011-04-13  6:37                     ` Eli Zaretskii
                                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 49+ messages in thread
From: Jim Meyering @ 2011-04-13  6:31 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Eli Zaretskii, emacs-devel

Paul Eggert wrote:
> On 04/12/2011 08:53 AM, Paul Eggert wrote:
>> I can do it, in the next day or two.
>
> OK, I did it, by auditing all uses of emacs_read and emacs_write to
> find and fix all negative buffer sizes.  However, instead of negative
> sizes I found overflows and some other issues, described below; and I
> therefore now agree with Eli that emacs_write shouldn't use the same
> signature as POSIX 'write'.  However, I also agree with Jim that
> emacs_write's API need not artificially limit buffer sizes to SSIZE_MAX.
>
> Here's the key idea of the attached patch.  Since
> "N = emacs_write (FD, BUF, SIZE)" returns N if successful, and some
> value less than N if it fails, the caller can easily determine whether
> it succeeded by testing N == SIZE.  So on failure there's no need for
> emacs_write to return a special value like -1.
>
> With this in mind, emacs_write can simply return the number of bytes
> written, which will always be a size_t value.  Jim, this idea is
> already used by gnulib's full_write function, so there's good
> precedent for departing from the POSIX signature here.  And Eli, this
> avoids the problem where the size is so large that a signed return
> value would overflow and become negative.
>
> Here are some other bugs in Emacs that I found as part of this audit:
>
>  * Several places in sound.c attempt to stuff a size into an 'int',
>    which doesn't work.  This problem is independent of the above, and
>    its fix is independent of the above too, so it's split into a
>    separate patch in the attached bundle.
>
>  * send_process attempts to stuff a size into an 'int' as well.  This
>    can be fixed by changing one of its local variables from int to size_t.
>
>  * emacs_gnutls_write has the same issues as emacs_write, and should
>    be fixed the same way.
>
> Attached is a bzr bundle of two proposed patches to fix the problem as
> described above.  Comments are welcome.

That looks like a fine change.
Thanks for doing the work.
The improved ease of use makes this change worthwhile to me.
No more hassling with a mix of signed and unsigned lengths,
and thus less risk of misuse and fewer compiler warnings.



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-13  5:14                   ` Paul Eggert
  2011-04-13  6:31                     ` Jim Meyering
@ 2011-04-13  6:37                     ` Eli Zaretskii
  2011-04-13  8:15                       ` Paul Eggert
  2011-04-13  6:49                     ` Eli Zaretskii
  2011-04-13 14:35                     ` Ted Zlatanov
  3 siblings, 1 reply; 49+ messages in thread
From: Eli Zaretskii @ 2011-04-13  6:37 UTC (permalink / raw)
  To: Paul Eggert; +Cc: jim, emacs-devel

> Date: Tue, 12 Apr 2011 22:14:48 -0700
> From: Paul Eggert <eggert@cs.ucla.edu>
> CC: jim@meyering.net, emacs-devel@gnu.org
> 
> However, I also agree with Jim that
> emacs_write's API need not artificially limit buffer sizes to SSIZE_MAX.

Emacs cannot have buffers (or any other streams of bytes) that are
larger than SSIZE_MAX, because a small number of bits is reserved for
the Lisp tags.  So complicating the users of emacs_write in order to
support such large writes is pointless, and targets theoretical cases
that cannot happen in Emacs.

Emacs is not a general-purpose library.  It doesn't have to cater to
use cases that can never happen without radical changes in basic
architecture and data types, that would mean a complete rewrite of the
entire package.

> Here's the key idea of the attached patch.  Since
> "N = emacs_write (FD, BUF, SIZE)" returns N if successful, and some
> value less than N if it fails, the caller can easily determine whether
> it succeeded by testing N == SIZE.  So on failure there's no need for
> emacs_write to return a special value like -1.
> 
> With this in mind, emacs_write can simply return the number of bytes
> written, which will always be a size_t value.

This will not allow us support situations where a call to emacs_write
or similar interfaces (such as emacs_gnutls_write) can legitimately
write only a portion of what it was required to.  The current code
handles this situation (by looping for what's left to write), while
your suggested code will treat that as a fatal error.  Are we sure
that all such cases are irrecoverable errors?  If not, and if this
patch is accepted, should we perhaps test more pertinent values of
errno than what we do now, to avoid failing unnecessarily?

>  * Several places in sound.c attempt to stuff a size into an 'int',
>    which doesn't work.  This problem is independent of the above, and
>    its fix is independent of the above too, so it's split into a
>    separate patch in the attached bundle.

I didn't touch those parts because I don't know enough about the
underlying sound APIs.  Someone who does should review the proposed
changes and verify that they don't break sound support with those
devices.  We cannot assume that a given hardware device is 64-bit- and
signedness-clean as it should be.



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-13  5:14                   ` Paul Eggert
  2011-04-13  6:31                     ` Jim Meyering
  2011-04-13  6:37                     ` Eli Zaretskii
@ 2011-04-13  6:49                     ` Eli Zaretskii
  2011-04-13 14:35                     ` Ted Zlatanov
  3 siblings, 0 replies; 49+ messages in thread
From: Eli Zaretskii @ 2011-04-13  6:49 UTC (permalink / raw)
  To: Paul Eggert; +Cc: jim, emacs-devel

> Date: Tue, 12 Apr 2011 22:14:48 -0700
> From: Paul Eggert <eggert@cs.ucla.edu>
> CC: jim@meyering.net, emacs-devel@gnu.org
> 
> And Eli, this avoids the problem where the size is so large that a
> signed return value would overflow and become negative.

The concern was not only about that.  It was also about a mistaken
call to emacs_write with a negative value in the last argument.  Using
size_t there will cause that negative value to appear as a large
positive value within emacs_write, and will potentially crash Emacs
because the call to write will try to reference memory outside of the
Emacs address space.

That danger still exists with your proposed version of emacs_write,
AFAICS.  At the very least, we should have at the beginning of
emacs_write something like this:

  if ((ssize_t) nbyte < 0)
    return nbyte;



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-13  6:37                     ` Eli Zaretskii
@ 2011-04-13  8:15                       ` Paul Eggert
  2011-04-13  9:46                         ` Eli Zaretskii
  0 siblings, 1 reply; 49+ messages in thread
From: Paul Eggert @ 2011-04-13  8:15 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: jim, emacs-devel

On 04/12/2011 11:37 PM, Eli Zaretskii wrote:

> The current code handles this situation (by looping for what's left
> to write), while your suggested code will treat that as a fatal error.

No, the suggested code also loops for what's left to write.
Perhaps you misread the code? (or am I misunderstanding your comment?)

> Emacs cannot have buffers (or any other streams of bytes) that are
> larger than SSIZE_MAX, because a small number of bits is reserved for
> the Lisp tags.

That kind of argument violates abstraction boundaries: sysdep.c
is supposed to be about system things, and it's not supposed to
rely on assumptions about Emacs Lisp internals.  There may be a
few violations of these abstraction boundaries now, but let's
not make this worse.

For example, it would be a fairly small change to make Emacs buildable on
a machine with 32-bit pointers and 64-bit EMACS_INT.  And this would
have real advantages: on 32-bit hosts it would remove the arbitrary
(and really annoying :-) 256 MiB limit on editable files.  But it
would mean that Emacs *could* have buffers larger than SSIZE_MAX.

emacs_write should not stand in the way of plausible improvements
such as this one.

> It was also about a mistaken
> call to emacs_write with a negative value in the last argument...
> 
> That danger still exists with your proposed version of emacs_write,

No it doesn't, because I've carefully audited all the Emacs code
(which is how I found all those *other* bugs :-).  With the
proposed change, emacs_write is never called with a negative
argument.



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-13  8:15                       ` Paul Eggert
@ 2011-04-13  9:46                         ` Eli Zaretskii
  2011-04-13 16:06                           ` Paul Eggert
  0 siblings, 1 reply; 49+ messages in thread
From: Eli Zaretskii @ 2011-04-13  9:46 UTC (permalink / raw)
  To: Paul Eggert; +Cc: jim, emacs-devel

> Date: Wed, 13 Apr 2011 01:15:01 -0700
> From: Paul Eggert <eggert@cs.ucla.edu>
> CC: jim@meyering.net, emacs-devel@gnu.org
> 
> On 04/12/2011 11:37 PM, Eli Zaretskii wrote:
> 
> > The current code handles this situation (by looping for what's left
> > to write), while your suggested code will treat that as a fatal error.
> 
> No, the suggested code also loops for what's left to write.
> Perhaps you misread the code? (or am I misunderstanding your comment?)

I was talking about the calls in process.c.  It says now

		    if (XPROCESS (proc)->gnutls_p)
		      rv = emacs_gnutls_write (outfd,
					       XPROCESS (proc),
					       buf, this);
		    else
		      rv = emacs_write (outfd, buf, this);
		      ...
		if (rv < 0)
		  {
		    if (errno == EWOULDBLOCK || errno == EAGAIN)
		      {
		        ...
			rv = 0;
		      }
		    else
		      /* This is a real error.  */
		      report_file_error ("writing to process", Fcons (proc, Qnil));
		  }
		buf += rv;
		len -= rv;
		this -= rv;

and it continues looping if rv is positive.  With your change, it
looks like this:

		    if (XPROCESS (proc)->gnutls_p)
		      written = emacs_gnutls_write (outfd,
					       XPROCESS (proc),
					       buf, this);
		    else
		      written = emacs_write (outfd, buf, this);
		    rv = (written == this ? 0 : -1);
		      ...
		if (rv < 0)
		  {
		    if (errno == EWOULDBLOCK || errno == EAGAIN)
		      {
		        ...
			rv = 0;
		      }
		    else
		      /* This is a real error.  */
		      report_file_error ("writing to process", Fcons (proc, Qnil));
		  }
		buf += rv;
		len -= rv;
		this -= rv;

and it will (with enough luck) exit the loop through
report_file_error, because `rv' becomes negative if `written' is not
equal to `this'.  I was asking whether testing errno for EWOULDBLOCK
and EAGAIN, and the code that deals with when that happens, are good
enough for all the possible reasons that emacs_write and
emacs_gnutls_write could return a partial count of bytes.

> > Emacs cannot have buffers (or any other streams of bytes) that are
> > larger than SSIZE_MAX, because a small number of bits is reserved for
> > the Lisp tags.
> 
> That kind of argument violates abstraction boundaries: sysdep.c
> is supposed to be about system things, and it's not supposed to
> rely on assumptions about Emacs Lisp internals.

sysdep.c is not about system things, it's about Emacs code that
requires platform-dependent techniques.  Most of that indeed deals
with system types, but you will find quite a few Emacs specific
internals there: Lisp_Object and EMACS_TIME data types, calls to
`intern' and Fsleep_for, use of macros such as STRINGP and
FRAME_TERMCAP_P, etc.

> For example, it would be a fairly small change to make Emacs buildable on
> a machine with 32-bit pointers and 64-bit EMACS_INT.

Somehow, I doubt it is a small change.  But if it is, by all means
let's do it now!  What are we waiting for?

>  And this would
> have real advantages: on 32-bit hosts it would remove the arbitrary
> (and really annoying :-) 256 MiB limit on editable files.

Since Emacs 23.2, it's 512 MB, btw, not 256.

> emacs_write should not stand in the way of plausible improvements
> such as this one.

Such a configuration (if it indeed is possible "easily") will need to
revamp several calls to emacs_write anyway, so what type we use there
now is a moot point.

> > It was also about a mistaken
> > call to emacs_write with a negative value in the last argument...
> > 
> > That danger still exists with your proposed version of emacs_write,
> 
> No it doesn't, because I've carefully audited all the Emacs code
> (which is how I found all those *other* bugs :-).  With the
> proposed change, emacs_write is never called with a negative
> argument.

Either you care about future changes or you don't.  If you do, then
auditing just the current callers is not enough, you need to account
for the unknown future (or did you audit that as well?).  If you don't
care about the future, then the assumption that the size of a buffer
or string can never overflow SSIZE_MAX is also based on carefully
auditing the Emacs code, and we can rely on it.



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-13  5:14                   ` Paul Eggert
                                       ` (2 preceding siblings ...)
  2011-04-13  6:49                     ` Eli Zaretskii
@ 2011-04-13 14:35                     ` Ted Zlatanov
  2011-04-15 13:13                       ` Ted Zlatanov
  3 siblings, 1 reply; 49+ messages in thread
From: Ted Zlatanov @ 2011-04-13 14:35 UTC (permalink / raw)
  To: emacs-devel

On Tue, 12 Apr 2011 22:14:48 -0700 Paul Eggert <eggert@cs.ucla.edu> wrote: 

PE>  * emacs_gnutls_write has the same issues as emacs_write, and should
PE>    be fixed the same way.

There's a pending patch (we're waiting for papers for Claudio Bley) that
changes some Emacs GnuTLS internals so I'd prefer to wait to make any
changes to the Emacs GnuTLS functions.  How urgent is this?

Ted




^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-13  9:46                         ` Eli Zaretskii
@ 2011-04-13 16:06                           ` Paul Eggert
  2011-04-13 17:22                             ` Eli Zaretskii
  0 siblings, 1 reply; 49+ messages in thread
From: Paul Eggert @ 2011-04-13 16:06 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: jim, emacs-devel

On 04/13/2011 02:46 AM, Eli Zaretskii wrote:
> I was asking whether testing errno for EWOULDBLOCK
> and EAGAIN, and the code that deals with when that happens, are good
> enough for all the possible reasons that emacs_write and
> emacs_gnutls_write could return a partial count of bytes.

The revised send_process code tests for EWOULDBLOCK and EAGAIN under
the same conditions that it does now.  If the checks are good enough
now, they will continue to be good enough after the proposed change.
So I still see no problem here.

> sysdep.c is not about system things, it's about Emacs code that
> requires platform-dependent techniques.  Most of that indeed deals
> with system types, but you will find quite a few Emacs specific
> internals

Yes, and I had already mentioned that the abstraction boundaries were
sometimes being violated there.  But let's not make matters worse.
sysdep.c is supposed to be for interfaces to system-dependent kernel
and library code, not for access to Emacs Lisp internals.

>> For example, it would be a fairly small change to make Emacs buildable on
>> a machine with 32-bit pointers and 64-bit EMACS_INT.
> 
> Somehow, I doubt it is a small change.  But if it is, by all means
> let's do it now!  What are we waiting for?

I suspect you are joking, but if not, I'd be happy to implement it.
It shouldn't take that long, and it'd be nice to remove that silly 512
MiB limit.  Would you object to such a change?

But at any rate, that would be a different change, and this fix
shouldn't wait on it.

>> emacs_write should not stand in the way of plausible improvements
>> such as this one.
> 
> Such a configuration (if it indeed is possible "easily") will need to
> revamp several calls to emacs_write anyway, so what type we use there
> now is a moot point.

True, and it suggests that we can revisit the issue if this idea is
implemented.  But in the meantime, we have a proposed change that
fixes bugs and doesn't introduce bugs.  Since there are no downsides
to this fix now, and no improvements to it have been suggested, we can
install it now, and worry about future improvements later.

> Either you care about future changes or you don't.

Both.  I care about plausible future changes; and I also realize that
we don't have time to program for all possibilities and need to fix
what's before us, which the proposed patch does.




^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-13 16:06                           ` Paul Eggert
@ 2011-04-13 17:22                             ` Eli Zaretskii
  2011-04-13 19:31                               ` Paul Eggert
                                                 ` (2 more replies)
  0 siblings, 3 replies; 49+ messages in thread
From: Eli Zaretskii @ 2011-04-13 17:22 UTC (permalink / raw)
  To: Paul Eggert; +Cc: jim, emacs-devel

> Date: Wed, 13 Apr 2011 09:06:14 -0700
> From: Paul Eggert <eggert@cs.ucla.edu>
> CC: jim@meyering.net, emacs-devel@gnu.org
> 
> On 04/13/2011 02:46 AM, Eli Zaretskii wrote:
> > I was asking whether testing errno for EWOULDBLOCK
> > and EAGAIN, and the code that deals with when that happens, are good
> > enough for all the possible reasons that emacs_write and
> > emacs_gnutls_write could return a partial count of bytes.
> 
> The revised send_process code tests for EWOULDBLOCK and EAGAIN under
> the same conditions that it does now.

That's true, but it's not my point.  The original code continued
looping and trying to write the rest of the stuff if emacs_write
returned a positive value that is smaller than what is was asked to
write.  The new code bails out of the loop in that case, assuming that
it's due to an error.

> > sysdep.c is not about system things, it's about Emacs code that
> > requires platform-dependent techniques.  Most of that indeed deals
> > with system types, but you will find quite a few Emacs specific
> > internals
> 
> Yes, and I had already mentioned that the abstraction boundaries were
> sometimes being violated there.  But let's not make matters worse.
> sysdep.c is supposed to be for interfaces to system-dependent kernel
> and library code, not for access to Emacs Lisp internals.

I was trying to point out that you are mistaken in your interpretation
of what sysdep.c is.  But even I accept your POV, access to OS-level
features does not mean we cannot use Emacs-specific data types there
or forget about Emacs design.  sysdep.c is not a library, it's part of
an application.

> >> For example, it would be a fairly small change to make Emacs buildable on
> >> a machine with 32-bit pointers and 64-bit EMACS_INT.
> > 
> > Somehow, I doubt it is a small change.  But if it is, by all means
> > let's do it now!  What are we waiting for?
> 
> I suspect you are joking, but if not, I'd be happy to implement it.

I'm not joking.  How could I?

> It shouldn't take that long, and it'd be nice to remove that silly 512
> MiB limit.  Would you object to such a change?

Of course not!  Why should I object to a great feature like that?

> in the meantime, we have a proposed change that
> fixes bugs and doesn't introduce bugs.  Since there are no downsides
> to this fix now, and no improvements to it have been suggested, we can
> install it now, and worry about future improvements later.

I'll let Stefan and Chong decide on that.  I still think that using
ssize_t as the last argument to emacs_write is better.

> > Either you care about future changes or you don't.
> 
> Both.  I care about plausible future changes; and I also realize that
> we don't have time to program for all possibilities and need to fix
> what's before us, which the proposed patch does.

IOW, you care for what supports your arguments and don't care about
the rest.



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-13 17:22                             ` Eli Zaretskii
@ 2011-04-13 19:31                               ` Paul Eggert
  2011-04-13 19:59                               ` PJ Weisberg
  2011-04-13 20:02                               ` Paul Eggert
  2 siblings, 0 replies; 49+ messages in thread
From: Paul Eggert @ 2011-04-13 19:31 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: jim, emacs-devel

On 04/13/2011 10:22 AM, Eli Zaretskii wrote:

>> it'd be nice to remove that silly 512 MiB limit.
>> Would you object to such a change?
> 
> Of course not!  Why should I object to a great feature like that?

OK, thanks, I'll take a look at it.  This won't happen instantly.

> The original code continued looping and trying to write the rest of
> the stuff if emacs_write returned a positive value that is smaller
> than what is was asked to write.  The new code bails out of the loop
> in that case, assuming that it's due to an error.

Ah, thanks, now I see what you mean.  The old code ignores write error
in some cases, whereas the new code carefully reports the first write
error it sees.  Yes, that is a separate issue; I think it's a better
approach, but it should be analyzed separately as it might lead to
incompatibilities.  The following further patch removes that part of
the change, so that the new code (like the old) ignores the first
write error after a partial write.

* process.c (send_process): Count partial writes as successes.
See http://lists.gnu.org/archive/html/emacs-devel/2011-04/msg00483.html
=== modified file 'src/process.c'
--- src/process.c	2011-04-13 05:02:54 +0000
+++ src/process.c	2011-04-13 19:20:49 +0000
@@ -5396,7 +5396,7 @@
 		  else
 #endif
 		    written = emacs_write (outfd, buf, this);
-		  rv = (written == this ? 0 : -1);
+		  rv = (written ? 0 : -1);
 #ifdef ADAPTIVE_READ_BUFFERING
 		  if (p->read_output_delay > 0
 		      && p->adaptive_read_buffering == 1)




^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-13 17:22                             ` Eli Zaretskii
  2011-04-13 19:31                               ` Paul Eggert
@ 2011-04-13 19:59                               ` PJ Weisberg
  2011-04-14  4:49                                 ` Eli Zaretskii
  2011-04-13 20:02                               ` Paul Eggert
  2 siblings, 1 reply; 49+ messages in thread
From: PJ Weisberg @ 2011-04-13 19:59 UTC (permalink / raw)
  Cc: emacs-devel

On 4/13/11, Eli Zaretskii <eliz@gnu.org> wrote:
>> Date: Wed, 13 Apr 2011 09:06:14 -0700
>> From: Paul Eggert <eggert@cs.ucla.edu>
>> CC: jim@meyering.net, emacs-devel@gnu.org
>>
>> On 04/13/2011 02:46 AM, Eli Zaretskii wrote:
>> > Either you care about future changes or you don't.
>>
>> Both.  I care about plausible future changes; and I also realize that
>> we don't have time to program for all possibilities and need to fix
>> what's before us, which the proposed patch does.
>
> IOW, you care for what supports your arguments and don't care about
> the rest.

Always assume good faith.  It's far more likely that he's making the
argument because his priorities are different from yours than that
he's adjusting his priorities to support the argument.  :-)

-PJ



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-13 17:22                             ` Eli Zaretskii
  2011-04-13 19:31                               ` Paul Eggert
  2011-04-13 19:59                               ` PJ Weisberg
@ 2011-04-13 20:02                               ` Paul Eggert
  2 siblings, 0 replies; 49+ messages in thread
From: Paul Eggert @ 2011-04-13 20:02 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: jim, emacs-devel

On 04/12/2011 11:37 PM, Eli Zaretskii wrote:
> We cannot assume that a given hardware device is 64-bit- and
> signedness-clean as it should be.

That's true, and it's independent of the other issues that we're
discussing.  And it comes up with all I/O, not just writes
to sound devices, so it should be fixed in emacs_write (and
emacs_read), independently of the other changes.  This is
the MAX_RW_COUNT issue I mentioned in
<http://lists.gnu.org/archive/html/emacs-devel/2011-04/msg00462.html>.

Here's a simple patch which fixes the problem.  A platform that is
known not to have the bug can "#define MAX_RW_COUNT SIZE_MAX".

* sysdep.c (MAX_RW_COUNT): New macro, to work around kernel bugs.
(emacs_read, emacs_write): Use it.
=== modified file 'src/sysdep.c'
--- src/sysdep.c	2011-04-13 05:02:54 +0000
+++ src/sysdep.c	2011-04-13 19:52:07 +0000
@@ -1825,6 +1825,17 @@
   return rtnval;
 }

+/* Maximum number of bytes to read or write in a single system call.
+   This works around a serious bug in Linux kernels before 2.6.16; see
+   <https://bugzilla.redhat.com/show_bug.cgi?format=multiple&id=612839>.
+   It's likely to work around similar bugs in other operating systems, so do it
+   on all platforms.  Round INT_MAX down to a page size, with the conservative
+   assumption that page sizes are at most 2**18 bytes (any kernel with a
+   page size larger than that shouldn't have the bug).  */
+#ifndef MAX_RW_COUNT
+#define MAX_RW_COUNT (INT_MAX >> 18 << 18)
+#endif
+
 /* Read from FILEDESC to a buffer BUF with size NBYTE, retrying if interrupted.
    Return the number of bytes read, which might be less than NBYTE.
    On error, set errno and return -1.  */
@@ -1833,7 +1844,7 @@
 {
   register ssize_t rtnval;

-  while ((rtnval = read (fildes, buf, nbyte)) == -1
+  while ((rtnval = read (fildes, buf, min (nbyte, MAX_RW_COUNT))) == -1
 	 && (errno == EINTR))
     QUIT;
   return (rtnval);
@@ -1852,7 +1863,7 @@

   while (nbyte > 0)
     {
-      rtnval = write (fildes, buf, nbyte);
+      rtnval = write (fildes, buf, min (nbyte, MAX_RW_COUNT));

       if (rtnval < 0)
 	{




^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-13 19:59                               ` PJ Weisberg
@ 2011-04-14  4:49                                 ` Eli Zaretskii
  0 siblings, 0 replies; 49+ messages in thread
From: Eli Zaretskii @ 2011-04-14  4:49 UTC (permalink / raw)
  To: PJ Weisberg; +Cc: emacs-devel

> Date: Wed, 13 Apr 2011 12:59:09 -0700
> From: PJ Weisberg <pj@irregularexpressions.net>
> Cc: emacs-devel@gnu.org
> 
> Always assume good faith.

Oh, save me the lecture.  Go read the entire thread (and a few similar
ones); perhaps then you will understand what happened with my
assumption of good faith.  One can only hold to it this much.



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-11 21:54           ` Jim Meyering
  2011-04-12  4:44             ` Eli Zaretskii
  2011-04-12 13:24             ` Ted Zlatanov
@ 2011-04-14 20:57             ` Michael Welsh Duggan
  2 siblings, 0 replies; 49+ messages in thread
From: Michael Welsh Duggan @ 2011-04-14 20:57 UTC (permalink / raw)
  To: Jim Meyering; +Cc: David Kastrup, emacs-devel

Jim Meyering <jim@meyering.net> writes:

> David Kastrup wrote:
>> Jim Meyering <jim@meyering.net> writes:
>>
>>> What about when EMACS_INT is defined to "int"?
>>>
>>> Someone will inevitably call your write-like function
>>> with a length of type size_t -- many existing uses do just that --
>>> and by using a signed type, you will have converted their long
>>> yet valid (2-4GiB), buffer length, into a negative number.
>>
>> Resulting in an error or nothing happening.  In contrast, if a negative
>> number is turned into a long yet valid (2-4GiB) number, it is very
>> likely that unintended memory areas will get stomped over.
>
> Since someone mentioned a goal of being able to edit a 2GiB
> file, I thought people here would be sensitive to an API policy
> that renders that goal unreachable on some systems.
>
> Of course the extra bit doesn't matter when you start with 64,
> but starting with only 32, using an unsigned "length" would
> avoid that unnecessary limit.

Forgive me if this is already known, but I believe the POSIX standard
says that "If the value of nbyte is greater than {SSIZE_MAX}, the result
is implementation-defined."  This implies that there is no guarantee
that the extra bit would ever be reliably used anyway.

-- 
Michael Welsh Duggan
(md5i@md5i.com)



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-12 15:53                 ` Paul Eggert
                                     ` (2 preceding siblings ...)
  2011-04-13  5:14                   ` Paul Eggert
@ 2011-04-15  1:29                   ` Stefan Monnier
  2011-04-15  8:55                     ` Paul Eggert
  3 siblings, 1 reply; 49+ messages in thread
From: Stefan Monnier @ 2011-04-15  1:29 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Eli Zaretskii, jim, emacs-devel

> If the code is well-defended against passing negative sizes,
> then we can remove the checks entirely; they're not needed.
> They were meant only as a stopgap in case the callers were
> buggy.  (Also please see below.)

The old approach of being somewhat permissive worked well for decades,
I see no reason to change it.

> Any compiler that will warn about conversions to size_t will
> warn about hundreds of other places in Emacs.  These warnings
> are more trouble than they're worth, and we shouldn't worry
> about them.

I hate casts and generally hate unsigned integers as well, so please
don't introduce changes that add uses of casts or unsigned integers
unless absolutely necessary.

>>> with the goal of removing the runtime checks once we have
>>> carefully checked that they aren't needed.
>> Which will never happen, so these aborts will stay in the code
>> forever.
> I can do it, in the next day or two.  It shouldn't be a problem.

That's not how it works: Emacs will keep being changed in the future.

> I'm sorry if there are hurt feelings about this, but size_t is
> clearly the better choice for buffer sizes;

No, it's not.  Maybe it's your favorite and it probably is the standard
in the C API, but that doesn't make it "clearly the better choice".
E.g. it's a choice I strongly dislike.  The only cases where I like to
use unsigned integers is when we treat them as bitvectors, all other
cases end up creating errors because they eventually meet with signed
types introducing unexpected behaviors.  This is particularly true in
the context of Emacs where most/many integer values go through Elisp
where they are always signed.


        Stefan



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-15  1:29                   ` Stefan Monnier
@ 2011-04-15  8:55                     ` Paul Eggert
  2011-04-15  9:41                       ` Eli Zaretskii
  0 siblings, 1 reply; 49+ messages in thread
From: Paul Eggert @ 2011-04-15  8:55 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, jim, emacs-devel

On 04/14/2011 06:29 PM, Stefan Monnier wrote:

> please don't introduce changes that add uses of casts or unsigned integers
> unless absolutely necessary.

Heh.  And I just today was working on a patch that removes dozens of casts
from lisp.h.  Maybe I'll have better luck with that one. :-)

Anyway, I changed all the size_ts to EMACS_INTs and committed the patch
(which fixed other bugs as already mentioned) as bzr 103919.



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-15  8:55                     ` Paul Eggert
@ 2011-04-15  9:41                       ` Eli Zaretskii
  2011-04-15 10:24                         ` Paul Eggert
  0 siblings, 1 reply; 49+ messages in thread
From: Eli Zaretskii @ 2011-04-15  9:41 UTC (permalink / raw)
  To: Paul Eggert; +Cc: emacs-devel, monnier, jim

> Date: Fri, 15 Apr 2011 01:55:52 -0700
> From: Paul Eggert <eggert@cs.ucla.edu>
> Cc: Eli Zaretskii <eliz@gnu.org>, jim@meyering.net, emacs-devel@gnu.org
> 
> Anyway, I changed all the size_ts to EMACS_INTs and committed the patch
> (which fixed other bugs as already mentioned) as bzr 103919.

The change you made in emacs_read makes it possible for it to read
less than it was requested to:

  -  while ((rtnval = read (fildes, buf, nbyte)) == -1
  +  while ((rtnval = read (fildes, buf, min (nbyte, MAX_RW_COUNT))) == -1
	   && (errno == EINTR))
       QUIT;
     return (rtnval);
   }

I don't think it's a good idea to silently truncate the request and
rely on the callers to loop (although I think they all do for now).
So I think we need a loop in emacs_read like what we do in
emacs_write.



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-15  9:41                       ` Eli Zaretskii
@ 2011-04-15 10:24                         ` Paul Eggert
  0 siblings, 0 replies; 49+ messages in thread
From: Paul Eggert @ 2011-04-15 10:24 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel, monnier, jim

On 04/15/2011 02:41 AM, Eli Zaretskii wrote:
> I don't think it's a good idea to silently truncate the request

OK, I removed that part of the patch.



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-13 14:35                     ` Ted Zlatanov
@ 2011-04-15 13:13                       ` Ted Zlatanov
  2011-04-15 16:34                         ` Paul Eggert
  0 siblings, 1 reply; 49+ messages in thread
From: Ted Zlatanov @ 2011-04-15 13:13 UTC (permalink / raw)
  To: emacs-devel

On Wed, 13 Apr 2011 09:35:00 -0500 Ted Zlatanov <tzz@lifelogs.com> wrote: 

TZ> On Tue, 12 Apr 2011 22:14:48 -0700 Paul Eggert <eggert@cs.ucla.edu> wrote: 
PE> * emacs_gnutls_write has the same issues as emacs_write, and should
PE> be fixed the same way.

TZ> There's a pending patch (we're waiting for papers for Claudio Bley) that
TZ> changes some Emacs GnuTLS internals so I'd prefer to wait to make any
TZ> changes to the Emacs GnuTLS functions.  How urgent is this?

Paul, did you already do these fixes in revno 103919?

I hope rebasing Claudio Bley's GnuTLS patch on top of your fixes is
trivial.  If not I may have to ask you for help later.

Ted




^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-15 13:13                       ` Ted Zlatanov
@ 2011-04-15 16:34                         ` Paul Eggert
  2011-04-15 18:20                           ` Ted Zlatanov
  0 siblings, 1 reply; 49+ messages in thread
From: Paul Eggert @ 2011-04-15 16:34 UTC (permalink / raw)
  To: emacs-devel

On 04/15/2011 06:13 AM, Ted Zlatanov wrote:
> Paul, did you already do these fixes in revno 103919?

Yes.  The APIs for emacs_gnutls_read and emacs_gnutls_write
had already changed recently, in bzr 103878,
and I figured that if merging will be needed anyway
we might as well merge the desired API.

> I hope rebasing Claudio Bley's GnuTLS patch on top of your fixes is
> trivial.  If not I may have to ask you for help later.

It should be trivial, and I'll be happy to help if it isn't.



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: oops? read/write vs type of length parameter
  2011-04-15 16:34                         ` Paul Eggert
@ 2011-04-15 18:20                           ` Ted Zlatanov
  0 siblings, 0 replies; 49+ messages in thread
From: Ted Zlatanov @ 2011-04-15 18:20 UTC (permalink / raw)
  To: emacs-devel

On Fri, 15 Apr 2011 09:34:58 -0700 Paul Eggert <eggert@cs.ucla.edu> wrote: 

PE> On 04/15/2011 06:13 AM, Ted Zlatanov wrote:
>> Paul, did you already do these fixes in revno 103919?

PE> Yes.  The APIs for emacs_gnutls_read and emacs_gnutls_write
PE> had already changed recently, in bzr 103878,
PE> and I figured that if merging will be needed anyway
PE> we might as well merge the desired API.

>> I hope rebasing Claudio Bley's GnuTLS patch on top of your fixes is
>> trivial.  If not I may have to ask you for help later.

PE> It should be trivial, and I'll be happy to help if it isn't.

I posted a rebased patch with an emacs_gnutls_write change required
because of your revisions.

I think the old code returned -1 for 0 bytes to indicate the read should
be retried, but the new code doesn't have that return.  Still it works,
since the GnuTLS-specific EAGAIN (GNUTLS_E_AGAIN) is its own error code.
So the old behavior of returning -1 for 0 bytes was actually unnecessary.

I am using the patch now and have had no problems.

Ted




^ permalink raw reply	[flat|nested] 49+ messages in thread

end of thread, other threads:[~2011-04-15 18:20 UTC | newest]

Thread overview: 49+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-04-11  8:55 oops? read/write vs type of length parameter Jim Meyering
2011-04-11  9:44 ` Eli Zaretskii
2011-04-11 11:08   ` Jim Meyering
2011-04-11 11:28     ` David Kastrup
2011-04-11 11:52     ` Eli Zaretskii
2011-04-11 12:27       ` Jim Meyering
2011-04-11 12:31         ` David Kastrup
2011-04-11 21:54           ` Jim Meyering
2011-04-12  4:44             ` Eli Zaretskii
2011-04-12 13:24             ` Ted Zlatanov
2011-04-12 13:29               ` Eli Zaretskii
2011-04-12 14:47                 ` Ted Zlatanov
2011-04-12 17:00                   ` Large file support (was: oops? read/write vs type of length parameter) Eli Zaretskii
2011-04-14 20:57             ` oops? read/write vs type of length parameter Michael Welsh Duggan
2011-04-11 14:02         ` Eli Zaretskii
2011-04-11 11:40   ` Stephen J. Turnbull
2011-04-11 13:58     ` Eli Zaretskii
2011-04-12  1:16       ` Paul Eggert
2011-04-12  3:01         ` Eli Zaretskii
2011-04-12  5:06           ` Paul Eggert
2011-04-12  5:46             ` Eli Zaretskii
2011-04-12  8:19             ` Paul Eggert
2011-04-12  9:41               ` Eli Zaretskii
2011-04-12 15:53                 ` Paul Eggert
2011-04-12 16:56                   ` Eli Zaretskii
2011-04-12 23:55                   ` Juanma Barranquero
2011-04-13  5:14                   ` Paul Eggert
2011-04-13  6:31                     ` Jim Meyering
2011-04-13  6:37                     ` Eli Zaretskii
2011-04-13  8:15                       ` Paul Eggert
2011-04-13  9:46                         ` Eli Zaretskii
2011-04-13 16:06                           ` Paul Eggert
2011-04-13 17:22                             ` Eli Zaretskii
2011-04-13 19:31                               ` Paul Eggert
2011-04-13 19:59                               ` PJ Weisberg
2011-04-14  4:49                                 ` Eli Zaretskii
2011-04-13 20:02                               ` Paul Eggert
2011-04-13  6:49                     ` Eli Zaretskii
2011-04-13 14:35                     ` Ted Zlatanov
2011-04-15 13:13                       ` Ted Zlatanov
2011-04-15 16:34                         ` Paul Eggert
2011-04-15 18:20                           ` Ted Zlatanov
2011-04-15  1:29                   ` Stefan Monnier
2011-04-15  8:55                     ` Paul Eggert
2011-04-15  9:41                       ` Eli Zaretskii
2011-04-15 10:24                         ` Paul Eggert
2011-04-12 12:32             ` Davis Herring
2011-04-12 13:38               ` Eli Zaretskii
2011-04-12 15:43                 ` Paul Eggert

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).