EMACS_INT vs int for range checking

unofficial mirror of emacs-devel@gnu.org 

* EMACS_INT vs int for range checking
@ 2012-05-26  9:13 Paul Eggert
  2012-05-26 10:11 ` Eli Zaretskii
  0 siblings, 1 reply; 5+ messages in thread
From: Paul Eggert @ 2012-05-26  9:13 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Emacs Development

Re this patch in trunk bzr 108374:

=== modified file 'src/bidi.c'
--- src/bidi.c	2012-04-09 22:54:59 +0000
+++ src/bidi.c	2012-05-26 07:03:39 +0000
@@ -204,7 +204,7 @@
   val = CHAR_TABLE_REF (bidi_mirror_table, c);
   if (INTEGERP (val))
     {
-      EMACS_INT v = XINT (val);
+      int v = XINT (val);

       if (v < 0 || v > MAX_CHAR)
 	abort ();

It's true that 'val' is supposed to be in range here, and
that if it is in range then 'int' will do.  But the point of
the test '(v < 0 || v > MAX_CHAR)' is to abort if there is
some programming error somewhere that causes 'val' to be out of
range.  Unfortunately the patch means the abort test won't
work reliably on a typical 64-bit host if XINT (val) is (say) 2**32,
which means that the programming error won't be detected reliably.

If we know with absolute certainty that 'val' is in 'int' range,
but there's a reasonable doubt that it's in character range,
a comment explaining this would help clarify why 'v' is int.
However, it may be simpler (and it's no less efficient) to
make v an EMACS_INT.
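
The concern above can be illustrated with a minimal sketch. This is not Emacs code: `wide_int` is a hypothetical stand-in for EMACS_INT on a 64-bit host, and `MAX_CHAR_SKETCH` assumes MAX_CHAR is 0x3FFFFF. Strictly speaking, ISO C makes the out-of-range conversion to `int` implementation-defined; many compilers simply keep the low-order 32 bits, in which case 2**32 becomes 0 and slips through the narrowed check.

```c
#include <stdint.h>

/* Hypothetical stand-ins for illustration, not the Emacs definitions.  */
typedef int64_t wide_int;            /* plays the role of EMACS_INT */
#define MAX_CHAR_SKETCH 0x3FFFFF     /* assumed value of MAX_CHAR */

/* Range check on the full-width value: no bits are discarded
   before the comparison, so 2**32 is rejected.  */
int
in_range_wide (wide_int v)
{
  return 0 <= v && v <= MAX_CHAR_SKETCH;
}

/* Range check after narrowing to int, as in the patch under
   discussion.  When v does not fit in int the converted value is
   implementation-defined, so the comparison no longer says anything
   reliable about the original v.  */
int
in_range_narrow (wide_int v)
{
  int n = (int) v;
  return 0 <= n && n <= MAX_CHAR_SKETCH;
}
```

For in-range inputs such as 65 both functions agree. For `(wide_int) 1 << 32` only `in_range_wide` is guaranteed to return 0; what `in_range_narrow` returns depends on the compiler and target.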


* Re: EMACS_INT vs int for range checking
  2012-05-26  9:13 EMACS_INT vs int for range checking Paul Eggert
@ 2012-05-26 10:11 ` Eli Zaretskii
  2012-05-26 19:05   ` Paul Eggert
  0 siblings, 1 reply; 5+ messages in thread
From: Eli Zaretskii @ 2012-05-26 10:11 UTC (permalink / raw)
  To: Paul Eggert; +Cc: emacs-devel

> Date: Sat, 26 May 2012 02:13:48 -0700
> From: Paul Eggert <eggert@cs.ucla.edu>
> CC: Emacs Development <emacs-devel@gnu.org>
> 
> It's true that 'val' is supposed to be in range here, and
> that if it is in range then 'int' will do.  But the point of
> the test '(v < 0 || v > MAX_CHAR)' is to abort if there is
> some programming error somewhere that causes 'val' to be out of
> range.  Unfortunately the patch means the abort test won't
> work reliably on a typical 64-bit host if XINT (val) is (say) 2**32,
> which means that the programming error won't be detected reliably.

That's not what that test is about.  That test is about making sure
the result is a valid character code.

If we need protection against overflowing a 32-bit int, it should be
part of CHAR_TABLE_REF, not in every place where CHAR_TABLE_REF is
used.




* Re: EMACS_INT vs int for range checking
  2012-05-26 10:11 ` Eli Zaretskii
@ 2012-05-26 19:05   ` Paul Eggert
  2012-05-27  6:19     ` Eli Zaretskii
  0 siblings, 1 reply; 5+ messages in thread
From: Paul Eggert @ 2012-05-26 19:05 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

On 05/26/2012 03:11 AM, Eli Zaretskii wrote:
> If we need protection against overflowing a 32-bit int,
> it should be part of CHAR_TABLE_REF

But character tables can contain any Lisp objects, including
integers greater than INT_MAX, so CHAR_TABLE_REF can't reject
such integers.

> That test is about making sure the result is a valid character code.

Yes, but the current test does not reliably do that.  On a 64-bit host
with 32-bit int it's possible, for example, that bidi_mirror_char can
return a garbage value.  This is because assigning an out-of-int-range
value to an 'int' results in undefined behavior.

If it's the EMACS_INT that's annoying, how about this further patch?
It shortens and clarifies the source code and fixes the portability problem.

=== modified file 'src/bidi.c'
--- src/bidi.c	2012-05-26 07:03:39 +0000
+++ src/bidi.c	2012-05-26 18:22:13 +0000
@@ -204,12 +204,10 @@ bidi_mirror_char (int c)
   val = CHAR_TABLE_REF (bidi_mirror_table, c);
   if (INTEGERP (val))
     {
-      int v = XINT (val);
-
-      if (v < 0 || v > MAX_CHAR)
+      if (! CHAR_VALID_P (XINT (val)))
 	abort ();

-      return v;
+      return XINT (val);
     }

   return c;
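
The idea behind the patch, validating the full-width value and narrowing only afterwards, might be sketched as follows. All names here are hypothetical: `wide_int` stands in for EMACS_INT, `CHAR_VALID_SKETCH` is only an analogue of CHAR_VALID_P (whose real definition is not shown in this thread), and the function returns -1 instead of calling abort () so that the sketch is easy to exercise.

```c
#include <stdint.h>

/* Hypothetical stand-ins for illustration, not the Emacs definitions.  */
typedef int64_t wide_int;            /* plays the role of EMACS_INT */
#define MAX_CHAR_SKETCH 0x3FFFFF     /* assumed value of MAX_CHAR */

/* Analogue of a character-validity predicate.  The argument is
   compared at its own width, so a 64-bit argument is never
   truncated before the test.  */
#define CHAR_VALID_SKETCH(x) (0 <= (x) && (x) <= MAX_CHAR_SKETCH)

/* Return V as an int if it is a valid character code, else -1.
   The cast happens only after V is known to fit in int.  */
int
checked_char_or_minus1 (wide_int v)
{
  if (! CHAR_VALID_SKETCH (v))
    return -1;          /* the real code would call abort () here */
  return (int) v;
}
```

Because the test runs before any conversion, the narrowing cast is only ever applied to values in [0, MAX_CHAR_SKETCH], which every `int` of at least 32 bits can represent.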


* Re: EMACS_INT vs int for range checking
  2012-05-26 19:05   ` Paul Eggert
@ 2012-05-27  6:19     ` Eli Zaretskii
  2012-05-27  7:34       ` Paul Eggert
  0 siblings, 1 reply; 5+ messages in thread
From: Eli Zaretskii @ 2012-05-27  6:19 UTC (permalink / raw)
  To: Paul Eggert; +Cc: emacs-devel

> Date: Sat, 26 May 2012 12:05:48 -0700
> From: Paul Eggert <eggert@cs.ucla.edu>
> CC: emacs-devel@gnu.org
> 
> Yes, but the current test does not reliably do that.  On a 64-bit host
> with 32-bit int it's possible, for example, that bidi_mirror_char can
> return a garbage value.

If that garbage passes the [0..MAX_CHAR] test, it's garbage that the
bidi reordering engine and the rest of redisplay can live with.  Why
should we abort in that case?

> This is because assigning an out-of-int-range value to an 'int'
> results in undefined behavior.

How can that undefined behavior be any worse than aborting?

> If it's the EMACS_INT that's annoying, how about this further patch?
> It shortens and clarifies the source code and fixes the portability problem.

I will only accept such a test as an eassert.  This code is in the
innermost loop of the Emacs display engine, so doing all that juggling
in an optimized build _for_every_character_we_display_ is unacceptable.


* Re: EMACS_INT vs int for range checking
  2012-05-27  6:19     ` Eli Zaretskii
@ 2012-05-27  7:34       ` Paul Eggert
  0 siblings, 0 replies; 5+ messages in thread
From: Paul Eggert @ 2012-05-27  7:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

On 05/26/2012 11:19 PM, Eli Zaretskii wrote:
> If that garbage passes the [0..MAX_CHAR] test, it's garbage that the
> bidi reordering engine and the rest of redisplay can live with.

No, because the rest of redisplay cannot live with undefined
behavior.  For example, the generated machine code could use
32-bit comparison within bidi_mirror_char, so that (v < 0 ||
v > MAX_CHAR) yields false, but return an untruncated 64-bit
value to the caller, so that the returned value exceeds
MAX_CHAR and messes up the caller.

> How can that undefined behavior be any worse than aborting?

When the undefined behavior doesn't abort -- when it goes on
to cause subtle errors in later computation.

>> If it's the EMACS_INT that's annoying, how about this further patch?
>> It shortens and clarifies the source code and fixes the portability problem.
> 
> I will only accept such a test as an eassert.  This code is in the
> innermost loop of the Emacs display engine, so doing all that juggling
> in an optimized build _for_every_character_we_display_ is unacceptable.

OK, thanks, I installed it as an eassert.  (Though the code
runs equally fast either way with any decent modern
compiler, as most of CHAR_VALID_P is optimized away.)
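
For readers unfamiliar with the compromise reached here, an assertion that costs nothing in optimized builds could look roughly like the sketch below. It is not the Emacs eassert definition: `sketch_eassert`, `sketch_die`, and the `SKETCH_ENABLE_CHECKING` switch are made-up names, and MAX_CHAR is again assumed to be 0x3FFFFF.

```c
#include <stdio.h>
#include <stdlib.h>

#define MAX_CHAR_SKETCH 0x3FFFFF     /* assumed value of MAX_CHAR */

/* Report a failed check and abort.  Hypothetical helper.  */
void
sketch_die (const char *cond, const char *file, int line)
{
  fprintf (stderr, "%s:%d: assertion failed: %s\n", file, line, cond);
  abort ();
}

/* With checking enabled the condition is tested at run time;
   otherwise the macro expands to nothing and the condition is not
   even evaluated, so the hot path pays no cost.  */
#ifdef SKETCH_ENABLE_CHECKING
# define sketch_eassert(cond) \
    ((cond) ? (void) 0 : sketch_die (#cond, __FILE__, __LINE__))
#else
# define sketch_eassert(cond) ((void) 0)
#endif

/* Convert a table value to a character code.  The range check is
   present only in checking builds.  */
int
mirror_value_to_char (long long v)
{
  sketch_eassert (0 <= v && v <= MAX_CHAR_SKETCH);
  return (int) v;
}
```

Compiling with `-DSKETCH_ENABLE_CHECKING` turns the check on. Without it, out-of-range values pass through unchecked, which is the trade-off accepted in this exchange for the inner display loop.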


