From: Paul Eggert <eggert@cs.ucla.edu>
To: Eli Zaretskii <eliz@gnu.org>, 30408@debbugs.gnu.org
Subject: bug#30408: Checking for loss of information on integer conversion
Date: Tue, 27 Mar 2018 16:19:21 -0700 [thread overview]
Message-ID: <f1e3e23f-7ac2-9840-403b-e4d76f1ed83c@cs.ucla.edu> (raw)
In-Reply-To: <83y3jq9q4m.fsf@gnu.org>
[-- Attachment #1: Type: text/plain, Size: 1098 bytes --]
Here's a patch that I hope addresses the main problem. The basic idea is
to avoid the confusion exemplified in Bug#30408 by changing Emacs so
that it ordinarily signals an error if it reads a program that contains
an integer literal that is out of fixnum range. However, if the
out-of-range literal is followed by '.' then Emacs continues to silently
convert it to floating-point; this is intended as an escape hatch for
any programs that need the old behavior (I expect this'll be rare).
Thus, on 32-bit Emacs, plain '536870912' in a program causes Emacs to
signal an overflow while loading the program, whereas '536870912.' is
treated as a floating-point number as before. (On 64-bit Emacs, the same
two literals are both integers, as before.)
Unlike my previous proposal, this patch does not affect the behavior of
string-to-integer. As I understand it, that was a primary source of
qualms about the previous proposal.
I've tested this on both 32- and 64-bit Emacs on master. This patch has
helped me to find a couple of integer portability bugs which I already
fixed on master.
[-- Attachment #2: 0001-Lisp-reader-now-checks-for-integer-overflow.patch --]
[-- Type: text/x-patch, Size: 11154 bytes --]
From 94b7a1a171de3113cd5250315dee7bdef5f51890 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Tue, 27 Mar 2018 15:48:47 -0700
Subject: [PATCH] Lisp reader now checks for integer overflow
* doc/lispref/numbers.texi (Integer Basics), etc/NEWS:
Document this.
* src/lisp.h (S2N_IGNORE_TRAILING, S2N_OVERFLOW_TO_FLOAT):
New constants.
* src/lread.c (string_to_number): Change trailing bool arg to
integer argument with flags, to support S2N_OVERFLOW_TO_FLOAT.
All uses changed.
* test/src/editfns-tests.el (read-large-integer): New test.
---
doc/lispref/numbers.texi | 14 ++++++++++----
etc/NEWS | 7 +++++++
src/data.c | 9 ++++-----
src/lisp.h | 3 ++-
src/lread.c | 35 ++++++++++++++++++++---------------
src/process.c | 2 +-
test/src/editfns-tests.el | 22 ++++++++++++++++++----
7 files changed, 62 insertions(+), 30 deletions(-)
diff --git a/doc/lispref/numbers.texi b/doc/lispref/numbers.texi
index c2cb6651d4..2fed2b642f 100644
--- a/doc/lispref/numbers.texi
+++ b/doc/lispref/numbers.texi
@@ -55,16 +55,13 @@ Integer Basics
The Lisp reader reads an integer as a nonempty sequence
of decimal digits with optional initial sign and optional
-final period. A decimal integer that is out of the
-Emacs range is treated as a floating-point number.
+final period.
@example
1 ; @r{The integer 1.}
1. ; @r{The integer 1.}
+1 ; @r{Also the integer 1.}
-1 ; @r{The integer @minus{}1.}
- 9000000000000000000
- ; @r{The floating-point number 9e18.}
0 ; @r{The integer 0.}
-0 ; @r{The integer 0.}
@end example
@@ -94,6 +91,15 @@ Integer Basics
#24r1k @result{} 44
@end example
+ If an integer is outside the Emacs range, the Lisp reader ordinarily
+signals an overflow. However, if a too-large plain integer ends in a
+period, the Lisp reader treats it as a floating-point number instead.
+This lets an Emacs Lisp program specify a large integer that is
+quietly approximated by a floating-point number on machines with
+limited word width. For example, @samp{536870912.} is a
+floating-point number if Emacs integers are only 30 bits wide and is
+an integer otherwise.
+
To understand how various functions work on integers, especially the
bitwise operators (@pxref{Bitwise Operations}), it is often helpful to
view the numbers in their binary form.
diff --git a/etc/NEWS b/etc/NEWS
index fd1d04b8a0..cb74f512cc 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -349,6 +349,13 @@ as new-style, bind the new variable 'force-new-style-backquotes' to t.
integer, Emacs now signals an error if the number is too large for the
implementation to format (Bug#30408).
++++
+** The Lisp reader now signals an overflow for plain decimal integers
+that do not end in '.' and are outside Emacs range. Formerly the Lisp
+reader silently converted them to floating-point numbers, and signaled
+overflow only for integers with a radix that are outside machine range
+(Bug#30408).
+
---
** Some functions and variables obsolete since Emacs 22 have been removed:
archive-mouse-extract, assoc-ignore-case, assoc-ignore-representation,
diff --git a/src/data.c b/src/data.c
index a7fab1ef58..6f23a26757 100644
--- a/src/data.c
+++ b/src/data.c
@@ -2716,9 +2716,7 @@ present, base 10 is used. BASE must be between 2 and 16 (inclusive).
If the base used is not 10, STRING is always parsed as an integer. */)
(register Lisp_Object string, Lisp_Object base)
{
- register char *p;
- register int b;
- Lisp_Object val;
+ int b;
CHECK_STRING (string);
@@ -2732,11 +2730,12 @@ If the base used is not 10, STRING is always parsed as an integer. */)
b = XINT (base);
}
- p = SSDATA (string);
+ char *p = SSDATA (string);
while (*p == ' ' || *p == '\t')
p++;
- val = string_to_number (p, b, true);
+ int flags = S2N_IGNORE_TRAILING | S2N_OVERFLOW_TO_FLOAT;
+ Lisp_Object val = string_to_number (p, b, flags);
return NILP (val) ? make_number (0) : val;
}
\f
diff --git a/src/lisp.h b/src/lisp.h
index f0c0c5a14a..b931d23bf3 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -3899,7 +3899,8 @@ LOADHIST_ATTACH (Lisp_Object x)
}
extern int openp (Lisp_Object, Lisp_Object, Lisp_Object,
Lisp_Object *, Lisp_Object, bool);
-extern Lisp_Object string_to_number (char const *, int, bool);
+enum { S2N_IGNORE_TRAILING = 1, S2N_OVERFLOW_TO_FLOAT = 2 };
+extern Lisp_Object string_to_number (char const *, int, int);
extern void map_obarray (Lisp_Object, void (*) (Lisp_Object, Lisp_Object),
Lisp_Object);
extern void dir_warning (const char *, Lisp_Object);
diff --git a/src/lread.c b/src/lread.c
index 381f9cf20c..a774524ee4 100644
--- a/src/lread.c
+++ b/src/lread.c
@@ -2339,7 +2339,7 @@ character_name_to_code (char const *name, ptrdiff_t name_len)
monstrosities like "U+-0000". */
Lisp_Object code
= (name[0] == 'U' && name[1] == '+'
- ? string_to_number (name + 1, 16, false)
+ ? string_to_number (name + 1, 16, 0)
: call2 (Qchar_from_name, make_unibyte_string (name, name_len), Qt));
if (! RANGED_INTEGERP (0, code, MAX_UNICODE_CHAR)
@@ -2693,7 +2693,7 @@ read_integer (Lisp_Object readcharfun, EMACS_INT radix)
invalid_syntax (buf);
}
- return string_to_number (buf, radix, false);
+ return string_to_number (buf, radix, 0);
}
@@ -3502,7 +3502,7 @@ read1 (Lisp_Object readcharfun, int *pch, bool first_in_list)
if (!quoted && !uninterned_symbol)
{
- Lisp_Object result = string_to_number (read_buffer, 10, false);
+ Lisp_Object result = string_to_number (read_buffer, 10, 0);
if (! NILP (result))
return unbind_to (count, result);
}
@@ -3667,16 +3667,17 @@ substitute_in_interval (INTERVAL interval, void *arg)
}
\f
-/* Convert STRING to a number, assuming base BASE. Return a fixnum if
- STRING has integer syntax and fits in a fixnum, else return the
- nearest float if STRING has either floating point or integer syntax
- and BASE is 10, else return nil. If IGNORE_TRAILING, consider just
- the longest prefix of STRING that has valid floating point syntax.
- Signal an overflow if BASE is not 10 and the number has integer
- syntax but does not fit. */
+/* Convert STRING to a number, assuming base BASE. When STRING has
+ floating point syntax and BASE is 10, return a nearest float. When
+ STRING has integer syntax, return a fixnum if the integer fits, and
+ signal an overflow otherwise (unless BASE is 10 and STRING ends in
+ period or FLAGS & S2N_OVERFLOW_TO_FLOAT is nonzero; in this case,
+ return a nearest float instead). Otherwise, return nil. If FLAGS
+ & S2N_IGNORE_TRAILING is nonzero, consider just the longest prefix
+ of STRING that has valid syntax. */
Lisp_Object
-string_to_number (char const *string, int base, bool ignore_trailing)
+string_to_number (char const *string, int base, int flags)
{
char const *cp = string;
bool float_syntax = 0;
@@ -3759,9 +3760,10 @@ string_to_number (char const *string, int base, bool ignore_trailing)
|| (state & ~INTOVERFLOW) == (LEAD_INT|E_EXP));
}
- /* Return nil if the number uses invalid syntax. If IGNORE_TRAILING, accept
- any prefix that matches. Otherwise, the entire string must match. */
- if (! (ignore_trailing
+ /* Return nil if the number uses invalid syntax. If FLAGS &
+ S2N_IGNORE_TRAILING, accept any prefix that matches. Otherwise,
+ the entire string must match. */
+ if (! (flags & S2N_IGNORE_TRAILING
? ((state & LEAD_INT) != 0 || float_syntax)
: (!*cp && ((state & ~(INTOVERFLOW | DOT_CHAR)) == LEAD_INT
|| float_syntax))))
@@ -3776,7 +3778,7 @@ string_to_number (char const *string, int base, bool ignore_trailing)
/* Unfortunately there's no simple and accurate way to convert
non-base-10 numbers that are out of C-language range. */
if (base != 10)
- xsignal1 (Qoverflow_error, build_string (string));
+ flags = 0;
}
else if (n <= (negative ? -MOST_NEGATIVE_FIXNUM : MOST_POSITIVE_FIXNUM))
{
@@ -3785,6 +3787,9 @@ string_to_number (char const *string, int base, bool ignore_trailing)
}
else
value = n;
+
+ if (! (state & DOT_CHAR) && ! (flags & S2N_OVERFLOW_TO_FLOAT))
+ xsignal1 (Qoverflow_error, build_string (string));
}
/* Either the number uses float syntax, or it does not fit into a fixnum.
diff --git a/src/process.c b/src/process.c
index 2aaa238f60..ed2cab7b51 100644
--- a/src/process.c
+++ b/src/process.c
@@ -6842,7 +6842,7 @@ SIGCODE may be an integer, or a symbol whose name is a signal name. */)
{
Lisp_Object tem = Fget_process (process);
if (NILP (tem))
- tem = string_to_number (SSDATA (process), 10, true);
+ tem = string_to_number (SSDATA (process), 10, S2N_OVERFLOW_TO_FLOAT);
process = tem;
}
else if (!NUMBERP (process))
diff --git a/test/src/editfns-tests.el b/test/src/editfns-tests.el
index 6e1f730166..442ad08937 100644
--- a/test/src/editfns-tests.el
+++ b/test/src/editfns-tests.el
@@ -142,27 +142,41 @@ transpose-test-get-byte-positions
(should (string-equal (format "%#05X" #x10) "0X010"))
(should (string-equal (format "%#04x" 0) "0000")))
-;;; Test Bug#30408.
+
+;;; Tests for Bug#30408.
+
(ert-deftest format-%d-large-float ()
(should (string-equal (format "%d" 18446744073709551616.0)
"18446744073709551616"))
(should (string-equal (format "%d" -18446744073709551616.0)
"-18446744073709551616")))
-;;; Another test for Bug#30408.
;;; Perhaps Emacs will be improved someday to return the correct
;;; answer for positive numbers instead of overflowing; in
-;;; that case this test will need to be changed. In the meantime make
+;;; that case these tests will need to be changed. In the meantime make
;;; sure Emacs is reporting the overflow correctly.
(ert-deftest format-%x-large-float ()
(should-error (format "%x" 18446744073709551616.0)
:type 'overflow-error))
+(ert-deftest read-large-integer ()
+ (should-error (read (format "%d0" most-negative-fixnum))
+ :type 'overflow-error)
+ (should-error (read (format "%+d" (* -8.0 most-negative-fixnum)))
+ :type 'overflow-error)
+ (should-error (read (substring (format "%d" most-negative-fixnum) 1))
+ :type 'overflow-error)
+ (should-error (read (format "#x%x" most-negative-fixnum))
+ :type 'overflow-error)
+ (should-error (read (format "#o%o" most-negative-fixnum))
+ :type 'overflow-error)
+ (should-error (read (format "#32rG%x" most-positive-fixnum))
+ :type 'overflow-error))
-;;; Another test for Bug#30408.
(ert-deftest format-%o-invalid-float ()
(should-error (format "%o" -1e-37)
:type 'overflow-error))
+
;;; Check format-time-string with various TZ settings.
;;; Use only POSIX-compatible TZ values, since the tests should work
;;; even if tzdb is not in use.
--
2.14.3
next prev parent reply other threads:[~2018-03-27 23:19 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-02-18 1:27 Checking for loss of information on integer conversion Paul Eggert
[not found] ` <83y3jq9q4m.fsf@gnu.org>
2018-02-18 20:04 ` bug#30408: " Paul Eggert
2018-02-18 20:04 ` Paul Eggert
2018-02-18 20:24 ` bug#30408: " Eli Zaretskii
2018-02-18 20:24 ` Eli Zaretskii
2018-03-09 5:00 ` bug#30408: " Paul Eggert
2018-03-09 8:22 ` Eli Zaretskii
2018-03-21 19:13 ` Paul Eggert
2018-03-21 19:29 ` Eli Zaretskii
2018-02-18 21:52 ` Drew Adams
2018-03-27 23:19 ` Paul Eggert [this message]
2018-03-29 11:11 ` Eli Zaretskii
2018-03-29 18:09 ` Paul Eggert
2018-02-18 22:31 ` Juliusz Chroboczek
2018-02-18 22:41 ` Stefan Monnier
2018-02-18 23:46 ` Juliusz Chroboczek
2018-02-19 1:47 ` Stefan Monnier
2018-02-19 2:22 ` Paul Eggert
2018-02-19 3:20 ` Drew Adams
2018-02-19 15:05 ` Richard Stallman
2018-02-22 16:31 ` Juliusz Chroboczek
2018-02-22 17:01 ` Eli Zaretskii
2018-02-22 19:31 ` Stefan Monnier
2018-02-23 9:49 ` Richard Stallman
2018-02-19 6:03 ` John Wiegley
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=f1e3e23f-7ac2-9840-403b-e4d76f1ed83c@cs.ucla.edu \
--to=eggert@cs.ucla.edu \
--cc=30408@debbugs.gnu.org \
--cc=eliz@gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this external index
https://git.savannah.gnu.org/cgit/emacs.git
https://git.savannah.gnu.org/cgit/emacs/org-mode.git
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.