From mboxrd@z Thu Jan 1 00:00:00 1970
From: Paul Eggert
Newsgroups: gmane.emacs.bugs
Subject: bug#30408: Checking for loss of information on integer conversion
Date: Tue, 27 Mar 2018 16:19:21 -0700
Organization: UCLA Computer Science Department
References: <7432641a-cedc-942c-d75c-0320fce5ba39@cs.ucla.edu> <83y3jq9q4m.fsf@gnu.org>
In-Reply-To: <83y3jq9q4m.fsf@gnu.org>
To: Eli Zaretskii, 30408@debbugs.gnu.org
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="------------C8997759E81B0890A121F27F"

This is a multi-part message in MIME format.
--------------C8997759E81B0890A121F27F
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit

Here's a patch that I hope addresses the main problem. The basic idea is
to avoid the confusion exemplified in Bug#30408 by changing Emacs so
that it ordinarily signals an error if it reads a program that contains
an integer literal that is out of fixnum range. However, if the
out-of-range literal is followed by '.' then Emacs continues to silently
convert it to floating-point; this is intended as an escape hatch for
any programs that need the old behavior (I expect this'll be rare).

Thus, on 32-bit Emacs, plain '536870912' in a program causes Emacs to
signal an overflow while loading the program, whereas '536870912.' is
treated as a floating-point number as before. (On 64-bit Emacs, the same
two literals are both integers, as before.)

Unlike my previous proposal, this patch does not affect the behavior of
string-to-number. As I understand it, that was a primary source of
qualms about the previous proposal.

I've tested this on both 32- and 64-bit Emacs on master. This patch has
helped me to find a couple of integer portability bugs which I have
already fixed on master.

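To make the rule above concrete, here is a minimal C sketch of the
decision the reader makes for a plain decimal integer literal. It is not
part of the attached patch and does not use the Emacs sources: the names
literal_kind and classify_decimal_literal are hypothetical, the fixnum
range is passed in as a parameter, and the input is assumed to be an
optional sign, decimal digits, and an optional final '.'.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result kinds, for illustration only; these names do not
   appear in the Emacs sources.  */
enum literal_kind { LITERAL_FIXNUM, LITERAL_FLOAT, LITERAL_OVERFLOW };

/* Classify the decimal integer literal TEXT for a build whose fixnums
   span [MIN_FIXNUM, MAX_FIXNUM].  TEXT is assumed to be an optional
   sign, decimal digits, and an optional final '.'.  */
enum literal_kind
classify_decimal_literal (char const *text,
                          long long min_fixnum, long long max_fixnum)
{
  assert (text != NULL);

  size_t len = strlen (text);
  bool trailing_dot = len > 0 && text[len - 1] == '.';

  /* strtoll stops at the final '.', if any, and sets errno to ERANGE
     when the digits do not fit in a long long.  */
  errno = 0;
  long long n = strtoll (text, NULL, 10);
  bool fits = errno != ERANGE && min_fixnum <= n && n <= max_fixnum;

  if (fits)
    return LITERAL_FIXNUM;

  /* Out of range: a final '.' is the escape hatch that keeps the old
     conversion to floating point; otherwise report an overflow.  */
  return trailing_dot ? LITERAL_FLOAT : LITERAL_OVERFLOW;
}
```

With the 30-bit range of a 32-bit build (-536870912 through 536870911),
this sketch classifies "536870912" as LITERAL_OVERFLOW and "536870912."
as LITERAL_FLOAT. With the 62-bit range of a 64-bit build, both are
LITERAL_FIXNUM, matching the examples in the message.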
--------------C8997759E81B0890A121F27F
Content-Type: text/x-patch;
 name="0001-Lisp-reader-now-checks-for-integer-overflow.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="0001-Lisp-reader-now-checks-for-integer-overflow.patch"

>From 94b7a1a171de3113cd5250315dee7bdef5f51890 Mon Sep 17 00:00:00 2001
From: Paul Eggert
Date: Tue, 27 Mar 2018 15:48:47 -0700
Subject: [PATCH] Lisp reader now checks for integer overflow

* doc/lispref/numbers.texi (Integer Basics), etc/NEWS: Document this.
* src/lisp.h (S2N_IGNORE_TRAILING, S2N_OVERFLOW_TO_FLOAT):
New constants.
* src/lread.c (string_to_number): Change trailing bool arg to
integer argument with flags, to support S2N_OVERFLOW_TO_FLOAT.
All uses changed.
* test/src/editfns-tests.el (read-large-integer): New test.
---
 doc/lispref/numbers.texi  | 14 ++++++++++----
 etc/NEWS                  |  7 +++++++
 src/data.c                |  9 ++++-----
 src/lisp.h                |  3 ++-
 src/lread.c               | 35 ++++++++++++++++++++---------------
 src/process.c             |  2 +-
 test/src/editfns-tests.el | 22 ++++++++++++++++++----
 7 files changed, 62 insertions(+), 30 deletions(-)

diff --git a/doc/lispref/numbers.texi b/doc/lispref/numbers.texi
index c2cb6651d4..2fed2b642f 100644
--- a/doc/lispref/numbers.texi
+++ b/doc/lispref/numbers.texi
@@ -55,16 +55,13 @@ Integer Basics
 
   The Lisp reader reads an integer as a nonempty sequence
 of decimal digits with optional initial sign and optional
-final period. A decimal integer that is out of the
-Emacs range is treated as a floating-point number.
+final period.
 
 @example
  1               ; @r{The integer 1.}
  1.              ; @r{The integer 1.}
 +1               ; @r{Also the integer 1.}
 -1               ; @r{The integer @minus{}1.}
- 9000000000000000000
-                 ; @r{The floating-point number 9e18.}
  0               ; @r{The integer 0.}
 -0               ; @r{The integer 0.}
 @end example
@@ -94,6 +91,15 @@ Integer Basics
 #24r1k @result{} 44
 @end example
 
+  If an integer is outside the Emacs range, the Lisp reader ordinarily
+signals an overflow. However, if a too-large plain integer ends in a
+period, the Lisp reader treats it as a floating-point number instead.
+This lets an Emacs Lisp program specify a large integer that is
+quietly approximated by a floating-point number on machines with
+limited word width. For example, @samp{536870912.} is a
+floating-point number if Emacs integers are only 30 bits wide and is
+an integer otherwise.
+
   To understand how various functions work on integers, especially the
 bitwise operators (@pxref{Bitwise Operations}), it is often helpful to
 view the numbers in their binary form.
diff --git a/etc/NEWS b/etc/NEWS
index fd1d04b8a0..cb74f512cc 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -349,6 +349,13 @@ as new-style, bind the new variable 'force-new-style-backquotes' to t.
 integer, Emacs now signals an error if the number is too large for the
 implementation to format (Bug#30408).
 
++++
+** The Lisp reader now signals an overflow for plain decimal integers
+that do not end in '.' and are outside Emacs range. Formerly the Lisp
+reader silently converted them to floating-point numbers, and signaled
+overflow only for integers with a radix that are outside machine range
+(Bug#30408).
+
 ---
 ** Some functions and variables obsolete since Emacs 22 have been removed:
 archive-mouse-extract, assoc-ignore-case, assoc-ignore-representation,
diff --git a/src/data.c b/src/data.c
index a7fab1ef58..6f23a26757 100644
--- a/src/data.c
+++ b/src/data.c
@@ -2716,9 +2716,7 @@ present, base 10 is used. BASE must be between 2 and 16 (inclusive).
 If the base used is not 10, STRING is always parsed as an integer. */)
   (register Lisp_Object string, Lisp_Object base)
 {
-  register char *p;
-  register int b;
-  Lisp_Object val;
+  int b;
 
   CHECK_STRING (string);
 
@@ -2732,11 +2730,12 @@ If the base used is not 10, STRING is always parsed as an integer. */)
       b = XINT (base);
     }
 
-  p = SSDATA (string);
+  char *p = SSDATA (string);
   while (*p == ' ' || *p == '\t')
     p++;
 
-  val = string_to_number (p, b, true);
+  int flags = S2N_IGNORE_TRAILING | S2N_OVERFLOW_TO_FLOAT;
+  Lisp_Object val = string_to_number (p, b, flags);
   return NILP (val) ? make_number (0) : val;
 }
 
diff --git a/src/lisp.h b/src/lisp.h
index f0c0c5a14a..b931d23bf3 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -3899,7 +3899,8 @@ LOADHIST_ATTACH (Lisp_Object x)
 }
 extern int openp (Lisp_Object, Lisp_Object, Lisp_Object,
                   Lisp_Object *, Lisp_Object, bool);
-extern Lisp_Object string_to_number (char const *, int, bool);
+enum { S2N_IGNORE_TRAILING = 1, S2N_OVERFLOW_TO_FLOAT = 2 };
+extern Lisp_Object string_to_number (char const *, int, int);
 extern void map_obarray (Lisp_Object,
                          void (*) (Lisp_Object, Lisp_Object), Lisp_Object);
 extern void dir_warning (const char *, Lisp_Object);
diff --git a/src/lread.c b/src/lread.c
index 381f9cf20c..a774524ee4 100644
--- a/src/lread.c
+++ b/src/lread.c
@@ -2339,7 +2339,7 @@ character_name_to_code (char const *name, ptrdiff_t name_len)
      monstrosities like "U+-0000". */
   Lisp_Object code
     = (name[0] == 'U' && name[1] == '+'
-       ? string_to_number (name + 1, 16, false)
+       ? string_to_number (name + 1, 16, 0)
        : call2 (Qchar_from_name, make_unibyte_string (name, name_len), Qt));
 
   if (! RANGED_INTEGERP (0, code, MAX_UNICODE_CHAR)
@@ -2693,7 +2693,7 @@ read_integer (Lisp_Object readcharfun, EMACS_INT radix)
       invalid_syntax (buf);
     }
 
-  return string_to_number (buf, radix, false);
+  return string_to_number (buf, radix, 0);
 }
 
 
@@ -3502,7 +3502,7 @@ read1 (Lisp_Object readcharfun, int *pch, bool first_in_list)
 
         if (!quoted && !uninterned_symbol)
           {
-            Lisp_Object result = string_to_number (read_buffer, 10, false);
+            Lisp_Object result = string_to_number (read_buffer, 10, 0);
             if (! NILP (result))
               return unbind_to (count, result);
           }
@@ -3667,16 +3667,17 @@ substitute_in_interval (INTERVAL interval, void *arg)
 }
 
 
-/* Convert STRING to a number, assuming base BASE. Return a fixnum if
-   STRING has integer syntax and fits in a fixnum, else return the
-   nearest float if STRING has either floating point or integer syntax
-   and BASE is 10, else return nil. If IGNORE_TRAILING, consider just
-   the longest prefix of STRING that has valid floating point syntax.
-   Signal an overflow if BASE is not 10 and the number has integer
-   syntax but does not fit. */
+/* Convert STRING to a number, assuming base BASE. When STRING has
+   floating point syntax and BASE is 10, return a nearest float. When
+   STRING has integer syntax, return a fixnum if the integer fits, and
+   signal an overflow otherwise (unless BASE is 10 and STRING ends in
+   period or FLAGS & S2N_OVERFLOW_TO_FLOAT is nonzero; in this case,
+   return a nearest float instead). Otherwise, return nil. If FLAGS
+   & S2N_IGNORE_TRAILING is nonzero, consider just the longest prefix
+   of STRING that has valid syntax. */
 
 Lisp_Object
-string_to_number (char const *string, int base, bool ignore_trailing)
+string_to_number (char const *string, int base, int flags)
 {
   char const *cp = string;
   bool float_syntax = 0;
@@ -3759,9 +3760,10 @@ string_to_number (char const *string, int base, bool ignore_trailing)
                       || (state & ~INTOVERFLOW) == (LEAD_INT|E_EXP));
     }
 
-  /* Return nil if the number uses invalid syntax. If IGNORE_TRAILING, accept
-     any prefix that matches. Otherwise, the entire string must match. */
-  if (! (ignore_trailing
+  /* Return nil if the number uses invalid syntax. If FLAGS &
+     S2N_IGNORE_TRAILING, accept any prefix that matches. Otherwise,
+     the entire string must match. */
+  if (! (flags & S2N_IGNORE_TRAILING
          ? ((state & LEAD_INT) != 0 || float_syntax)
          : (!*cp && ((state & ~(INTOVERFLOW | DOT_CHAR)) == LEAD_INT
                      || float_syntax))))
@@ -3776,7 +3778,7 @@ string_to_number (char const *string, int base, bool ignore_trailing)
           /* Unfortunately there's no simple and accurate way to convert
              non-base-10 numbers that are out of C-language range. */
           if (base != 10)
-            xsignal1 (Qoverflow_error, build_string (string));
+            flags = 0;
         }
       else if (n <= (negative ? -MOST_NEGATIVE_FIXNUM : MOST_POSITIVE_FIXNUM))
         {
@@ -3785,6 +3787,9 @@ string_to_number (char const *string, int base, bool ignore_trailing)
         }
       else
         value = n;
+
+      if (! (state & DOT_CHAR) && ! (flags & S2N_OVERFLOW_TO_FLOAT))
+        xsignal1 (Qoverflow_error, build_string (string));
     }
 
   /* Either the number uses float syntax, or it does not fit into a fixnum.
diff --git a/src/process.c b/src/process.c
index 2aaa238f60..ed2cab7b51 100644
--- a/src/process.c
+++ b/src/process.c
@@ -6842,7 +6842,7 @@ SIGCODE may be an integer, or a symbol whose name is a signal name. */)
     {
       Lisp_Object tem = Fget_process (process);
       if (NILP (tem))
-        tem = string_to_number (SSDATA (process), 10, true);
+        tem = string_to_number (SSDATA (process), 10, S2N_OVERFLOW_TO_FLOAT);
       process = tem;
     }
   else if (!NUMBERP (process))
diff --git a/test/src/editfns-tests.el b/test/src/editfns-tests.el
index 6e1f730166..442ad08937 100644
--- a/test/src/editfns-tests.el
+++ b/test/src/editfns-tests.el
@@ -142,27 +142,41 @@ transpose-test-get-byte-positions
   (should (string-equal (format "%#05X" #x10) "0X010"))
   (should (string-equal (format "%#04x" 0) "0000")))
 
-;;; Test Bug#30408.
+
+;;; Tests for Bug#30408.
+
 (ert-deftest format-%d-large-float ()
   (should (string-equal (format "%d" 18446744073709551616.0)
                         "18446744073709551616"))
   (should (string-equal (format "%d" -18446744073709551616.0)
                         "-18446744073709551616")))
 
-;;; Another test for Bug#30408.
 ;;; Perhaps Emacs will be improved someday to return the correct
 ;;; answer for positive numbers instead of overflowing; in
-;;; that case this test will need to be changed. In the meantime make
+;;; that case these tests will need to be changed. In the meantime make
 ;;; sure Emacs is reporting the overflow correctly.
 (ert-deftest format-%x-large-float ()
   (should-error (format "%x" 18446744073709551616.0)
                 :type 'overflow-error))
+(ert-deftest read-large-integer ()
+  (should-error (read (format "%d0" most-negative-fixnum))
+                :type 'overflow-error)
+  (should-error (read (format "%+d" (* -8.0 most-negative-fixnum)))
+                :type 'overflow-error)
+  (should-error (read (substring (format "%d" most-negative-fixnum) 1))
+                :type 'overflow-error)
+  (should-error (read (format "#x%x" most-negative-fixnum))
+                :type 'overflow-error)
+  (should-error (read (format "#o%o" most-negative-fixnum))
+                :type 'overflow-error)
+  (should-error (read (format "#32rG%x" most-positive-fixnum))
+                :type 'overflow-error))
 
-;;; Another test for Bug#30408.
 (ert-deftest format-%o-invalid-float ()
   (should-error (format "%o" -1e-37)
                 :type 'overflow-error))
 
+
 ;;; Check format-time-string with various TZ settings.
 ;;; Use only POSIX-compatible TZ values, since the tests should work
 ;;; even if tzdb is not in use.
-- 
2.14.3


--------------C8997759E81B0890A121F27F--