From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.org!.POSTED!not-for-mail
From: Paul Eggert
Newsgroups: gmane.emacs.bugs
Subject: bug#30408: 24.5; (format "%x" large-number) produces incorrect results
Date: Sat, 17 Feb 2018 17:08:22 -0800
Organization: UCLA Computer Science Department
Message-ID: <6e24717b-8fad-b9bf-0c92-d3c9d958bfc6@cs.ucla.edu>
References: <60284cf8-d1b4-a1c6-5d06-a21a7085c89c@cs.ucla.edu> <856eb7d4-ed47-4a5c-8747-6334ab37638a@default>
In-Reply-To: <856eb7d4-ed47-4a5c-8747-6334ab37638a@default>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="------------AC1E2F59C65988B6D17F2792"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0
Cc: 30408@debbugs.gnu.org
To: Drew Adams, David Sitsky
Content-Language: en-US
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Xref: news.gmane.org gmane.emacs.bugs:143405

This is a multi-part message in MIME format.
--------------AC1E2F59C65988B6D17F2792
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit

This kind of bug has bitten me before, so I think it's worthwhile for Emacs to
defend against it better. Proposed patch attached. Although this patch doesn't
address the major problem here (which is that Emacs lacks bignums), it does
cause Emacs to respond better to large numbers, by not losing information when
it is reading or printing integers.

With this patch, one cannot evaluate (format "%x" 2738188573457603759) because
the Lisp reader signals an error when it sees the unrepresentable integer
2738188573457603759, instead of silently substituting a different number.
Another example: (format "%d" 18446744073709551616) now returns
"18446744073709551616" instead of the quite-wrong "9223372036854775807".

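To make the reader-side check concrete, here is a minimal standalone sketch in C of the round-trip idea: convert the digit string to a double, print the double back with "%.0f", and compare the digits. The function name converts_exactly is hypothetical and is not part of Emacs. The sketch also assumes a C library (glibc, for example) whose printf prints the exact decimal value of a double.

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not an Emacs function: return true if the
   unsigned decimal digit string DIGITS converts to a double without
   losing information.  Assumes printf prints doubles exactly.  */
bool
converts_exactly (char const *digits)
{
  assert (digits != NULL);
  double value = atof (digits);

  /* Skip leading zeros, which "%.0f" never prints, but keep a lone "0".  */
  while (*digits == '0' && digits[1] != '\0')
    digits++;

  /* The largest finite double has DBL_MAX_10_EXP + 1 integer digits;
     one more byte holds the terminating NUL.  */
  char checkbuf[DBL_MAX_10_EXP + 2];
  int len = snprintf (checkbuf, sizeof checkbuf, "%.0f", value);

  /* Information was lost unless the printed digits match the input.  */
  return (0 <= len && (size_t) len < sizeof checkbuf
          && strcmp (digits, checkbuf) == 0);
}
```

Under those assumptions, converts_exactly ("18446744073709551616") is true because 2**64 is exactly representable as a double, whereas converts_exactly ("2738188573457603759") is false because doubles near that magnitude are spaced 512 apart and the number is odd.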
--------------AC1E2F59C65988B6D17F2792
Content-Type: text/x-patch;
 name="0001-Avoid-losing-info-when-converting-integers.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="0001-Avoid-losing-info-when-converting-integers.patch"

From e1865be990e1a520feddc07507a71916d097d633 Mon Sep 17 00:00:00 2001
From: Paul Eggert
Date: Sat, 17 Feb 2018 16:45:17 -0800
Subject: [PATCH] Avoid losing info when converting integers

This fixes some glitches with large integers (Bug#30408).
* doc/lispref/numbers.texi (Integer Basics): Say that
decimal integers out of fixnum range must be representable exactly
as floating-point.
* etc/NEWS: Mention this.
* src/data.c (syms_of_data): Add Qinexact_error.
* src/editfns.c (styled_format): Use %.0f when formatting %d or %i
values outside machine integer range, to avoid losing info.
Signal an error for %o or %x values that are too large to be
formatted, to avoid losing info.
* src/lread.c (string_to_number): When converting an
integer-format string to floating-point, signal an error if
info is lost.
---
 doc/lispref/numbers.texi |  8 +++--
 etc/NEWS                 |  9 +++++
 src/data.c               |  1 +
 src/editfns.c            | 93 ++++++++++++++++++++----------------------------
 src/lread.c              | 14 ++++++++
 5 files changed, 67 insertions(+), 58 deletions(-)

diff --git a/doc/lispref/numbers.texi b/doc/lispref/numbers.texi
index e692ee1cc2..252aafd8fd 100644
--- a/doc/lispref/numbers.texi
+++ b/doc/lispref/numbers.texi
@@ -53,9 +53,11 @@ Integer Basics
 chapter assume the minimum integer width of 30 bits.
 @cindex overflow
 
-  The Lisp reader reads an integer as a sequence of digits with optional
-initial sign and optional final period. An integer that is out of the
-Emacs range is treated as a floating-point number.
+  The Lisp reader can read an integer as a nonempty sequence of
+decimal digits with optional initial sign and optional final period.
+A decimal integer that is out of the Emacs range is treated as
+floating-point if it can be represented exactly as a floating-point
+number.
 
 @example
 1               ; @r{The integer 1.}
diff --git a/etc/NEWS b/etc/NEWS
index 8db638e5ed..36cbcf6500 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -248,6 +248,12 @@ as new-style, bind the new variable 'force-new-style-backquotes' to t.
 'cl-struct-define' whose name clashes with a builtin type (e.g.,
 'integer' or 'hash-table') now signals an error.
 
+** When formatting a floating-point number as an octal or hexadecimal
+integer, Emacs now signals an error if the number is too large for the
+implementation to format. When reading an integer outside Emacs
+fixnum range, Emacs now signals an error if the integer cannot be
+represented exactly as a floating-point number. See Bug#30408.
+
 ^L
 * Lisp Changes in Emacs 27.1
 
@@ -289,6 +295,9 @@ remote systems, which support this check.
 If the optional third argument is non-nil, 'make-string' will produce
 a multibyte string even if its second argument is an ASCII character.
 
+** (format "%d" X) no longer mishandles floating-point X values that
+do not fit in a machine integer (Bug#30408).
+
 ** New JSON parsing and serialization functions 'json-serialize',
 'json-insert', 'json-parse-string', and 'json-parse-buffer'. These
 are implemented in C using the Jansson library.
diff --git a/src/data.c b/src/data.c
index 72abfefb01..8856583f13 100644
--- a/src/data.c
+++ b/src/data.c
@@ -3729,6 +3729,7 @@ syms_of_data (void)
   DEFSYM (Qrange_error, "range-error");
   DEFSYM (Qdomain_error, "domain-error");
   DEFSYM (Qsingularity_error, "singularity-error");
+  DEFSYM (Qinexact_error, "inexact-error");
   DEFSYM (Qoverflow_error, "overflow-error");
   DEFSYM (Qunderflow_error, "underflow-error");
 
diff --git a/src/editfns.c b/src/editfns.c
index 96bb271b2d..d26549ddb8 100644
--- a/src/editfns.c
+++ b/src/editfns.c
@@ -4563,32 +4563,30 @@ styled_format (ptrdiff_t nargs, Lisp_Object *args, bool message)
                  and with pM inserted for integer formats.
                  At most two flags F can be specified at once. */
               char convspec[sizeof "%FF.*d" + max (INT_AS_LDBL, pMlen)];
-              {
-                char *f = convspec;
-                *f++ = '%';
-                /* MINUS_FLAG and ZERO_FLAG are dealt with later. */
-                *f = '+'; f += plus_flag;
-                *f = ' '; f += space_flag;
-                *f = '#'; f += sharp_flag;
-                *f++ = '.';
-                *f++ = '*';
-                if (float_conversion)
-                  {
-                    if (INT_AS_LDBL)
-                      {
-                        *f = 'L';
-                        f += INTEGERP (arg);
-                      }
-                  }
-                else if (conversion != 'c')
-                  {
-                    memcpy (f, pMd, pMlen);
-                    f += pMlen;
-                    zero_flag &= ! precision_given;
-                  }
-                *f++ = conversion;
-                *f = '\0';
-              }
+              char *f = convspec;
+              *f++ = '%';
+              /* MINUS_FLAG and ZERO_FLAG are dealt with later. */
+              *f = '+'; f += plus_flag;
+              *f = ' '; f += space_flag;
+              *f = '#'; f += sharp_flag;
+              *f++ = '.';
+              *f++ = '*';
+              if (float_conversion)
+                {
+                  if (INT_AS_LDBL)
+                    {
+                      *f = 'L';
+                      f += INTEGERP (arg);
+                    }
+                }
+              else if (conversion != 'c')
+                {
+                  memcpy (f, pMd, pMlen);
+                  f += pMlen;
+                  zero_flag &= ! precision_given;
+                }
+              *f++ = conversion;
+              *f = '\0';
 
               int prec = -1;
               if (precision_given)
@@ -4630,29 +4628,18 @@ styled_format (ptrdiff_t nargs, Lisp_Object *args, bool message)
                 }
               else if (conversion == 'd' || conversion == 'i')
                 {
-                  /* For float, maybe we should use "%1.0f"
-                     instead so it also works for values outside
-                     the integer range. */
-                  printmax_t x;
                   if (INTEGERP (arg))
-                    x = XINT (arg);
+                    {
+                      printmax_t x = XINT (arg);
+                      sprintf_bytes = sprintf (sprintf_buf, convspec, prec, x);
+                    }
                   else
                     {
-                      double d = XFLOAT_DATA (arg);
-                      if (d < 0)
-                        {
-                          x = TYPE_MINIMUM (printmax_t);
-                          if (x < d)
-                            x = d;
-                        }
-                      else
-                        {
-                          x = TYPE_MAXIMUM (printmax_t);
-                          if (d < x)
-                            x = d;
-                        }
+                      strcpy (f - pMlen - 1, "f");
+                      prec = 0;
+                      double x = XFLOAT_DATA (arg);
+                      sprintf_bytes = sprintf (sprintf_buf, convspec, prec, x);
                     }
-                  sprintf_bytes = sprintf (sprintf_buf, convspec, prec, x);
                 }
               else
                 {
@@ -4663,22 +4650,18 @@ styled_format (ptrdiff_t nargs, Lisp_Object *args, bool message)
                   else
                     {
                       double d = XFLOAT_DATA (arg);
-                      if (d < 0)
-                        x = 0;
-                      else
-                        {
-                          x = TYPE_MAXIMUM (uprintmax_t);
-                          if (d < x)
-                            x = d;
-                        }
+                      if (! (0 <= d && d < TYPE_MAXIMUM (uprintmax_t)))
+                        xsignal1 (Qoverflow_error, arg);
+                      x = d;
                     }
                   sprintf_bytes = sprintf (sprintf_buf, convspec, prec, x);
                 }
 
               /* Now the length of the formatted item is known, except it omits
                  padding and excess precision. Deal with excess precision
-                 first. This happens only when the format specifies
-                 ridiculously large precision. */
+                 first. This happens when the format specifies
+                 ridiculously large precision, or when %d or %i has
+                 nonzero precision and formats a float. */
               ptrdiff_t excess_precision
                 = precision_given ? precision - prec : 0;
               ptrdiff_t leading_zeros = 0, trailing_zeros = 0;
diff --git a/src/lread.c b/src/lread.c
index d009bd0cd2..cfeaac8030 100644
--- a/src/lread.c
+++ b/src/lread.c
@@ -3794,6 +3794,20 @@ string_to_number (char const *string, int base, bool ignore_trailing)
   if (! value)
     value = atof (string + signedp);
 
+  if (! float_syntax)
+    {
+      /* Check that converting the integer-format STRING to a
+         floating-point number does not lose info. See Bug#30408. */
+      char const *bp = string + signedp;
+      while (*bp == '0')
+        bp++;
+      char checkbuf[DBL_MAX_10_EXP + 2];
+      int checkbuflen = sprintf (checkbuf, "%.0f", value);
+      if (! (cp - bp - !!(state & DOT_CHAR) == checkbuflen
+             && memcmp (bp, checkbuf, checkbuflen) == 0))
+        xsignal1 (Qinexact_error, build_string (string));
+    }
+
   return make_float (negative ? -value : value);
 }
 
-- 
2.14.3

--------------AC1E2F59C65988B6D17F2792--
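
The formatting side of the patch can be illustrated the same way. The following is a minimal standalone sketch in C, not the Emacs code itself: a float formatted as a decimal integer goes through "%.0f" instead of being clamped to the machine integer range, and a float formatted as hexadecimal is rejected when it does not fit in uintmax_t. The names format_float_as_decimal and format_float_as_hex are hypothetical, a false return stands in for the overflow-error signal, and exact output for large values assumes a C library such as glibc that prints doubles exactly.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format D as a decimal integer using "%.0f",
   so values beyond the machine integer range are not clamped.
   Return true if the result fit in BUF.  */
bool
format_float_as_decimal (char *buf, size_t bufsize, double d)
{
  assert (buf != NULL && 0 < bufsize);
  int len = snprintf (buf, bufsize, "%.0f", d);
  return 0 <= len && (size_t) len < bufsize;
}

/* Hypothetical helper: format D as a hexadecimal integer.  Return
   false, rather than silently substituting a clamped value, when D
   is negative, NaN, or too large for uintmax_t.  */
bool
format_float_as_hex (char *buf, size_t bufsize, double d)
{
  assert (buf != NULL && 0 < bufsize);

  /* Written as a negated conjunction so that NaN is rejected too.
     UINTMAX_MAX rounds up to a power of two when converted to double,
     so any D that passes can be converted to uintmax_t safely.  */
  if (! (0 <= d && d < (double) UINTMAX_MAX))
    return false;

  uintmax_t x = (uintmax_t) d;
  int len = snprintf (buf, bufsize, "%" PRIxMAX, x);
  return 0 <= len && (size_t) len < bufsize;
}
```

With a 64-byte buffer, format_float_as_decimal given 18446744073709551616.0 is expected to produce "18446744073709551616" under the glibc assumption above, and format_float_as_hex given 1e30 returns false instead of printing a saturated value.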