From mboxrd@z Thu Jan 1 00:00:00 1970
From: Paul Eggert
Newsgroups: gmane.emacs.bugs
Subject: bug#30408: Checking for loss of information on integer conversion
Date: Thu, 8 Mar 2018 21:00:42 -0800
Organization: UCLA Computer Science Department
Message-ID: <66f719bf-2527-a213-5b8e-18044963f30e@cs.ucla.edu>
References: <7432641a-cedc-942c-d75c-0320fce5ba39@cs.ucla.edu>
 <83y3jq9q4m.fsf@gnu.org>
 <74ac7b77-a756-95a9-b490-6952cf106f21@cs.ucla.edu>
 <83fu5y9hbx.fsf@gnu.org>
In-Reply-To: <83fu5y9hbx.fsf@gnu.org>
Mime-Version: 1.0
Content-Type: multipart/mixed;
 boundary="------------000F536DECDF2BF7DE93EA82"
Content-Language: en-US
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101
 Thunderbird/52.6.0
Cc: 30408@debbugs.gnu.org
To: Eli Zaretskii
X-GNU-PR-Message: followup 30408
X-GNU-PR-Package: emacs
Xref: news.gmane.org gmane.emacs.bugs:144063

This is a multi-part message in MIME format.
--------------000F536DECDF2BF7DE93EA82
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit

Since the qualms expressed on this topic had to do with converting
strings to integers, I installed into master the noncontroversial part
affecting conversion of integers to strings (see attached patch; it also
fixes some minor glitches in the previous proposal). I'll think about
the string-to-integer conversion a bit more and propose an updated patch
for that.

--------------000F536DECDF2BF7DE93EA82
Content-Type: text/x-patch;
 name="0001-Avoid-losing-info-when-formatting-integers.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="0001-Avoid-losing-info-when-formatting-integers.patch"

>From 80e145fc96765cc0a0f48ae2425294c8c92bce56 Mon Sep 17 00:00:00 2001
From: Paul Eggert
Date: Thu, 8 Mar 2018 20:55:55 -0800
Subject: [PATCH] Avoid losing info when formatting integers

* doc/lispref/numbers.texi (Integer Basics): Clarify that
out-of-range integers are treated as floating point only when the
integers are decimal.
* etc/NEWS: Mention changes.
* src/editfns.c (styled_format): Use %.0f when formatting %d or
%i values outside machine integer range, to avoid losing info.
Signal an error for %o or %x values that are too large to be
formatted, to avoid losing info.
---
 doc/lispref/numbers.texi |  5 ++-
 etc/NEWS                 |  7 ++++
 src/editfns.c            | 96 +++++++++++++++++++++---------------------------
 3 files changed, 51 insertions(+), 57 deletions(-)

diff --git a/doc/lispref/numbers.texi b/doc/lispref/numbers.texi
index e692ee1..f1180cf 100644
--- a/doc/lispref/numbers.texi
+++ b/doc/lispref/numbers.texi
@@ -53,8 +53,9 @@ Integer Basics
 chapter assume the minimum integer width of 30 bits.
 @cindex overflow
 
-  The Lisp reader reads an integer as a sequence of digits with optional
-initial sign and optional final period.  An integer that is out of the
+  The Lisp reader reads an integer as a nonempty sequence
+of decimal digits with optional initial sign and optional
+final period.  A decimal integer that is out of the
 Emacs range is treated as a floating-point number.
 
 @example
diff --git a/etc/NEWS b/etc/NEWS
index 07f6d04..14926ba 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -302,6 +302,10 @@ as new-style, bind the new variable 'force-new-style-backquotes' to t.
 'cl-struct-define' whose name clashes with a builtin type (e.g.,
 'integer' or 'hash-table') now signals an error.
 
+** When formatting a floating-point number as an octal or hexadecimal
+integer, Emacs now signals an error if the number is too large for the
+implementation to format (Bug#30408).
+
 
 * Lisp Changes in Emacs 27.1
 
@@ -343,6 +347,9 @@ remote systems, which support this check.
 If the optional third argument is non-nil, 'make-string' will produce
 a multibyte string even if its second argument is an ASCII character.
 
+** (format "%d" X) no longer mishandles a floating-point number X that
+does not fit in a machine integer (Bug#30408).
+
 ** New JSON parsing and serialization functions 'json-serialize',
 'json-insert', 'json-parse-string', and 'json-parse-buffer'.
 These are implemented in C using the Jansson library.
diff --git a/src/editfns.c b/src/editfns.c
index 96bb271..3a34dd0 100644
--- a/src/editfns.c
+++ b/src/editfns.c
@@ -4563,32 +4563,30 @@ styled_format (ptrdiff_t nargs, Lisp_Object *args, bool message)
                  and with pM inserted for integer formats.
                  At most two flags F can be specified at once.  */
               char convspec[sizeof "%FF.*d" + max (INT_AS_LDBL, pMlen)];
-              {
-                char *f = convspec;
-                *f++ = '%';
-                /* MINUS_FLAG and ZERO_FLAG are dealt with later.  */
-                *f = '+'; f += plus_flag;
-                *f = ' '; f += space_flag;
-                *f = '#'; f += sharp_flag;
-                *f++ = '.';
-                *f++ = '*';
-                if (float_conversion)
-                  {
-                    if (INT_AS_LDBL)
-                      {
-                        *f = 'L';
-                        f += INTEGERP (arg);
-                      }
-                  }
-                else if (conversion != 'c')
-                  {
-                    memcpy (f, pMd, pMlen);
-                    f += pMlen;
-                    zero_flag &= ! precision_given;
-                  }
-                *f++ = conversion;
-                *f = '\0';
-              }
+              char *f = convspec;
+              *f++ = '%';
+              /* MINUS_FLAG and ZERO_FLAG are dealt with later.  */
+              *f = '+'; f += plus_flag;
+              *f = ' '; f += space_flag;
+              *f = '#'; f += sharp_flag;
+              *f++ = '.';
+              *f++ = '*';
+              if (float_conversion)
+                {
+                  if (INT_AS_LDBL)
+                    {
+                      *f = 'L';
+                      f += INTEGERP (arg);
+                    }
+                }
+              else if (conversion != 'c')
+                {
+                  memcpy (f, pMd, pMlen);
+                  f += pMlen;
+                  zero_flag &= ! precision_given;
+                }
+              *f++ = conversion;
+              *f = '\0';
 
               int prec = -1;
               if (precision_given)
@@ -4630,29 +4628,20 @@ styled_format (ptrdiff_t nargs, Lisp_Object *args, bool message)
                 }
               else if (conversion == 'd' || conversion == 'i')
                 {
-                  /* For float, maybe we should use "%1.0f"
-                     instead so it also works for values outside
-                     the integer range.  */
-                  printmax_t x;
                   if (INTEGERP (arg))
-                    x = XINT (arg);
+                    {
+                      printmax_t x = XINT (arg);
+                      sprintf_bytes = sprintf (sprintf_buf, convspec, prec, x);
+                    }
                   else
                     {
-                      double d = XFLOAT_DATA (arg);
-                      if (d < 0)
-                        {
-                          x = TYPE_MINIMUM (printmax_t);
-                          if (x < d)
-                            x = d;
-                        }
-                      else
-                        {
-                          x = TYPE_MAXIMUM (printmax_t);
-                          if (d < x)
-                            x = d;
-                        }
+                      strcpy (f - pMlen - 1, "f");
+                      double x = XFLOAT_DATA (arg);
+                      sprintf_bytes = sprintf (sprintf_buf, convspec, 0, x);
+                      char c0 = sprintf_buf[0];
+                      bool signedp = ! ('0' <= c0 && c0 <= '9');
+                      prec = min (precision, sprintf_bytes - signedp);
                     }
-                  sprintf_bytes = sprintf (sprintf_buf, convspec, prec, x);
                 }
               else
                 {
@@ -4663,22 +4652,19 @@ styled_format (ptrdiff_t nargs, Lisp_Object *args, bool message)
                   else
                     {
                       double d = XFLOAT_DATA (arg);
-                      if (d < 0)
-                        x = 0;
-                      else
-                        {
-                          x = TYPE_MAXIMUM (uprintmax_t);
-                          if (d < x)
-                            x = d;
-                        }
+                      double uprintmax = TYPE_MAXIMUM (uprintmax_t);
+                      if (! (0 <= d && d < uprintmax + 1))
+                        xsignal1 (Qoverflow_error, arg);
+                      x = d;
                     }
                   sprintf_bytes = sprintf (sprintf_buf, convspec, prec, x);
                 }
 
               /* Now the length of the formatted item is known, except it omits
                  padding and excess precision.  Deal with excess precision
-                 first.  This happens only when the format specifies
-                 ridiculously large precision.  */
+                 first.  This happens when the format specifies ridiculously
+                 large precision, or when %d or %i formats a float that would
+                 ordinarily need fewer digits than a specified precision.  */
               ptrdiff_t excess_precision
                 = precision_given ? precision - prec : 0;
               ptrdiff_t leading_zeros = 0, trailing_zeros = 0;
-- 
2.7.4

--------------000F536DECDF2BF7DE93EA82--
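
For readers following the thread, the two behaviors described in the
commit message can be sketched in standalone C, outside Emacs. This is
only an illustration under stated assumptions: format_float_as_decimal
and float_to_uintmax are hypothetical names that do not appear in
src/editfns.c, uintmax_t stands in for Emacs's uprintmax_t, and a false
return value stands in for signaling overflow-error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, not part of Emacs: format D as a decimal
   integer with "%.0f", so that a value outside the machine integer
   range keeps its magnitude instead of being clamped to the largest
   or smallest integer.  Returns what snprintf returns: the number of
   bytes needed, excluding the terminating NUL.  */
int
format_float_as_decimal (char *buf, size_t bufsize, double d)
{
  assert (buf != NULL && bufsize > 0);
  return snprintf (buf, bufsize, "%.0f", d);
}

/* Hypothetical helper, not part of Emacs: convert D to uintmax_t for
   %o or %x style output.  Returns false and leaves *OUT untouched when
   D is negative, NaN, or too large, mirroring the range test in the
   patch; the real code signals an error at that point instead.  */
bool
float_to_uintmax (double d, uintmax_t *out)
{
  /* UINTMAX_MAX is not exactly representable as a double; on typical
     hosts the conversion rounds up to 2**64, so the test below accepts
     exactly the values that fit.  */
  double umax = (double) UINTMAX_MAX;
  if (! (0 <= d && d < umax + 1))
    return false;
  *out = (uintmax_t) d;   /* truncates any fractional part */
  return true;
}
```

Writing the range test in negated form, as the patch does, makes NaN
fail it, because every comparison with NaN is false; a NaN argument is
therefore rejected rather than converted.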