bug#30408: 24.5; (format "%x" large-number) produces incorrect results

unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed

* bug#30408: 24.5; (format "%x" large-number) produces incorrect results
@ 2018-02-10  6:22 David Sitsky
  2018-02-12  2:49 ` Paul Eggert
  2018-05-06  6:29 ` bug#30408: More context in `read-integer-overflow-as-float'? Bastien
  0 siblings, 2 replies; 5+ messages in thread
From: David Sitsky @ 2018-02-10  6:22 UTC (permalink / raw)
  To: 30408

[-- Attachment #1: Type: text/plain, Size: 970 bytes --]

I wrote this originally on
https://emacs.stackexchange.com/questions/38710/why-does-format-x-some-large-number-produces-incorrect-results
and a poster recommended I mention this here.

I wanted the hexadecimal string for a large integer such as below:

(format "%x" 2738188573457603759)

This returns 2600000000f95c00 which is incorrect, it should be
2600000000f95caf.

The value of most-positive-fixnum on my box is 0x1fffffffffffffff which is
less than the number I'm supplying above.

As a user I'm a bit baffled what is happening. The manual indicates
integers larger than this range are converted to a floating-point number
which is a concern for precision but I suspect this is what is biting me
here?

I should have known there was an issue with this number since normally I
evaluate them directly using eval-last-sexp and it didn't show the
octal/hex variants.. :)

I wonder why Emacs Lisp doesn't support bignums by default, so precision
would not be an issue?

[-- Attachment #2: Type: text/html, Size: 1193 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* bug#30408: 24.5; (format "%x" large-number) produces incorrect results
  2018-02-10  6:22 bug#30408: 24.5; (format "%x" large-number) produces incorrect results David Sitsky
@ 2018-02-12  2:49 ` Paul Eggert
  2018-02-12  4:56   ` Drew Adams
  2018-05-06  6:29 ` bug#30408: More context in `read-integer-overflow-as-float'? Bastien
  1 sibling, 1 reply; 5+ messages in thread
From: Paul Eggert @ 2018-02-12  2:49 UTC (permalink / raw)
  To: David Sitsky; +Cc: 30408

> I wonder why Emacs Lisp doesn't support bignums by default, so precision
> would not be an issue?

Nobody has gotten around to implementing it. It'd be nice if someone did. It is 
a wishlist item, so I'll mark this bug report as wishlist priority.





^ permalink raw reply	[flat|nested] 5+ messages in thread

* bug#30408: 24.5; (format "%x" large-number) produces incorrect results
  2018-02-12  2:49 ` Paul Eggert
@ 2018-02-12  4:56   ` Drew Adams
  2018-02-18  1:08     ` Paul Eggert
  0 siblings, 1 reply; 5+ messages in thread
From: Drew Adams @ 2018-02-12  4:56 UTC (permalink / raw)
  To: Paul Eggert, David Sitsky; +Cc: 30408

> > I wonder why Emacs Lisp doesn't support bignums by default, so
> > precision would not be an issue?
> 
> Nobody has gotten around to implementing it. It'd be nice if someone did.
> It is a wishlist item, so I'll mark this bug report as wishlist priority.

It's probably a duplicate bug.  This has been requested
in the past - perhaps more than once.  And there has
been some discussion of it.  I don't have a reference,
however.





^ permalink raw reply	[flat|nested] 5+ messages in thread

* bug#30408: 24.5; (format "%x" large-number) produces incorrect results
  2018-02-12  4:56   ` Drew Adams
@ 2018-02-18  1:08     ` Paul Eggert
  0 siblings, 0 replies; 5+ messages in thread
From: Paul Eggert @ 2018-02-18  1:08 UTC (permalink / raw)
  To: Drew Adams, David Sitsky; +Cc: 30408

[-- Attachment #1: Type: text/plain, Size: 733 bytes --]

This kind of bug has bitten me before, so I think it's worthwhile for Emacs to 
defend against it better. Proposed patch attached. Although this patch doesn't 
address the major problem here (which is that Emacs lacks bignums), it does 
cause Emacs to respond better to large numbers, by not losing information when 
it is reading or printing integers.

With this patch, one cannot evaluate (format "%x" 2738188573457603759) because 
the Lisp reader signals an error when it sees the unrepresentable integer 
2738188573457603759, instead of silently substituting a different number. 
Another example: (format "%d" 18446744073709551616) now returns 
"18446744073709551616" instead of the quite-wrong "9223372036854775807".

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Avoid-losing-info-when-converting-integers.patch --]
[-- Type: text/x-patch; name="0001-Avoid-losing-info-when-converting-integers.patch", Size: 8127 bytes --]

From e1865be990e1a520feddc07507a71916d097d633 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sat, 17 Feb 2018 16:45:17 -0800
Subject: [PATCH] Avoid losing info when converting integers

This fixes some glitches with large integers (Bug#30408).
* doc/lispref/numbers.texi (Integer Basics): Say that
decimal integers out of fixnum range must be representable
exactly as floating-point.
* etc/NEWS: Mention this.
* src/data.c (syms_of_data): Add Qinexact_error.
* src/editfns.c (styled_format): Use %.0f when formatting %d or %i
values outside machine integer range, to avoid losing info.
Signal an error for %o or %x values that are too large to be
formatted, to avoid losing info.
* src/lread.c (string_to_number): When converting an integer-format
string to floating-point, signal an error if info is lost.
---
 doc/lispref/numbers.texi |  8 +++--
 etc/NEWS                 |  9 +++++
 src/data.c               |  1 +
 src/editfns.c            | 93 ++++++++++++++++++++----------------------------
 src/lread.c              | 14 ++++++++
 5 files changed, 67 insertions(+), 58 deletions(-)

diff --git a/doc/lispref/numbers.texi b/doc/lispref/numbers.texi
index e692ee1cc2..252aafd8fd 100644
--- a/doc/lispref/numbers.texi
+++ b/doc/lispref/numbers.texi
@@ -53,9 +53,11 @@ Integer Basics
 chapter assume the minimum integer width of 30 bits.
 @cindex overflow
 
-  The Lisp reader reads an integer as a sequence of digits with optional
-initial sign and optional final period.  An integer that is out of the
-Emacs range is treated as a floating-point number.
+  The Lisp reader can read an integer as a nonempty sequence of
+decimal digits with optional initial sign and optional final period.
+A decimal integer that is out of the Emacs range is treated as
+floating-point if it can be represented exactly as a floating-point
+number.
 
 @example
  1               ; @r{The integer 1.}
diff --git a/etc/NEWS b/etc/NEWS
index 8db638e5ed..36cbcf6500 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -248,6 +248,12 @@ as new-style, bind the new variable 'force-new-style-backquotes' to t.
 'cl-struct-define' whose name clashes with a builtin type (e.g.,
 'integer' or 'hash-table') now signals an error.
 
+** When formatting a floating-point number as an octal or hexadecimal
+integer, Emacs now signals an error if the number is too large for the
+implementation to format.  When reading an integer outside Emacs
+fixnum range, Emacs now signals an error if the integer cannot be
+represented exactly as a floating-point number.  See Bug#30408.
+
 \f
 * Lisp Changes in Emacs 27.1
 
@@ -289,6 +295,9 @@ remote systems, which support this check.
 If the optional third argument is non-nil, 'make-string' will produce
 a multibyte string even if its second argument is an ASCII character.
 
+** (format "%d" X) no longer mishandles floating-point X values that
+do not fit in a machine integer (Bug#30408).
+
 ** New JSON parsing and serialization functions 'json-serialize',
 'json-insert', 'json-parse-string', and 'json-parse-buffer'.  These
 are implemented in C using the Jansson library.
diff --git a/src/data.c b/src/data.c
index 72abfefb01..8856583f13 100644
--- a/src/data.c
+++ b/src/data.c
@@ -3729,6 +3729,7 @@ syms_of_data (void)
   DEFSYM (Qrange_error, "range-error");
   DEFSYM (Qdomain_error, "domain-error");
   DEFSYM (Qsingularity_error, "singularity-error");
+  DEFSYM (Qinexact_error, "inexact-error");
   DEFSYM (Qoverflow_error, "overflow-error");
   DEFSYM (Qunderflow_error, "underflow-error");
 
diff --git a/src/editfns.c b/src/editfns.c
index 96bb271b2d..d26549ddb8 100644
--- a/src/editfns.c
+++ b/src/editfns.c
@@ -4563,32 +4563,30 @@ styled_format (ptrdiff_t nargs, Lisp_Object *args, bool message)
 		 and with pM inserted for integer formats.
 		 At most two flags F can be specified at once.  */
 	      char convspec[sizeof "%FF.*d" + max (INT_AS_LDBL, pMlen)];
-	      {
-		char *f = convspec;
-		*f++ = '%';
-		/* MINUS_FLAG and ZERO_FLAG are dealt with later.  */
-		*f = '+'; f +=  plus_flag;
-		*f = ' '; f += space_flag;
-		*f = '#'; f += sharp_flag;
-                *f++ = '.';
-                *f++ = '*';
-		if (float_conversion)
-		  {
-		    if (INT_AS_LDBL)
-		      {
-			*f = 'L';
-			f += INTEGERP (arg);
-		      }
-		  }
-		else if (conversion != 'c')
-		  {
-		    memcpy (f, pMd, pMlen);
-		    f += pMlen;
-		    zero_flag &= ! precision_given;
-		  }
-		*f++ = conversion;
-		*f = '\0';
-	      }
+	      char *f = convspec;
+	      *f++ = '%';
+	      /* MINUS_FLAG and ZERO_FLAG are dealt with later.  */
+	      *f = '+'; f +=  plus_flag;
+	      *f = ' '; f += space_flag;
+	      *f = '#'; f += sharp_flag;
+	      *f++ = '.';
+	      *f++ = '*';
+	      if (float_conversion)
+		{
+		  if (INT_AS_LDBL)
+		    {
+		      *f = 'L';
+		      f += INTEGERP (arg);
+		    }
+		}
+	      else if (conversion != 'c')
+		{
+		  memcpy (f, pMd, pMlen);
+		  f += pMlen;
+		  zero_flag &= ! precision_given;
+		}
+	      *f++ = conversion;
+	      *f = '\0';
 
 	      int prec = -1;
 	      if (precision_given)
@@ -4630,29 +4628,18 @@ styled_format (ptrdiff_t nargs, Lisp_Object *args, bool message)
 		}
 	      else if (conversion == 'd' || conversion == 'i')
 		{
-		  /* For float, maybe we should use "%1.0f"
-		     instead so it also works for values outside
-		     the integer range.  */
-		  printmax_t x;
 		  if (INTEGERP (arg))
-		    x = XINT (arg);
+		    {
+		      printmax_t x = XINT (arg);
+		      sprintf_bytes = sprintf (sprintf_buf, convspec, prec, x);
+		    }
 		  else
 		    {
-		      double d = XFLOAT_DATA (arg);
-		      if (d < 0)
-			{
-			  x = TYPE_MINIMUM (printmax_t);
-			  if (x < d)
-			    x = d;
-			}
-		      else
-			{
-			  x = TYPE_MAXIMUM (printmax_t);
-			  if (d < x)
-			    x = d;
-			}
+		      strcpy (f - pMlen - 1, "f");
+		      prec = 0;
+		      double x = XFLOAT_DATA (arg);
+		      sprintf_bytes = sprintf (sprintf_buf, convspec, prec, x);
 		    }
-		  sprintf_bytes = sprintf (sprintf_buf, convspec, prec, x);
 		}
 	      else
 		{
@@ -4663,22 +4650,18 @@ styled_format (ptrdiff_t nargs, Lisp_Object *args, bool message)
 		  else
 		    {
 		      double d = XFLOAT_DATA (arg);
-		      if (d < 0)
-			x = 0;
-		      else
-			{
-			  x = TYPE_MAXIMUM (uprintmax_t);
-			  if (d < x)
-			    x = d;
-			}
+		      if (! (0 <= d && d < TYPE_MAXIMUM (uprintmax_t)))
+			xsignal1 (Qoverflow_error, arg);
+		      x = d;
 		    }
 		  sprintf_bytes = sprintf (sprintf_buf, convspec, prec, x);
 		}
 
 	      /* Now the length of the formatted item is known, except it omits
 		 padding and excess precision.  Deal with excess precision
-		 first.  This happens only when the format specifies
-		 ridiculously large precision.  */
+		 first.  This happens when the format specifies
+		 ridiculously large precision, or when %d or %i has
+		 nonzero precision and formats a float.  */
 	      ptrdiff_t excess_precision
 		= precision_given ? precision - prec : 0;
 	      ptrdiff_t leading_zeros = 0, trailing_zeros = 0;
diff --git a/src/lread.c b/src/lread.c
index d009bd0cd2..cfeaac8030 100644
--- a/src/lread.c
+++ b/src/lread.c
@@ -3794,6 +3794,20 @@ string_to_number (char const *string, int base, bool ignore_trailing)
   if (! value)
     value = atof (string + signedp);
 
+  if (! float_syntax)
+    {
+      /* Check that converting the integer-format STRING to a
+	 floating-point number does not lose info.  See Bug#30408.  */
+      char const *bp = string + signedp;
+      while (*bp == '0')
+	bp++;
+      char checkbuf[DBL_MAX_10_EXP + 2];
+      int checkbuflen = sprintf (checkbuf, "%.0f", value);
+      if (! (cp - bp - !!(state & DOT_CHAR) == checkbuflen
+	     && memcmp (bp, checkbuf, checkbuflen) == 0))
+	xsignal1 (Qinexact_error, build_string (string));
+    }
+
   return make_float (negative ? -value : value);
 }
 
-- 
2.14.3


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* bug#30408: More context in `read-integer-overflow-as-float'?
  2018-02-10  6:22 bug#30408: 24.5; (format "%x" large-number) produces incorrect results David Sitsky
  2018-02-12  2:49 ` Paul Eggert
@ 2018-05-06  6:29 ` Bastien
  1 sibling, 0 replies; 5+ messages in thread
From: Bastien @ 2018-05-06  6:29 UTC (permalink / raw)
  To: 30408

As suggested in `read-integer-overflow-as-float' docstring:

  Non-nil means ‘read’ quietly treats an out-of-range integer as
  floating point.  Nil (the default) means signal an overflow unless
  the integer ends in ‘.’.  This variable is experimental; email
  30408@debbugs.gnu.org if you need it.

(Note that the last sentence is a bit ambiguous: does "if you need it"
refer to sending an email or to the variable?)

Apparently I need (setq read-integer-overflow-as-float t) since I've
been hit by bugs here.  But what are the consequences of setting this
to `t', aside from silencing a few errors?  What is the usefulness of
not setting it to `t'?  What is the experiment about?  Can I get rid
of this setting when the experiment is over?

Since `read-integer-overflow-as-float' is the entry point for those
who are not aware of the experiment, some guidance in the docstring
might be useful.

Thanks,

-- 
 Bastien

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2018-05-06  6:29 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-02-10  6:22 bug#30408: 24.5; (format "%x" large-number) produces incorrect results David Sitsky
2018-02-12  2:49 ` Paul Eggert
2018-02-12  4:56   ` Drew Adams
2018-02-18  1:08     ` Paul Eggert
2018-05-06  6:29 ` bug#30408: More context in `read-integer-overflow-as-float'? Bastien

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).