* bug#48678: [PATCH] lex floats with trailing dot and exponent correctly
@ 2021-05-26 16:56 Mattias Engdegård
2021-05-26 17:27 ` Eli Zaretskii
0 siblings, 1 reply; 12+ messages in thread
From: Mattias Engdegård @ 2021-05-26 16:56 UTC (permalink / raw)
To: 48678
Motivation: I poured lots of numeric data into Emacs for a computation, but the results weren't as expected at all. Yet my code was correct, and so was the data.
After hours of debugging, it turned out that Emacs reads a number like 1.e6 as the integer 1, not the float 1000000.0. The exponent is silently ignored!
Now Emacs has always treated numbers like 123. as integers rather than floats, but
(1) it's documented,
(2) it's what Common Lisp does, and
(3) it actually doesn't affect the numeric value most of the time.
(Common Lisp probably got this from Maclisp, the rationale being that a trailing dot can be used to write integers in base 10 even when the current input radix is set to something else, something that Emacs Lisp doesn't need.)
Obviously this doesn't apply to 1.e6 which any sane person agrees is the float 1.0e+6 (including Common Lisp).
The attached patch fixes this bug.
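Editorial sketch, not part of the patch: the distinction described above (a bare trailing dot keeps the number an integer, while a trailing dot followed by an exponent makes it a float) can be modelled in a few lines of C. The flag names mirror the patched enum in src/lread.c, but scan_state, float_syntax, reads_as_float and self_check are hypothetical helpers written only for illustration. The real string_to_number also handles other bases, overflow, infinities and NaNs, all of which this simplified version ignores, and the sketch assumes its input is already known to be a base-10 number.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag names follow the patched enum; the values are not significant here.  */
enum { LEAD_INT = 2, TRAIL_INT = 4, E_EXP = 16 };

static bool
is_digit (char c)
{
  return '0' <= c && c <= '9';
}

/* Hypothetical helper: record which parts of a base-10 number are
   present in CP.  Simplified; it does not validate the whole string.  */
static int
scan_state (char const *cp)
{
  int state = 0;
  if (*cp == '+' || *cp == '-')
    cp++;
  if (is_digit (*cp))
    {
      state |= LEAD_INT;          /* digits before the dot */
      while (is_digit (*cp))
        cp++;
    }
  if (*cp == '.')
    cp++;                         /* the dot itself sets no flag */
  if (is_digit (*cp))
    {
      state |= TRAIL_INT;         /* digits after the dot */
      while (is_digit (*cp))
        cp++;
    }
  if (*cp == 'e' || *cp == 'E')
    {
      cp++;
      if (*cp == '+' || *cp == '-')
        cp++;
      if (is_digit (*cp))
        state |= E_EXP;           /* exponent with at least one digit */
    }
  return state;
}

/* The patched rule: digits after the dot, or leading digits plus an
   exponent.  A bare trailing dot ("1.") therefore stays an integer.  */
static bool
float_syntax (int state)
{
  return (state & TRAIL_INT) || ((state & LEAD_INT) && (state & E_EXP));
}

static bool
reads_as_float (char const *s)
{
  return float_syntax (scan_state (s));
}

/* A few inputs taken from the lread-float test in the patch.  */
static void
self_check (void)
{
  assert (!reads_as_float ("13."));
  assert (reads_as_float ("4.e3"));
  assert (reads_as_float ("13e4"));
  assert (reads_as_float (".25"));
}
```

Calling self_check () from a main function exercises the cases that matter for this report: "13." remains an integer, while "4.e3" is classified as a float.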
[-- Attachment #2: 0001-Fix-lexing-of-numbers-with-trailing-decimal-point-an.patch --]
[-- Type: application/octet-stream, Size: 5931 bytes --]
From a0b69a9fc17c42b0c15b28c5894ffb2a1a9327e3 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <mattiase@acm.org>
Date: Thu, 20 May 2021 18:26:15 +0200
Subject: [PATCH] Fix lexing of numbers with trailing decimal point and
exponent
Numbers with a trailing dot and an exponent were incorrectly read as
integers (with the exponent ignored) instead of the floats they should
be. For example, 1.e6 was read as the integer 1, not 1000000.0 as
every sane person would agree was meant.
Numbers with a trailing dot but no exponent are still read as
integers.
* src/lread.c (string_to_number): Fix float lexing.
* test/src/lread-tests.el (lread-float): Add test.
* doc/lispref/numbers.texi (Float Basics): Clarify syntax.
---
doc/lispref/numbers.texi | 3 +-
src/lread.c | 10 +++---
test/src/lread-tests.el | 67 ++++++++++++++++++++++++++++++++++++++++
3 files changed, 74 insertions(+), 6 deletions(-)
diff --git a/doc/lispref/numbers.texi b/doc/lispref/numbers.texi
index 4c5f72126e..d28e15869a 100644
--- a/doc/lispref/numbers.texi
+++ b/doc/lispref/numbers.texi
@@ -237,7 +237,8 @@ Float Basics
@samp{+15e2}, @samp{15.0e+2}, @samp{+1500000e-3}, and @samp{.15e4} are
five ways of writing a floating-point number whose value is 1500.
They are all equivalent. Like Common Lisp, Emacs Lisp requires at
-least one digit after any decimal point in a floating-point number;
+least one digit after a decimal point in a floating-point number that
+does not have an exponent;
@samp{1500.} is an integer, not a floating-point number.
Emacs Lisp treats @code{-0.0} as numerically equal to ordinary zero
diff --git a/src/lread.c b/src/lread.c
index bca53a9a37..0b33fd0f25 100644
--- a/src/lread.c
+++ b/src/lread.c
@@ -3938,8 +3938,7 @@ string_to_number (char const *string, int base, ptrdiff_t *plen)
   bool signedp = negative | positive;
   cp += signedp;
 
-  enum { INTOVERFLOW = 1, LEAD_INT = 2, DOT_CHAR = 4, TRAIL_INT = 8,
-         E_EXP = 16 };
+  enum { INTOVERFLOW = 1, LEAD_INT = 2, TRAIL_INT = 4, E_EXP = 16 };
   int state = 0;
   int leading_digit = digit_to_number (*cp, base);
   uintmax_t n = leading_digit;
@@ -3959,7 +3958,6 @@ string_to_number (char const *string, int base, ptrdiff_t *plen)
   char const *after_digits = cp;
   if (*cp == '.')
     {
-      state |= DOT_CHAR;
       cp++;
     }
 
@@ -4008,8 +4006,10 @@ string_to_number (char const *string, int base, ptrdiff_t *plen)
           cp = ecp;
         }
 
-      float_syntax = ((state & (DOT_CHAR|TRAIL_INT)) == (DOT_CHAR|TRAIL_INT)
-                      || (state & ~INTOVERFLOW) == (LEAD_INT|E_EXP));
+      /* A float has digits after the dot or an exponent.
+         This excludes numbers like "1." which are lexed as integers.  */
+      float_syntax = ((state & TRAIL_INT)
+                      || ((state & LEAD_INT) && (state & E_EXP)));
     }
 
   if (plen)
diff --git a/test/src/lread-tests.el b/test/src/lread-tests.el
index f2a60bcf32..dac8f95bc4 100644
--- a/test/src/lread-tests.el
+++ b/test/src/lread-tests.el
@@ -196,4 +196,71 @@ test-inhibit-interaction
(should-error (read-event "foo: "))
(should-error (read-char-exclusive "foo: "))))
+(ert-deftest lread-float ()
+ (should (equal (read "13") 13))
+ (should (equal (read "+13") 13))
+ (should (equal (read "-13") -13))
+ (should (equal (read "13.") 13))
+ (should (equal (read "+13.") 13))
+ (should (equal (read "-13.") -13))
+ (should (equal (read "13.25") 13.25))
+ (should (equal (read "+13.25") 13.25))
+ (should (equal (read "-13.25") -13.25))
+ (should (equal (read ".25") 0.25))
+ (should (equal (read "+.25") 0.25))
+ (should (equal (read "-.25") -0.25))
+ (should (equal (read "13e4") 130000.0))
+ (should (equal (read "+13e4") 130000.0))
+ (should (equal (read "-13e4") -130000.0))
+ (should (equal (read "13e+4") 130000.0))
+ (should (equal (read "+13e+4") 130000.0))
+ (should (equal (read "-13e+4") -130000.0))
+ (should (equal (read "625e-4") 0.0625))
+ (should (equal (read "+625e-4") 0.0625))
+ (should (equal (read "-625e-4") -0.0625))
+ (should (equal (read "1.25e2") 125.0))
+ (should (equal (read "+1.25e2") 125.0))
+ (should (equal (read "-1.25e2") -125.0))
+ (should (equal (read "1.25e+2") 125.0))
+ (should (equal (read "+1.25e+2") 125.0))
+ (should (equal (read "-1.25e+2") -125.0))
+ (should (equal (read "1.25e-1") 0.125))
+ (should (equal (read "+1.25e-1") 0.125))
+ (should (equal (read "-1.25e-1") -0.125))
+ (should (equal (read "4.e3") 4000.0))
+ (should (equal (read "+4.e3") 4000.0))
+ (should (equal (read "-4.e3") -4000.0))
+ (should (equal (read "4.e+3") 4000.0))
+ (should (equal (read "+4.e+3") 4000.0))
+ (should (equal (read "-4.e+3") -4000.0))
+ (should (equal (read "5.e-1") 0.5))
+ (should (equal (read "+5.e-1") 0.5))
+ (should (equal (read "-5.e-1") -0.5))
+ (should (equal (read "0") 0))
+ (should (equal (read "+0") 0))
+ (should (equal (read "-0") 0))
+ (should (equal (read "0.") 0))
+ (should (equal (read "+0.") 0))
+ (should (equal (read "-0.") 0))
+ (should (equal (read "0.0") 0.0))
+ (should (equal (read "+0.0") 0.0))
+ (should (equal (read "-0.0") -0.0))
+ (should (equal (read "0e5") 0.0))
+ (should (equal (read "+0e5") 0.0))
+ (should (equal (read "-0e5") -0.0))
+ (should (equal (read "0e-5") 0.0))
+ (should (equal (read "+0e-5") 0.0))
+ (should (equal (read "-0e-5") -0.0))
+ (should (equal (read ".0e-5") 0.0))
+ (should (equal (read "+.0e-5") 0.0))
+ (should (equal (read "-.0e-5") -0.0))
+ (should (equal (read "0.0e-5") 0.0))
+ (should (equal (read "+0.0e-5") 0.0))
+ (should (equal (read "-0.0e-5") -0.0))
+ (should (equal (read "0.e-5") 0.0))
+ (should (equal (read "+0.e-5") 0.0))
+ (should (equal (read "-0.e-5") -0.0))
+ )
+
+
;;; lread-tests.el ends here
--
2.21.1 (Apple Git-122.3)
* bug#48678: [PATCH] lex floats with trailing dot and exponent correctly
From: Eli Zaretskii @ 2021-05-26 17:27 UTC (permalink / raw)
To: Mattias Engdegård; +Cc: 48678
> From: Mattias Engdegård <mattiase@acm.org>
> Date: Wed, 26 May 2021 18:56:43 +0200
>
> Now Emacs has always treated numbers like 123. as integers rather than floats, but
> (1) it's documented,
> (2) it's what Common Lisp does, and
> (3) it actually doesn't affect the numeric value most of the time.
>
> (Common Lisp probably got this from Maclisp, the rationale being that a trailing dot can be used to write integers in base 10 even when the current input radix is set to something else, something that Emacs Lisp doesn't need.)
>
> Obviously this doesn't apply to 1.e6 which any sane person agrees is the float 1.0e+6 (including Common Lisp).
>
> The attached patch fixes this bug.
Brace for massive breakage.
* bug#48678: [PATCH] lex floats with trailing dot and exponent correctly
From: Lars Ingebrigtsen @ 2021-05-26 22:13 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: Mattias Engdegård, 48678
Eli Zaretskii <eliz@gnu.org> writes:
>> Obviously this doesn't apply to 1.e6 which any sane person agrees is
>> the float 1.0e+6 (including Common Lisp).
>>
>> The attached patch fixes this bug.
>
> Brace for massive breakage.
Yes, it's a rather scary change -- people will have code that sloppily
parses noisy things like "1foo" and expect to get a 1 out, and ".e6"
could well be noise that they expect to have ignored.
So I'm sceptical.
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
* bug#48678: [PATCH] lex floats with trailing dot and exponent correctly
From: Andreas Schwab @ 2021-05-27 7:32 UTC (permalink / raw)
To: Lars Ingebrigtsen; +Cc: Mattias Engdegård, 48678
On May 27 2021, Lars Ingebrigtsen wrote:
> Yes, it's a rather scary change -- people will have code that sloppily
> parses noisy things like "1foo" and expect to get a 1 out, and ".e6"
> could well be noise that they expect to have ignored.
>
> So I'm sceptical.
But then 1.e6 should be parsed as a symbol.
Andreas.
--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."
* bug#48678: [PATCH] lex floats with trailing dot and exponent correctly
From: Lars Ingebrigtsen @ 2021-05-27 7:40 UTC (permalink / raw)
To: Andreas Schwab; +Cc: Mattias Engdegård, 48678
Andreas Schwab <schwab@linux-m68k.org> writes:
> On May 27 2021, Lars Ingebrigtsen wrote:
>
>> Yes, it's a rather scary change -- people will have code that sloppily
>> parses noisy things like "1foo" and expect to get a 1 out, and ".e6"
>> could well be noise that they expect to have ignored.
>>
>> So I'm sceptical.
>
> But then 1.e6 should be parsed as a symbol.
Oops, I thought this was about string-to-number, which it wasn't at all.
Hm. Currently 1.e6 reads to 1? Weird.
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
* bug#48678: [PATCH] lex floats with trailing dot and exponent correctly
From: Mattias Engdegård @ 2021-05-27 12:20 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 48678
On 26 May 2021 at 19:27, Eli Zaretskii <eliz@gnu.org> wrote:
> Brace for massive breakage.
Challenge accepted! Now in master. Bring it on!
* bug#48678: [PATCH] lex floats with trailing dot and exponent correctly
From: Mattias Engdegård @ 2021-05-27 12:28 UTC (permalink / raw)
To: Lars Ingebrigtsen; +Cc: Andreas Schwab, 48678
On 27 May 2021 at 09:40, Lars Ingebrigtsen <larsi@gnus.org> wrote:
> Hm. Currently 1.e6 reads to 1? Weird.
Yes, this behaviour was probably not intended at all. Such things happen; the best we can do is to put things right.
* bug#48678: [PATCH] lex floats with trailing dot and exponent correctly
From: Philipp Stephani @ 2021-05-27 12:36 UTC (permalink / raw)
To: Lars Ingebrigtsen; +Cc: Mattias Engdegård, Andreas Schwab, 48678
On Thu, 27 May 2021 at 09:41, Lars Ingebrigtsen <larsi@gnus.org> wrote:
>
> Andreas Schwab <schwab@linux-m68k.org> writes:
>
> > On May 27 2021, Lars Ingebrigtsen wrote:
> >
> >> Yes, it's a rather scary change -- people will have code that sloppily
> >> parses noisy things like "1foo" and expect to get a 1 out, and ".e6"
> >> could well be noise that they expect to have ignored.
> >>
> >> So I'm sceptical.
> >
> > But then 1.e6 should be parsed as a symbol.
>
> Oops, I thought this was about string-to-number, which it wasn't at all.
>
> Hm. Currently 1.e6 reads to 1? Weird.
At least for me it's parsed as a symbol.
* bug#48678: [PATCH] lex floats with trailing dot and exponent correctly
From: Philipp Stephani @ 2021-05-27 12:37 UTC (permalink / raw)
To: Lars Ingebrigtsen; +Cc: Mattias Engdegård, Andreas Schwab, 48678
On Thu, 27 May 2021 at 14:36, Philipp Stephani
<p.stephani2@gmail.com> wrote:
>
> On Thu, 27 May 2021 at 09:41, Lars Ingebrigtsen <larsi@gnus.org> wrote:
> >
> > Andreas Schwab <schwab@linux-m68k.org> writes:
> >
> > > On May 27 2021, Lars Ingebrigtsen wrote:
> > >
> > >> Yes, it's a rather scary change -- people will have code that sloppily
> > >> parses noisy things like "1foo" and expect to get a 1 out, and ".e6"
> > >> could well be noise that they expect to have ignored.
> > >>
> > >> So I'm sceptical.
> > >
> > > But then 1.e6 should be parsed as a symbol.
> >
> > Oops, I thought this was about string-to-number, which it wasn't at all.
> >
> > Hm. Currently 1.e6 reads to 1? Weird.
>
> At least for me it's parsed as a symbol.
Oops, taking that back, I checked 1e.6 instead of 1.e6.
* bug#48678: [PATCH] lex floats with trailing dot and exponent correctly
From: Lars Ingebrigtsen @ 2021-05-29 6:03 UTC (permalink / raw)
To: Mattias Engdegård; +Cc: 48678
Mattias Engdegård <mattiase@acm.org> writes:
> On 26 May 2021 at 19:27, Eli Zaretskii <eliz@gnu.org> wrote:
>
>> Brace for massive breakage.
>
> Challenge accepted! Now in master. Bring it on!
:-)
There don't seem to be any reported breakages from this yet. It does
seem quite NEWS-worthy, though, so I've added an entry, and I'm closing
this bug report. If serious breakages do happen, we should consider
backing out the change.
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
* bug#48678: [PATCH] lex floats with trailing dot and exponent correctly
From: Mattias Engdegård @ 2021-05-29 7:47 UTC (permalink / raw)
To: Lars Ingebrigtsen; +Cc: 48678
On 29 May 2021 at 08:03, Lars Ingebrigtsen <larsi@gnus.org> wrote:
> There don't seem to be any reported breakages from this yet. It does
> seem quite NEWS-worthy, though, so I've added an entry, and I'm closing
> this bug report.
Excellent! I was going to write a NEWS entry, so thank you for forcing my hand. I took the liberty to make a few minor changes to it for precision; if it isn't to your liking, do tell.
> If serious breakages do happen, we should consider
> backing out the change.
Most certainly, but I'm confident in the change. It wasn't done without serious preparation: I scanned hundreds of Emacs packages, and checked all boolean combinations in the reader condition to guarantee correctness (which showed that a flag in the condition was redundant and could be removed). There is now a serious test.
Looking for the origin I also ran Maclisp on a PDP-10 and can confirm that it does not have the bug, so it must have been endogenous to Emacs.
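Editorial sketch, not the author's actual procedure: the exhaustive check of the reader condition mentioned above can be approximated outside Emacs. The C code below enumerates all 32 combinations of the five pre-patch flags and compares the old and new float_syntax expressions, both copied from the patch. The helpers reachable and count_differences are hypothetical, and the reachability rules are assumptions read off the surrounding scanner code: TRAIL_INT is only ever set after a dot has been consumed, and INTOVERFLOW only while reading leading digits.

```c
#include <assert.h>
#include <stdbool.h>

/* Pre-patch flag values, as in the enum removed by the patch.  */
enum { INTOVERFLOW = 1, LEAD_INT = 2, DOT_CHAR = 4, TRAIL_INT = 8,
       E_EXP = 16 };

/* The condition before the patch.  */
static bool
float_syntax_old (int state)
{
  return ((state & (DOT_CHAR | TRAIL_INT)) == (DOT_CHAR | TRAIL_INT)
          || (state & ~INTOVERFLOW) == (LEAD_INT | E_EXP));
}

/* The condition after the patch; DOT_CHAR is no longer consulted.  */
static bool
float_syntax_new (int state)
{
  return (state & TRAIL_INT) || ((state & LEAD_INT) && (state & E_EXP));
}

/* Assumed reachability rules (not stated in the thread):
   trailing digits require a dot, overflow requires leading digits.  */
static bool
reachable (int state)
{
  if ((state & TRAIL_INT) && !(state & DOT_CHAR))
    return false;
  if ((state & INTOVERFLOW) && !(state & LEAD_INT))
    return false;
  return true;
}

/* Store up to MAX reachable states on which the two conditions
   disagree in OUT, in increasing order; return how many there are.  */
static int
count_differences (int *out, int max)
{
  int n = 0;
  for (int state = 0; state < 32; state++)
    if (reachable (state)
        && float_syntax_old (state) != float_syntax_new (state))
      {
        if (n < max)
          out[n] = state;
        n++;
      }
  return n;
}
```

Under those assumptions the two conditions disagree on exactly two reachable states: LEAD_INT|DOT_CHAR|E_EXP, with and without INTOVERFLOW. That is precisely the 1.e6 shape this report is about. Everywhere else they agree, and because TRAIL_INT already implies DOT_CHAR, testing DOT_CHAR adds nothing, which is consistent with the flag being dropped.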
* bug#48678: [PATCH] lex floats with trailing dot and exponent correctly
From: Lars Ingebrigtsen @ 2021-05-30 4:07 UTC (permalink / raw)
To: Mattias Engdegård; +Cc: 48678
Mattias Engdegård <mattiase@acm.org> writes:
> Excellent! I was going to write a NEWS entry, so thank you for forcing
> my hand. I took the liberty to make a few minor changes to it for
> precision; if it isn't to your liking, do tell.
Looks good to me; thanks.
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no