unofficial mirror of guix-devel@gnu.org 
 help / color / mirror / code / Atom feed
* Procps in core-updates
@ 2023-03-18 12:32 Andreas Enge
  2023-03-18 17:38 ` Leo Famulari
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Andreas Enge @ 2023-03-18 12:32 UTC (permalink / raw)
  To: guix-devel

[-- Attachment #1: Type: text/plain, Size: 1251 bytes --]

Hello,

procps fails its tests on core-updates when built with --system=i686-linux
(it succeeds on x86_64).

The error is completely not understandable (and Debian has disabled the test).
I modified the test file as attached; my changes are to compute the value val
separately before the "if" (interestingly enough, the variable declaration
"double val;" was already there), and I added the fprintf at the beginnings
of the lines.

The test then fails and prints
CMP 1
CMP 0
DIF 0.000000
FAIL: strtod_nol_or_err("123") != 123.000000

So val != tests[i].result when the "if" is executed,
this is still the case for the first fprintf,
and then changes for the second fprintf.

Has anyone got any explanation for this behaviour? A compiler error?

Here:
   https://www.freelists.org/post/procps/strtod-nol-or-err-on-32bit-Was-procps-3312-released,2
the conclusion is
"Ive got a pending patch to just remove it. Floating point math sucks when
it comes to equality."
but this is not even floating point maths - whatever the contents of val
and tests[i].result, they should not be changed by a comparison (or an
fprintf; I can also make the test work just by adding some printf into
the strtod_nol_or_err function that is exercised by this test).

Andreas


[-- Attachment #2: test_strtod_nol.c --]
[-- Type: text/plain, Size: 1139 bytes --]


#include <stdio.h>
#include <stdlib.h>
#include "strutils.h"

struct strtod_tests {
    char *string;
    double result;
};

struct strtod_tests tests[] = {
    {"123",     123.0},
    {"-123",    -123.0},
    {"12.34",   12.34},
    {"-12.34",  -12.34},
    {".34",     0.34},
    {"-.34",    -0.34},
    {"12,34",   12.34},
    {"-12,34",  -12.34},
    {",34",     0.34},
    {"-,34",    -0.34},
    {"0",       0.0},
    {".0",      0.0},
    {"0.0",     0.0},
    {NULL, 0.0}
};



int main(int argc, char *argv[])
{
    int i;
    double val;

    for(i=0; tests[i].string != NULL; i++) {
        val = strtod_nol_or_err(tests[i].string, "Cannot parse number");
        if (val != tests[i].result) {
fprintf(stderr, "CMP %i\n", val != tests[i].result);
fprintf(stderr, "CMP %i\n", val != tests[i].result);
fprintf(stderr, "DIF %f\n", val - tests[i].result);
            fprintf(stderr, "FAIL: strtod_nol_or_err(\"%s\") != %f\n",
                    tests[i].string, tests[i].result);
            return EXIT_FAILURE;
        }
        //fprintf(stderr, "PASS: strtod_nol for %s\n", tests[i].string);
    }
    return EXIT_SUCCESS;
}

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Procps in core-updates
  2023-03-18 12:32 Procps in core-updates Andreas Enge
@ 2023-03-18 17:38 ` Leo Famulari
  2023-03-18 17:44   ` Andreas Enge
  2023-03-19 17:36 ` Michael Schierl
  2023-03-19 17:49 ` Felix Lechner via Development of GNU Guix and the GNU System distribution.
  2 siblings, 1 reply; 9+ messages in thread
From: Leo Famulari @ 2023-03-18 17:38 UTC (permalink / raw)
  To: Andreas Enge; +Cc: guix-devel

I recommend trying the latest upstream version, 4.0.3. The Sourceforge
download page points to the new canonical location:

https://gitlab.com/procps-ng/procps

If that doesn't work, I would just disable the test, unless there is
some authoritative upstream opinion about which patch to apply.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Procps in core-updates
  2023-03-18 17:38 ` Leo Famulari
@ 2023-03-18 17:44   ` Andreas Enge
  0 siblings, 0 replies; 9+ messages in thread
From: Andreas Enge @ 2023-03-18 17:44 UTC (permalink / raw)
  To: Leo Famulari; +Cc: guix-devel

Am Sat, Mar 18, 2023 at 01:38:48PM -0400 schrieb Leo Famulari:
> I recommend trying the latest upstream version, 4.0.3. The Sourceforge
> download page points to the new canonical location:

I already tried, it does not change anything. If I mess with the package,
I would in any case update the version - it requires about 7000 rebuilds,
so we may as well use it for an update.

> If that doesn't work, I would just disable the test, unless there is
> some authoritative upstream opinion about which patch to apply.

Sure, I can always do. I am just wondering about how the test failure
is even possible...

Andreas



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Procps in core-updates
  2023-03-18 12:32 Procps in core-updates Andreas Enge
  2023-03-18 17:38 ` Leo Famulari
@ 2023-03-19 17:36 ` Michael Schierl
  2023-03-20 14:54   ` Andreas Enge
  2023-03-19 17:49 ` Felix Lechner via Development of GNU Guix and the GNU System distribution.
  2 siblings, 1 reply; 9+ messages in thread
From: Michael Schierl @ 2023-03-19 17:36 UTC (permalink / raw)
  To: Andreas Enge; +Cc: guix-devel

Hello Andreas,


Am 18.03.2023 um 13:32 schrieb Andreas Enge:

> Has anyone got any explanation for this behaviour? A compiler error?

Nasal daemons are not a compiler error.

Anyway, getting an assembly listing from your gcc version (using the -S
switch) would reduce the guesswork. Pass the same compiler flags as the
test does. In particular, various options of "-fexcess-precision" flag
may alter the generated assembly code and the test outcome.

> but this is not even floating point maths - whatever the contents of val
> and tests[i].result, they should not be changed by a comparison (or an
> fprintf; I can also make the test work just by adding some printf into
> the strtod_nol_or_err function that is exercised by this test).

Wild guess:

At the first printf, GCC knows that val still resides in some floating
point register of your CPU (be it SSE, MMX or x87 registers, depending
on the processor models your gcc targets).

Hardware floating point registers on x86 are a mess, and they usually do
not have the same precision as the IEEE floating point values that are
stored in variables (e.g. on the stack or heap), but are slightly more
precise.

So, at your first printf, gcc will create code that compares that
floating point register to the value in the array (loaded into another
floating point register of the same type).

As the function call will likely clobber some floating point registers
(depending on calling convention), for the second printf, gcc will have
to create code that loads the value from stack (with reduced precision)
back into the floating point register, reducing the precision of the
value and making it equal again.

Adding "asm volatile" barriers with matching clobbers flags (tbh I don't
know by heart what clobbers flags you need to pass to mark *any* x86
floating point registers as clobbered) should also make the nasal
deamons disappear, in case tweaking the compiler flags is not an option.


Regards,


Michael



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Procps in core-updates
  2023-03-18 12:32 Procps in core-updates Andreas Enge
  2023-03-18 17:38 ` Leo Famulari
  2023-03-19 17:36 ` Michael Schierl
@ 2023-03-19 17:49 ` Felix Lechner via Development of GNU Guix and the GNU System distribution.
  2023-03-19 19:22   ` Kaelyn
  2 siblings, 1 reply; 9+ messages in thread
From: Felix Lechner via Development of GNU Guix and the GNU System distribution. @ 2023-03-19 17:49 UTC (permalink / raw)
  To: Andreas Enge; +Cc: guix-devel

Hi Andreas,

On Sat, Mar 18, 2023 at 5:33 AM Andreas Enge <andreas@enge.fr> wrote:
>
> FAIL: strtod_nol_or_err("123") != 123.000000

Can you multiply by "1.0" to force a floating-point comparison, or
round the other side to the nearest int?

Kind regards
Felix


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Procps in core-updates
  2023-03-19 17:49 ` Felix Lechner via Development of GNU Guix and the GNU System distribution.
@ 2023-03-19 19:22   ` Kaelyn
  0 siblings, 0 replies; 9+ messages in thread
From: Kaelyn @ 2023-03-19 19:22 UTC (permalink / raw)
  To: Felix Lechner; +Cc: Andreas Enge, guix-devel

Hi Felix,

------- Original Message -------
On Sunday, March 19th, 2023 at 5:49 PM, Felix Lechner via "Development of GNU Guix and the GNU System distribution." <guix-devel@gnu.org> wrote:


> 
> 
> Hi Andreas,
> 
> On Sat, Mar 18, 2023 at 5:33 AM Andreas Enge andreas@enge.fr wrote:
> 
> > FAIL: strtod_nol_or_err("123") != 123.000000
> 
> 
> Can you multiply by "1.0" to force a floating-point comparison, or
> round the other side to the nearest int?

Judging from the function name ("strtod_nol_or_err", which seems in line with the standard "strtof"/"strtod"/"strtold" conversion functions), the function returns a double value. Depending on the specific code, it could be encountering inconsistencies comparing a double to a float (non-double) constant due to the difference in types' precision.

Cheers,
Kaelyn

> 
> Kind regards
> Felix


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Procps in core-updates
  2023-03-19 17:36 ` Michael Schierl
@ 2023-03-20 14:54   ` Andreas Enge
  2023-03-20 20:02     ` Michael Schierl
  0 siblings, 1 reply; 9+ messages in thread
From: Andreas Enge @ 2023-03-20 14:54 UTC (permalink / raw)
  To: Michael Schierl; +Cc: guix-devel

[-- Attachment #1: Type: text/plain, Size: 1736 bytes --]

Hello Michael,

Am Sun, Mar 19, 2023 at 06:36:09PM +0100 schrieb Michael Schierl:
> At the first printf, GCC knows that val still resides in some floating
> point register of your CPU (be it SSE, MMX or x87 registers, depending
> on the processor models your gcc targets).
> Hardware floating point registers on x86 are a mess, and they usually do
> not have the same precision as the IEEE floating point values that are
> stored in variables (e.g. on the stack or heap), but are slightly more
> precise.

this is an interesting guess, and at least gives a hint of why things
happen. I am attaching a simplified C file and the corresponding assembly
output; which I cannot read, but there are differences between the two
invocations of fprintf.

Now to me this still looks like a GCC bug, which is supposed to honour
the IEEE 754 standard for floating point as specified in the C99 standard
(the compiler flag -std=gnu99 was used to compile the file) at least for
"simple" operations. Comparisons should be exact. But I found this:
   https://gcc.gnu.org/wiki/FloatingPointMath
"Note on x86 and m68080 floating-point math
For legacy x86 processors without SSE2 support, and for m68080 processors,
GCC is only able to fully comply with IEEE 754 semantics for the IEEE
double extended (long double) type." So this appears to be a bug with a
"wontfix" tag...

And as indicated there, the compiler option -ffloat-store makes the problem
go away!

The weirdest thing is that actually the value is the integer 123. So it
should be stored in any kind of register exactly.

Thanks for making me learn something new!

Actually I also found an upstream patch:
   https://gitlab.com/procps-ng/procps/-/issues/271
which I will apply now.

Andreas


[-- Attachment #2: test_strtod_nol.c --]
[-- Type: text/plain, Size: 460 bytes --]


#include <stdio.h>
#include <stdlib.h>
#include "strutils.h"

struct strtod_tests {
    char *string;
    long double result;
};

struct strtod_tests tests[] = {
    {"123",     123.0},
};



int main(int argc, char *argv[])
{
    long double val;

    val = strtod_nol_or_err(tests[0].string, "Cannot parse number");
    fprintf(stderr, "CMP %i\n", val != tests[0].result);
    fprintf(stderr, "CMP %i\n", val != tests[0].result);
    return EXIT_FAILURE;
}

[-- Attachment #3: test_strtod_nol.o --]
[-- Type: text/plain, Size: 394 bytes --]

\x7fELF\x01\x01\x01?J?l??_?W?\x02Yo?-??5?\x18?\x1eN\vS?}r?t?}7??"?\x04?7?P?Z\x13?(]a\x13E@???OxI^@J+\x0e??C8:?3n?v??\x17]?^[?\x0e\x11?	?('f???{????1Ru(?????F??\x04??\x06?????^[??{?KZed??/q\x1c?E'?????,gf_?k?6???>?/<gmn^?8\x18??:f\x04?\a?
\x7f???????edl???HO8Gd\f???\f\x1d?e?\x7f\x01?^[`h\x01%? #??o,Z?\x05J?0D?\a\)k???e:j?.j????uu^[???p??CU?\x03)![q\x7f\x19B?Z\x06?L??M&L?`\x0f?*Q\x19??\x18~.*G\x7f|s?}???e\x1aO?R???D?06?????\x1c(94_v?U\f???????&\x15?{???=\x12?\x10???????3?x???_\x1e????J?^[\x19??u\x1flC??\x01--q2qMYh7Y/44Ye03h--

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Procps in core-updates
  2023-03-20 14:54   ` Andreas Enge
@ 2023-03-20 20:02     ` Michael Schierl
  2023-03-20 21:25       ` Andreas Enge
  0 siblings, 1 reply; 9+ messages in thread
From: Michael Schierl @ 2023-03-20 20:02 UTC (permalink / raw)
  To: Andreas Enge; +Cc: guix-devel

Hello Andreas,

Am 20.03.2023 um 15:54 schrieb Andreas Enge:
> Hello Michael,
>
> I am attaching a simplified C file and the corresponding assembly
> output; which I cannot read, but there are differences between the two
> invocations of fprintf.

Seems that you accidentally attached the object file (binary) instead of
the assembly output. :-)

> The weirdest thing is that actually the value is the integer 123. So it
> should be stored in any kind of register exactly.

123 as a normalized binary floating point number would be

1.921875 × 2⁶

and you would have to represent the mantissa in binary (1.111011₂).
Still this should fit into most floating point registers easily.

But possibly the value computed by the algorithm of parsing the number
resulted in e.g.

1.111011000000000000000000000000000000000000000000000001011₂ × 2⁶

which is still 123 when converted back to a 52-bit mantissa, but
something ever so slightly larger when treated as an 80-bit float with
64 bits mantissa.

Or maybe I'm missing something :-)


Regards,


Michael


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Procps in core-updates
  2023-03-20 20:02     ` Michael Schierl
@ 2023-03-20 21:25       ` Andreas Enge
  0 siblings, 0 replies; 9+ messages in thread
From: Andreas Enge @ 2023-03-20 21:25 UTC (permalink / raw)
  To: Michael Schierl; +Cc: guix-devel

Am Mon, Mar 20, 2023 at 09:02:01PM +0100 schrieb Michael Schierl:
> Seems that you accidentally attached the object file (binary) instead of
> the assembly output. :-)

Hm, I had accidentally written the assembly output to a .o file, and then
apparently accidentally overwritten the file before attaching it :-(

> > The weirdest thing is that actually the value is the integer 123. So it
> > should be stored in any kind of register exactly.
> 123 as a normalized binary floating point number would be
> 1.921875 × 2⁶
> and you would have to represent the mantissa in binary (1.111011₂).
> Still this should fit into most floating point registers easily.

Exactly! 7 bits of mantissa would be enough.

> But possibly the value computed by the algorithm of parsing the number
> resulted in e.g.
> 1.111011000000000000000000000000000000000000000000000001011₂ × 2⁶

I looked at the algorithm, it uses a Horner scheme
((1*10)+2)*10+3, so everything should fit in every step and be fine.

I mentioned the problem to colleagues, who told me "this is the well-known
GCC bug 323":
   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=323
And they pointed me to even crazier things, errors even after casting
floating point numbers to integers in cases where gcc "optimises" the code:
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102930
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85957
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93681
Enjoy the output
z = 0
z is one
coming from
    int z = opaque(1) + 0x1p-60 == 1;
    printf("z = %d\n", z);
    if (z)
        puts("z is one");
!
It looks like a quantum computer, z is a superposition of 0 and 1 :)

Andreas



^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2023-03-20 21:26 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-18 12:32 Procps in core-updates Andreas Enge
2023-03-18 17:38 ` Leo Famulari
2023-03-18 17:44   ` Andreas Enge
2023-03-19 17:36 ` Michael Schierl
2023-03-20 14:54   ` Andreas Enge
2023-03-20 20:02     ` Michael Schierl
2023-03-20 21:25       ` Andreas Enge
2023-03-19 17:49 ` Felix Lechner via Development of GNU Guix and the GNU System distribution.
2023-03-19 19:22   ` Kaelyn

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).