unofficial mirror of guile-user@gnu.org 
 help / color / mirror / Atom feed
* Double-precision floating point arithmetics behaves differently in Guile and in C
@ 2021-09-26  8:45 Panicz Maciej Godek
  2021-09-26 11:55 ` Panicz Maciej Godek
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Panicz Maciej Godek @ 2021-09-26  8:45 UTC (permalink / raw)
  To: Guile User

Hi,

I recently ran into a very weird problem. I am using code for computing
Singular Value Decomposition that's been ported do Scheme by Gerald Sussman
as a part of the scmutils package (more specifically, I use some bits of
code extracted from guile-scmutils
<https://www.cs.rochester.edu/%7Egildea/guile-scmutils/>)

The code is too long to paste here, but I have uploaded a copy to my github
<https://github.com/panicz/slayer/blob/master/guile-modules/extra/scmutils/numerics/linear/svd.scm>
.

I have found an ellipsoid-fitting algorithm that leans on this SVD code:
when I run it on a randomly generated set of points lying on an ellipsoid,
the ellipsoid parameters are reconstructed correctly.

The trouble begins when I try to port that code to C. Initially, I have
used a C code for computing SVD that I've found elsewhere (which was based
on the same Fortran code from LINPACK), but I eventually wrote code to
convert Scheme code to C to make sure that both algorithms are identical
(and at this point I am quite confident that they are).

The problem is, that the results of floating-point operations behave
slightly differently in C than in Guile, and these small numerical
differences accumulate, and eventually start making huge differences -- at
least that's something that I've noticed. Moreover, I can quite confidently
say that the results that I get in C are incorrect -- that the *eigenvalues*
(which translate to the radii of the ellipsoid) are very different, the
calculated center of the ellipsoid is slightly different, and that the
*eigenvectors* (which can be interpreted as a rotation of the ellipsoid)
are very similar.

Either way, the ellipsoid isn't reconstructed correctly, because the points
used for reconstruction do not lie near that ellipsoid (unlike in the case
of the C code).

I am really troubled with this behavior -- according to the documentation,
and to the bits of source code that I managed to analyze, Guile uses
exactly the same representation of floating point numbers as the one I use
in C -- namely, the double type. (I only use single-precision floats to
express the input points, but I ran some tests using small integers on
inputs, to make sure, that they are exactly representable -- and the
results were still very different)

Is there a way to make sure that the floating point operations in C behave
identically as those in Guile?


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Double-precision floating point arithmetics behaves differently in Guile and in C
  2021-09-26  8:45 Double-precision floating point arithmetics behaves differently in Guile and in C Panicz Maciej Godek
@ 2021-09-26 11:55 ` Panicz Maciej Godek
  2021-09-26 13:16   ` Eli Zaretskii
  2021-09-26 12:57 ` tomas
  2021-09-26 19:11 ` Matt Wette
  2 siblings, 1 reply; 8+ messages in thread
From: Panicz Maciej Godek @ 2021-09-26 11:55 UTC (permalink / raw)
  To: Guile User

I forgot to mention that I run Guile in an Emacs session running in a WSL
console on Windows 10.
The tests of the C code that I've been performing so far were executed in
an MSYS terminal, but I have just tried running them in WSL console, and
the radii get ridiculous values.

While the values used to generate the points are the following:

center '[-65.12 -50.54 88.66]:
radii: '[83.95 47.13 45.56]
rotation: '[[9.633871e-001   1.363818e-001   -2.308359e-001]
                [-1.734094e-001  9.735887e-001   -1.485064e-001]
                [2.044857e-001   1.830983e-001   9.615927e-001]]

and the values reconstructed in Guile are:

 (ellipsoid
  #:center (65.10194623013226 -88.58460232582514 -50.721825868168324)
  #:radii (83.94677717528019 45.56525864722978 47.12037877216948)
  #:rotation ((-0.9633227088329351 0.2307406879308479 0.13699669185777974)
                   (0.173638969876562 0.14674930301995573
0.9738142277679884)
                   (-0.20459439578586436 -0.961885324244193
0.18143251146545056)))

(the signs and the order of radii differ, but it seems that the rotation
matrix compensates for that)

in the case of MSYS (MinGW 64-bit) I get the following values:

approx_e.center = [90.13, -68.76, -42.46]
approx_e.radius = [178.83, 97.08, 100.43]
approx_e.rotation(3x3) =
        -9.634121049254586e-001 1.734020884333972e-001
 -2.043742444879810e-001
        -2.314281253565038e-001 -1.535574849495593e-001
9.606566096217422e-001
        -1.351966674037161e-001 -9.728061546592428e-001
-1.880692600613582e-001

but in the case of the WSL console, I get

approx_e.center = [90.13, -68.76, -42.46]
approx_e.radius = [1347844355973136335704668472606720.00,
731678657375689861259088782950400.00, 756961648396782369967223865344000.00]
approx_e.rotation(3x3) =
        -9.634121049254428e-01  1.734020884334219e-01
-2.043742444880351e-01
        -2.314281253566119e-01  -1.535574849499134e-01
 9.606566096216593e-01
        -1.351966674036449e-01  -9.728061546591822e-01
 -1.880692600617213e-01


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Double-precision floating point arithmetics behaves differently in Guile and in C
  2021-09-26  8:45 Double-precision floating point arithmetics behaves differently in Guile and in C Panicz Maciej Godek
  2021-09-26 11:55 ` Panicz Maciej Godek
@ 2021-09-26 12:57 ` tomas
  2021-09-26 19:11 ` Matt Wette
  2 siblings, 0 replies; 8+ messages in thread
From: tomas @ 2021-09-26 12:57 UTC (permalink / raw)
  To: guile-user

[-- Attachment #1: Type: text/plain, Size: 609 bytes --]

On Sun, Sep 26, 2021 at 10:45:51AM +0200, Panicz Maciej Godek wrote:
> Hi,
> 
> I recently ran into a very weird problem. I am using code for computing
> Singular Value Decomposition that's been ported do Scheme by Gerald Sussman
> as a part of the scmutils package (more specifically, I use some bits of
> code extracted from guile-scmutils

Hm. A couple of things to check would be: your C code is using "double"
floats (the Guile manual says the inexacts are `doubles')? Are your Guile
(what's its version anyway) and C program compiled for the same platform
(i.e. word width)?

Cheers
 - t

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Double-precision floating point arithmetics behaves differently in Guile and in C
  2021-09-26 11:55 ` Panicz Maciej Godek
@ 2021-09-26 13:16   ` Eli Zaretskii
  2021-09-26 16:48     ` Panicz Maciej Godek
  0 siblings, 1 reply; 8+ messages in thread
From: Eli Zaretskii @ 2021-09-26 13:16 UTC (permalink / raw)
  To: Panicz Maciej Godek; +Cc: guile-user

> From: Panicz Maciej Godek <godek.maciek@gmail.com>
> Date: Sun, 26 Sep 2021 13:55:20 +0200
> 
> I forgot to mention that I run Guile in an Emacs session running in a WSL
> console on Windows 10.
> The tests of the C code that I've been performing so far were executed in
> an MSYS terminal, but I have just tried running them in WSL console, and
> the radii get ridiculous values.
> 
> While the values used to generate the points are the following:
> 
> center '[-65.12 -50.54 88.66]:
> radii: '[83.95 47.13 45.56]
> rotation: '[[9.633871e-001   1.363818e-001   -2.308359e-001]
>                 [-1.734094e-001  9.735887e-001   -1.485064e-001]
>                 [2.044857e-001   1.830983e-001   9.615927e-001]]
> 
> and the values reconstructed in Guile are:
> 
>  (ellipsoid
>   #:center (65.10194623013226 -88.58460232582514 -50.721825868168324)
>   #:radii (83.94677717528019 45.56525864722978 47.12037877216948)
>   #:rotation ((-0.9633227088329351 0.2307406879308479 0.13699669185777974)
>                    (0.173638969876562 0.14674930301995573
> 0.9738142277679884)
>                    (-0.20459439578586436 -0.961885324244193
> 0.18143251146545056)))
> 
> (the signs and the order of radii differ, but it seems that the rotation
> matrix compensates for that)
> 
> in the case of MSYS (MinGW 64-bit) I get the following values:
> 
> approx_e.center = [90.13, -68.76, -42.46]
> approx_e.radius = [178.83, 97.08, 100.43]
> approx_e.rotation(3x3) =
>         -9.634121049254586e-001 1.734020884333972e-001
>  -2.043742444879810e-001
>         -2.314281253565038e-001 -1.535574849495593e-001
> 9.606566096217422e-001
>         -1.351966674037161e-001 -9.728061546592428e-001
> -1.880692600613582e-001
> 
> but in the case of the WSL console, I get
> 
> approx_e.center = [90.13, -68.76, -42.46]
> approx_e.radius = [1347844355973136335704668472606720.00,
> 731678657375689861259088782950400.00, 756961648396782369967223865344000.00]
> approx_e.rotation(3x3) =
>         -9.634121049254428e-01  1.734020884334219e-01
> -2.043742444880351e-01
>         -2.314281253566119e-01  -1.535574849499134e-01
>  9.606566096216593e-01
>         -1.351966674036449e-01  -9.728061546591822e-01
>  -1.880692600617213e-01

Forgive me for saying this, but frankly the most probable reason for
this is some bug in converting the Scheme code to C.  SVD is a
remarkably stable algorithm in the face of numerical roundoff errors,
that's one of its strong points.  So if the issue is with some minor
numerical inaccuracies, SVD results should be insensitive to it, as
long as the condition numbers of the matrices that you decompose are
not preposterously large, like 10^12 or something.  (Did you calculate
the condition numbers, btw? what are they?)

So I would check and recheck your C code, as IMO that's the first
suspect in this story.

Also, please note that on Intel CPUs double-precision FP calculations
usually keep intermediate results in 10-byte (80-bit) wide "long
double" format, unless your program is built with the GCC switch that
disables that (few programs do, so I don't expect you to do that,
eneither while building Guile, nor when compiling your C
implementation of SVD).  But that cannot by itself explain the
problem, because using 80-bit FP values should make the results more
accurate, not less accurate.

So again, I suspect your C program.  Does it work well when you throw
simple problems at it, like systems of linear equations where the
result is known in advance and the condition number of the matrix is
around 1?

HTH



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Double-precision floating point arithmetics behaves differently in Guile and in C
  2021-09-26 13:16   ` Eli Zaretskii
@ 2021-09-26 16:48     ` Panicz Maciej Godek
  0 siblings, 0 replies; 8+ messages in thread
From: Panicz Maciej Godek @ 2021-09-26 16:48 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Guile User

Thanks for the feedback!
It turned out that SVD was fine and I was looking in the wrong place
-- in the C counterpart, I was feeding my ellipsoid algorithm with
uninitialized junk,
and the compiler didn't shout at me.

Sorry for the noise!

niedz., 26 wrz 2021 o 15:16 Eli Zaretskii <eliz@gnu.org> napisał(a):

> > From: Panicz Maciej Godek <godek.maciek@gmail.com>
> > Date: Sun, 26 Sep 2021 13:55:20 +0200
> >
> > I forgot to mention that I run Guile in an Emacs session running in a WSL
> > console on Windows 10.
> > The tests of the C code that I've been performing so far were executed in
> > an MSYS terminal, but I have just tried running them in WSL console, and
> > the radii get ridiculous values.
> >
> > While the values used to generate the points are the following:
> >
> > center '[-65.12 -50.54 88.66]:
> > radii: '[83.95 47.13 45.56]
> > rotation: '[[9.633871e-001   1.363818e-001   -2.308359e-001]
> >                 [-1.734094e-001  9.735887e-001   -1.485064e-001]
> >                 [2.044857e-001   1.830983e-001   9.615927e-001]]
> >
> > and the values reconstructed in Guile are:
> >
> >  (ellipsoid
> >   #:center (65.10194623013226 -88.58460232582514 -50.721825868168324)
> >   #:radii (83.94677717528019 45.56525864722978 47.12037877216948)
> >   #:rotation ((-0.9633227088329351 0.2307406879308479
> 0.13699669185777974)
> >                    (0.173638969876562 0.14674930301995573
> > 0.9738142277679884)
> >                    (-0.20459439578586436 -0.961885324244193
> > 0.18143251146545056)))
> >
> > (the signs and the order of radii differ, but it seems that the rotation
> > matrix compensates for that)
> >
> > in the case of MSYS (MinGW 64-bit) I get the following values:
> >
> > approx_e.center = [90.13, -68.76, -42.46]
> > approx_e.radius = [178.83, 97.08, 100.43]
> > approx_e.rotation(3x3) =
> >         -9.634121049254586e-001 1.734020884333972e-001
> >  -2.043742444879810e-001
> >         -2.314281253565038e-001 -1.535574849495593e-001
> > 9.606566096217422e-001
> >         -1.351966674037161e-001 -9.728061546592428e-001
> > -1.880692600613582e-001
> >
> > but in the case of the WSL console, I get
> >
> > approx_e.center = [90.13, -68.76, -42.46]
> > approx_e.radius = [1347844355973136335704668472606720.00,
> > 731678657375689861259088782950400.00,
> 756961648396782369967223865344000.00]
> > approx_e.rotation(3x3) =
> >         -9.634121049254428e-01  1.734020884334219e-01
> > -2.043742444880351e-01
> >         -2.314281253566119e-01  -1.535574849499134e-01
> >  9.606566096216593e-01
> >         -1.351966674036449e-01  -9.728061546591822e-01
> >  -1.880692600617213e-01
>
> Forgive me for saying this, but frankly the most probable reason for
> this is some bug in converting the Scheme code to C.  SVD is a
> remarkably stable algorithm in the face of numerical roundoff errors,
> that's one of its strong points.  So if the issue is with some minor
> numerical inaccuracies, SVD results should be insensitive to it, as
> long as the condition numbers of the matrices that you decompose are
> not preposterously large, like 10^12 or something.  (Did you calculate
> the condition numbers, btw? what are they?)
>
> So I would check and recheck your C code, as IMO that's the first
> suspect in this story.
>
> Also, please note that on Intel CPUs double-precision FP calculations
> usually keep intermediate results in 10-byte (80-bit) wide "long
> double" format, unless your program is built with the GCC switch that
> disables that (few programs do, so I don't expect you to do that,
> eneither while building Guile, nor when compiling your C
> implementation of SVD).  But that cannot by itself explain the
> problem, because using 80-bit FP values should make the results more
> accurate, not less accurate.
>
> So again, I suspect your C program.  Does it work well when you throw
> simple problems at it, like systems of linear equations where the
> result is known in advance and the condition number of the matrix is
> around 1?
>
> HTH
>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Double-precision floating point arithmetics behaves differently in Guile and in C
  2021-09-26  8:45 Double-precision floating point arithmetics behaves differently in Guile and in C Panicz Maciej Godek
  2021-09-26 11:55 ` Panicz Maciej Godek
  2021-09-26 12:57 ` tomas
@ 2021-09-26 19:11 ` Matt Wette
  2021-09-27 14:04   ` Panicz Maciej Godek
  2 siblings, 1 reply; 8+ messages in thread
From: Matt Wette @ 2021-09-26 19:11 UTC (permalink / raw)
  To: guile-user

Your post is a little vague.  Are you comparing singular values vectors 
of A vs
eigenvalues, eigenvectors of A'*A ?  Also, SVD is iterative and 
different algorithms may yield
different results (and some of the SVD algorithms out there are not great).
  Are you using the exact same algorithm but noticing that (op x y) in Guile
behaves differently than (x op y) in C?

On 9/26/21 1:45 AM, Panicz Maciej Godek wrote:
> Hi,
>
> I recently ran into a very weird problem. I am using code for computing
> Singular Value Decomposition that's been ported do Scheme by Gerald Sussman
> as a part of the scmutils package (more specifically, I use some bits of
> code extracted from guile-scmutils
> <https://www.cs.rochester.edu/%7Egildea/guile-scmutils/>)
>
> The code is too long to paste here, but I have uploaded a copy to my github
> <https://github.com/panicz/slayer/blob/master/guile-modules/extra/scmutils/numerics/linear/svd.scm>
> .
>
> I have found an ellipsoid-fitting algorithm that leans on this SVD code:
> when I run it on a randomly generated set of points lying on an ellipsoid,
> the ellipsoid parameters are reconstructed correctly.
>
> The trouble begins when I try to port that code to C. Initially, I have
> used a C code for computing SVD that I've found elsewhere (which was based
> on the same Fortran code from LINPACK), but I eventually wrote code to
> convert Scheme code to C to make sure that both algorithms are identical
> (and at this point I am quite confident that they are).
>
> The problem is, that the results of floating-point operations behave
> slightly differently in C than in Guile, and these small numerical
> differences accumulate, and eventually start making huge differences -- at
> least that's something that I've noticed. Moreover, I can quite confidently
> say that the results that I get in C are incorrect -- that the *eigenvalues*
> (which translate to the radii of the ellipsoid) are very different, the
> calculated center of the ellipsoid is slightly different, and that the
> *eigenvectors* (which can be interpreted as a rotation of the ellipsoid)
> are very similar.
>
> Either way, the ellipsoid isn't reconstructed correctly, because the points
> used for reconstruction do not lie near that ellipsoid (unlike in the case
> of the C code).
>
> I am really troubled with this behavior -- according to the documentation,
> and to the bits of source code that I managed to analyze, Guile uses
> exactly the same representation of floating point numbers as the one I use
> in C -- namely, the double type. (I only use single-precision floats to
> express the input points, but I ran some tests using small integers on
> inputs, to make sure, that they are exactly representable -- and the
> results were still very different)
>
> Is there a way to make sure that the floating point operations in C behave
> identically as those in Guile?




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Double-precision floating point arithmetics behaves differently in Guile and in C
  2021-09-26 19:11 ` Matt Wette
@ 2021-09-27 14:04   ` Panicz Maciej Godek
  2021-10-16  9:19     ` Maxime Devos
  0 siblings, 1 reply; 8+ messages in thread
From: Panicz Maciej Godek @ 2021-09-27 14:04 UTC (permalink / raw)
  To: Matt Wette; +Cc: Guile User

niedz., 26 wrz 2021 o 21:11 Matt Wette <matt.wette@gmail.com> napisał(a):

> Your post is a little vague.  Are you comparing singular values vectors
> of A vs
> eigenvalues, eigenvectors of A'*A ?  Also, SVD is iterative and
> different algorithms may yield
> different results (and some of the SVD algorithms out there are not great).
>   Are you using the exact same algorithm but noticing that (op x y) in
> Guile
> behaves differently than (x op y) in C?
>
>
Yes.
I mean, I have used exactly the same algorithm (obtained by mechanically
transforming the scmutils code into C), and I've been looking at its
operation, and noticed that the results of floating point operations
differed between Guile and C for the same input data.

However, it turned out that it wasn't those differences that caused the
trouble, and I currently believe that the numerical differences can be
tolerable for my application.


>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Double-precision floating point arithmetics behaves differently in Guile and in C
  2021-09-27 14:04   ` Panicz Maciej Godek
@ 2021-10-16  9:19     ` Maxime Devos
  0 siblings, 0 replies; 8+ messages in thread
From: Maxime Devos @ 2021-10-16  9:19 UTC (permalink / raw)
  To: Panicz Maciej Godek, Matt Wette; +Cc: Guile User

Panicz Maciej Godek schreef op ma 27-09-2021 om 16:04 [+0200]:
> niedz., 26 wrz 2021 o 21:11 Matt Wette <matt.wette@gmail.com>
> napisał(a):
> 
> > Your post is a little vague.  Are you comparing singular values
> > vectors
> > of A vs
> > eigenvalues, eigenvectors of A'*A ?  Also, SVD is iterative and
> > different algorithms may yield
> > different results (and some of the SVD algorithms out there are not
> > great).
> >   Are you using the exact same algorithm but noticing that (op x y)
> > in
> > Guile
> > behaves differently than (x op y) in C?
> > 
> > 
> Yes.
> I mean, I have used exactly the same algorithm (obtained by
> mechanically
> transforming the scmutils code into C), and I've been looking at its
> operation, and noticed that the results of floating point operations
> differed between Guile and C for the same input data.

If this is on x86,
https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-fexcess-precision
might be relevant.

Greetings,
Maxime.




^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-10-16  9:19 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-26  8:45 Double-precision floating point arithmetics behaves differently in Guile and in C Panicz Maciej Godek
2021-09-26 11:55 ` Panicz Maciej Godek
2021-09-26 13:16   ` Eli Zaretskii
2021-09-26 16:48     ` Panicz Maciej Godek
2021-09-26 12:57 ` tomas
2021-09-26 19:11 ` Matt Wette
2021-09-27 14:04   ` Panicz Maciej Godek
2021-10-16  9:19     ` Maxime Devos

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).