* how does one debug a SEGV in scm_threads_prehistory?
From: Bruce Korb @ 2008-06-18 16:37 UTC
To: guile-devel
Our main development server was "upgraded" to 64 bits, but mostly still
runs 32 bit software, so this is from a 32 bit build on a 64 bit
platform. Naturally, this all works on 32 on 32 and on 64 on 64. But
with 32 on 64, not so well:
Program received signal SIGSEGV, Segmentation fault.
0xb7f482fe in scm_threads_prehistory () from /usr/lib/libguile.so.17
(gdb) bt
#0 0xb7f482fe in scm_threads_prehistory () from /usr/lib/libguile.so.17
#1 0xb7f4834b in scm_i_thread_sleep_for_gc () from /usr/lib/libguile.so.17
#2 0xb7f48375 in scm_i_thread_put_to_sleep () from /usr/lib/libguile.so.17
#3 0xb7f2f3e4 in scm_i_string_writable_chars () from /usr/lib/libguile.so.17
#4 0xb7f2f53d in scm_c_string_set_x () from /usr/lib/libguile.so.17
#5 0xb7f23839 in scm_read_token () from /usr/lib/libguile.so.17
#6 0xb7f23fcc in scm_lreadr () from /usr/lib/libguile.so.17
#7 0xb7f23c87 in scm_lreadrecparen () from /usr/lib/libguile.so.17
#8 0xb7f245d8 in scm_lreadr () from /usr/lib/libguile.so.17
#9 0xb7f23cfd in scm_lreadrecparen () from /usr/lib/libguile.so.17
#10 0xb7f245d8 in scm_lreadr () from /usr/lib/libguile.so.17
#11 0xb7f24c43 in scm_read () from /usr/lib/libguile.so.17
#12 0x08056634 in ag_scm_c_eval_string_from_file_line (
    pzExpr=0x8075ce0 "(if (> (string-length shell-cleanup) 0) (shell shell-cleanup) )",
    pzFile=0x8075cc8 "../../agen5/autogen.c", line=237)
    at ../../agen5/expGuile.c:113
#13 0x0804e03c in doneCheck () at ../../agen5/autogen.c:236
#14 0xb7d053b0 in exit () from /lib/tls/i686/cmov/libc.so.6
#15 0x0804dd56 in inner_main (argc=4, argv=0xbfb38274)
at ../../agen5/autogen.c:90
#16 0xb7ef9f72 in gh_enter () from /usr/lib/libguile.so.17
#17 0xb7f06224 in scm_boot_guile () from /usr/lib/libguile.so.17
#18 0xb7ed8bc2 in scm_char_upcase () from /usr/lib/libguile.so.17
#19 0xb7f4a025 in scm_c_catch () from /usr/lib/libguile.so.17
#20 0xb7ed9107 in scm_i_with_continuation_barrier ()
from /usr/lib/libguile.so.17
#21 0xb7ed91e1 in scm_c_with_continuation_barrier ()
from /usr/lib/libguile.so.17
#22 0xb7f49309 in scm_i_with_guile_and_parent () from /usr/lib/libguile.so.17
#23 0xb7f4935e in scm_with_guile () from /usr/lib/libguile.so.17
#24 0xb7f061bf in scm_boot_guile () from /usr/lib/libguile.so.17
#25 0xb7ef9f45 in gh_enter () from /usr/lib/libguile.so.17
#26 0x0804de00 in main (argc=4, argv=0xbfb38274) at ../../agen5/autogen.c:115
* Re: how does one debug a SEGV in scm_threads_prehistory?
From: Greg Troxel @ 2008-06-18 19:29 UTC
To: Bruce Korb; +Cc: guile-devel
> Our main development server was "upgraded" to 64 bits, but mostly
> still runs 32 bit software,
> so this is from a 32 bit build on a 64 bit platform. Naturally, this
> all works on 32 on 32 and
> on 64 on 64. But with 32 on 64, not so well:
I presume you are talking about Linux and going from i386 to
amd64/x86_64 (not sure which name is used in Linux). I would be
suspicious that the i386 binary is getting linked with some amd64 libs
somehow - to first order it would seem to be an OS bug if the emulated
i386 binary doesn't run the same as on i386. But I could certainly see
an allowable difference leading to triggering a latent bug in guile.
Can you trigger this with a simple example? I have i386 and now amd64
boxes, and would be curious to try the NetBSD i386 binary on amd64, as
well as native.
* Re: how does one debug a SEGV in scm_threads_prehistory?
From: Bruce Korb @ 2008-06-18 20:25 UTC
To: Greg Troxel; +Cc: guile-devel
On Wed, Jun 18, 2008 at 12:29 PM, Greg Troxel <gdt@ir.bbn.com> wrote:
> Our main development server was "upgraded" to 64 bits, but mostly
> still runs 32 bit software,
> so this is from a 32 bit build on a 64 bit platform. Naturally, this
> all works on 32 on 32 and
> on 64 on 64. But with 32 on 64, not so well:
>
> I presume you are talking about Linux and going from i386 to
> amd64/x86_64 (not sure which name is used in Linux). I would be
It depends on which Linux: SuSE is x86_64 and Debian amd64.
Unclear why the first name needed to be rethought. Whatever.
> suspicious that the i386 binary is getting linked with some amd64 libs
ldd showed which libguile, and it was the 32 bit flavor. (I did go there, too.
It's just hard to recount all the rabbit trails I've gone down....)
> somehow - to first order it would seem to be an OS bug if the emulated
> i386 binary doesn't run the same as on i386. But I could certainly see
> an allowable difference leading to triggering a latent bug in guile.
>
> Can you trigger this with a simple example? I have i386 and now amd64
> boxes, and would be curious to try the NetBSD i386 binary on amd64, as
> well as native.
"simple example" is always a stumbling block. Especially when it is happening
in libguile (which is not simple) and triggered by my app (which is as large
as Guile is). I'll see what I can do. Probably next week or so.
Thanks - Bruce
* Re: how does one debug a SEGV in scm_threads_prehistory?
From: Greg Troxel @ 2008-06-19 18:09 UTC
To: Bruce Korb; +Cc: guile-devel
"Bruce Korb" <bkorb@gnu.org> writes:
> It depends on which Linux: SuSE is x86_64 and Debian amd64.
> Unclear why the first name needed to be rethought. Whatever.
Thanks - I don't know why NetBSD chose amd64 either.
>> Can you trigger this with a simple example? I have i386 and now amd64
>> boxes, and would be curious to try the NetBSD i386 binary on amd64, as
>> well as native.
>
> "simple example" is always a stumbling block. Especially when it is
> happening in libguile (which is not simple) and triggered by my app
> (which is as large as Guile is). I'll see what I can do. Probably
> next week or so.
I installed NetBSD/i386 guile on NetBSD/amd64, but apparently the
XEN3_DOMU kernel doesn't have i386 emulation. I'll perhaps add it, or
find another box.
* math tests on amd64
From: Greg Troxel @ 2008-06-18 19:56 UTC
To: guile-devel
I ran 'make check' on guile on NetBSD-current on amd64, and got a
failure:
guile> (version)
"1.8.5"
guile> (eqv? 0.0 (exp -inf.0))
#f
guile> (exp -inf.0)
+nan.0
On i386 and sparc64 this passes. I ran paranoia and that passed (with
not even flaws). Does this test pass for others on other operating
systems on amd64? If so, I should look into whether it's a NetBSD bug,
variance in floating point flags, or a guile bug.
So far this looks like a failure of NetBSD on amd64 to follow POSIX:
http://www.opengroup.org/onlinepubs/009695399/functions/exp.html
* Re: math tests on amd64
From: Neil Jerram @ 2008-06-19 13:18 UTC
To: Greg Troxel; +Cc: guile-devel
2008/6/18 Greg Troxel <gdt@ir.bbn.com>:
> I ran 'make check' on guile on NetBSD-current on amd64, and got a
> failure:
>
> guile> (version)
> "1.8.5"
> guile> (eqv? 0.0 (exp -inf.0))
> #f
> guile> (exp -inf.0)
> +nan.0
>
> On i386 and sparc64 this passes. I ran paranoia and that passed (with
> not even flaws). Does this test pass for others on other operating
> systems on amd64?
I'm afraid I have no data there.
Mathematically, it's correct for (exp -inf.0) to be 0, isn't it?
Regards,
Neil
* Re: math tests on amd64
From: Greg Troxel @ 2008-06-19 18:03 UTC
To: Neil Jerram; +Cc: guile-devel
"Neil Jerram" <neiljerram@googlemail.com> writes:
>> guile> (version)
>> "1.8.5"
>> guile> (eqv? 0.0 (exp -inf.0))
>> #f
>> guile> (exp -inf.0)
>> +nan.0
>>
>> On i386 and sparc64 this passes. I ran paranoia and that passed (with
>> not even flaws). Does this test pass for others on other operating
>> systems on amd64?
>
> Mathematically, it's correct for (exp -inf.0) to be 0, isn't it?
I have since found that POSIX.1 specifies exp(3) the way guile is
supposed to behave, and that function from C on NetBSD/amd64 also fails.
So it's a NetBSD/amd64 problem, not a guile problem - sorry for the
noise.
* Re: math tests on amd64
From: Marijn Schouten (hkBst) @ 2008-06-20 17:44 UTC
To: Greg Troxel; +Cc: guile-devel
Greg Troxel wrote:
> I ran 'make check' on guile on NetBSD-current on amd64, and got a
> failure:
>
> guile> (version)
> "1.8.5"
> guile> (eqv? 0.0 (exp -inf.0))
> #f
> guile> (exp -inf.0)
> +nan.0
On my amd64 linux system I get:
guile> (version)
"1.8.5"
guile> (eqv? 0.0 (exp -inf.0))
#t
guile> (exp -inf.0)
0.0
HTH, Marijn
> On i386 and sparc64 this passes. I ran paranoia and that passed (with
> not even flaws). Does this test pass for others on other operating
> systems on amd64? If so, I should look into whether it's a NetBSD bug,
> variance in floating point flags, or a guile bug.
>
> So far this looks like a failure of NetBSD on amd64 to follow POSIX:
> http://www.opengroup.org/onlinepubs/009695399/functions/exp.html
--
Marijn Schouten (hkBst), Gentoo Lisp project, Gentoo ML
<http://www.gentoo.org/proj/en/lisp/>, #gentoo-{lisp,ml} on FreeNode
* Re: how does one debug a SEGV in scm_threads_prehistory?
From: Neil Jerram @ 2008-06-19 13:14 UTC
To: Bruce Korb; +Cc: guile-devel
2008/6/18 Bruce Korb <bkorb@gnu.org>:
> Our main development server was "upgraded" to 64 bits, but mostly
> still runs 32 bit software,
> so this is from a 32 bit build on a 64 bit platform. Naturally, this
> all works on 32 on 32 and
> on 64 on 64. But with 32 on 64, not so well:
>
> Program received signal SIGSEGV, Segmentation fault.
> 0xb7f482fe in scm_threads_prehistory () from /usr/lib/libguile.so.17
> (gdb) bt
> #0 0xb7f482fe in scm_threads_prehistory () from /usr/lib/libguile.so.17
> #1 0xb7f4834b in scm_i_thread_sleep_for_gc () from /usr/lib/libguile.so.17
> #2 0xb7f48375 in scm_i_thread_put_to_sleep () from /usr/lib/libguile.so.17
> #3 0xb7f2f3e4 in scm_i_string_writable_chars () from /usr/lib/libguile.so.17
> #4 0xb7f2f53d in scm_c_string_set_x () from /usr/lib/libguile.so.17
> #5 0xb7f23839 in scm_read_token () from /usr/lib/libguile.so.17
[...]
There's definitely something very odd here, because
scm_threads_prehistory() should only be called as part of Guile's
start of day (once-only) initialization.
You could focus on debugging why this happens, using GDB: set a
breakpoint on scm_threads_prehistory, then work out who calls it and
why. I doubt that will reveal the underlying problem here, but it may
give you (us) a clue about it.
Regards,
Neil