* bug#14141: Abort in RTL VM
@ 2013-04-04 19:22 Noah Lavine
2013-04-04 19:44 ` Noah Lavine
0 siblings, 1 reply; 5+ messages in thread
From: Noah Lavine @ 2013-04-04 19:22 UTC (permalink / raw)
To: 14141
[-- Attachment #1: Type: text/plain, Size: 4922 bytes --]
Hello,
I'm actually testing on the wip-rtl-cps branch, but this error involves
code that I believe is the same on that branch and on the wip-rtl branch.
Try opening a new Guile and doing the following:
scheme@(guile-user)> (use-modules (system vm rtl))
scheme@(guile-user)> (assemble-program '((begin-program foo)
(assert-nargs-ge 0)
(reserve-locals 4)
(bind-rest 0)
(box 1 0)
(cache-current-module! 2 foo)
(cached-toplevel-ref 2 foo car)
(box-ref 3 1)
(mov 0 3)
(tail-call 1 2)
(end-program)))
... ... ... ... ... ... ... ... ... ... $1 = #<rtl-program dcec90 609bc0>
scheme@(guile-user)> ($1 'hello)
The expected result is
$2 = hello
What I actually get is,
Program received signal SIGABRT, Aborted.
0x00007ffff7440425 in raise () from /lib/x86_64-linux-gnu/libc.so.6
The full backtrace is below. The interesting part is that it seems to be
tripping the check at libguile/vm-engine.c:1868, which checks whether an
object is a variable before doing a box-ref on it. When I look at it in
GDB, it seems that whatever is at register 1 does not satisfy
scm_variable_p, although I'm not very experienced with debugging Guile.
However, I am somewhat surprised at this, because I have used boxes and
box-ref before in the past with no trouble.
Another surprising thing is that if I open Guile, do some other things for
a while, and then run this code, the problem sometimes doesn't appear. That
is especially disturbing.
Does anyone have any idea where the issue is or how I should find it?
Thanks,
Noah
Here's the backtrace:
(gdb) bt
#0 0x00007ffff7440425 in raise ()
from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007ffff7443b8b in abort ()
from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007ffff7b30986 in rtl_vm_debug_engine (vm=0x6a6860,
program=0xdcec90, argv=0x6a9548, nargs_=1) at vm-engine.c:1868
#3 0x00007ffff7b1aaf1 in vm_debug_engine (vm=0x6a6860,
program=0xdcec90, argv=0x7fffffffd028, nargs=1) at vm-engine.c:419
#4 0x00007ffff7b38f6c in scm_c_vm_run (vm=0x6a6860,
program=0x75dbe0, argv=0x7fffffffd028, nargs=1) at vm.c:791
#5 0x00007ffff7a5bff3 in scm_primitive_eval (exp=0x7fe7f0)
at eval.c:691
#6 0x00007ffff7a5c0ad in scm_eval (exp=0x7fe7f0,
module_or_state=0x7e0090) at eval.c:725
#7 0x00007ffff7acef2d in scm_shell (argc=1, argv=0x7fffffffe478)
at script.c:441
#8 0x0000000000400bd0 in inner_main (closure=0x0, argc=1,
argv=0x7fffffffe478) at guile.c:62
#9 0x00007ffff7a82663 in invoke_main_func (body_data=0x7fffffffe350)
at init.c:336
#10 0x00007ffff7a563c9 in c_body (d=0x7fffffffe220)
at continuations.c:513
#11 0x00007ffff7afc96c in apply_catch_closure (clo=0x81b360,
args=0x304) at throw.c:146
#12 0x00007ffff7acf739 in apply_1 (smob=0x81b360, a=0x304)
at smob.c:141
#13 0x00007ffff7b05cc8 in vm_regular_engine (vm=0x6a6860,
program=0x7443e0, argv=0x7fffffffe0c0, nargs=2)
at vm-i-system.c:873
#14 0x00007ffff7b38f6c in scm_c_vm_run (vm=0x6a6860,
program=0x79d5d0, argv=0x7fffffffe0c0, nargs=4) at vm.c:791
#15 0x00007ffff7a5b793 in scm_call_4 (proc=0x79d5d0, arg1=0x404,
arg2=0x81b360, arg3=0x81b340, arg4=0x81b320) at eval.c:513
#16 0x00007ffff7afc767 in scm_catch_with_pre_unwind_handler (
key=0x404, thunk=0x81b360, handler=0x81b340,
pre_unwind_handler=0x81b320) at throw.c:86
#17 0x00007ffff7afca44 in scm_c_catch (tag=0x404,
body=0x7ffff7a563a1 <c_body>, body_data=0x7fffffffe220,
handler=0x7ffff7a563d8 <c_handler>, handler_data=0x7fffffffe220,
pre_unwind_handler=0x7ffff7a5642c <pre_unwind_handler>,
pre_unwind_handler_data=0x751160) at throw.c:213
#18 0x00007ffff7a5623d in scm_i_with_continuation_barrier (
body=0x7ffff7a563a1 <c_body>, body_data=0x7fffffffe220,
handler=0x7ffff7a563d8 <c_handler>, handler_data=0x7fffffffe220,
pre_unwind_handler=0x7ffff7a5642c <pre_unwind_handler>,
pre_unwind_handler_data=0x751160) at continuations.c:451
#19 0x00007ffff7a564c3 in scm_c_with_continuation_barrier (
func=0x7ffff7a82613 <invoke_main_func>, data=0x7fffffffe350)
at continuations.c:547
#20 0x00007ffff7af97ba in with_guile_and_parent (base=0x7fffffffe290,
base@entry=<error reading variable: value has been optimized out>,
data=0x7fffffffe2d0,
data@entry=<error reading variable: value has been optimized out>)
at threads.c:907
#21 0x00007ffff71b6f55 in GC_call_with_stack_base (
fn=<optimized out>, arg=<optimized out>) at misc.c:1553
#22 0x00007ffff7af9894 in scm_i_with_guile_and_parent (
func=0x7ffff7a82613 <invoke_main_func>, data=0x7fffffffe350,
parent=0x0) at threads.c:950
#23 0x00007ffff7af98c0 in scm_with_guile (
func=0x7ffff7a82613 <invoke_main_func>, data=0x7fffffffe350)
at threads.c:956
#24 0x00007ffff7a825f4 in scm_boot_guile (argc=1,
argv=0x7fffffffe478, main_func=0x400bac <inner_main>, closure=0x0)
at init.c:319
#25 0x0000000000400c35 in main (argc=1, argv=0x7fffffffe478)
at guile.c:81
[-- Attachment #2: Type: text/html, Size: 5580 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread
* bug#14141: Abort in RTL VM
2013-04-04 19:22 bug#14141: Abort in RTL VM Noah Lavine
@ 2013-04-04 19:44 ` Noah Lavine
2013-04-05 17:29 ` Noah Lavine
0 siblings, 1 reply; 5+ messages in thread
From: Noah Lavine @ 2013-04-04 19:44 UTC (permalink / raw)
To: 14141
[-- Attachment #1: Type: text/plain, Size: 5585 bytes --]
Oh, I forgot to mention one important fact. I *do* get the expected result
if I eliminate the stuff with boxes. This works fine:
scheme@(guile-user)> (assemble-program '((begin-program foo)
(assert-nargs-ge 0)
(reserve-locals 4)
(bind-rest 0)
(cache-current-module! 2 foo)
(cached-toplevel-ref 2 foo car)
(tail-call 1 2)
(end-program)))
Best,
Noah
On Thu, Apr 4, 2013 at 3:22 PM, Noah Lavine <noah.b.lavine@gmail.com> wrote:
> Hello,
>
> I'm actually testing on the wip-rtl-cps branch, but this error involves
> code that I believe is the same on that branch and on the wip-rtl branch.
> Try opening a new Guile and doing the following:
>
> scheme@(guile-user)> (use-modules (system vm rtl))
> scheme@(guile-user)> (assemble-program '((begin-program foo)
> (assert-nargs-ge 0)
> (reserve-locals 4)
> (bind-rest 0)
> (box 1 0)
> (cache-current-module! 2 foo)
> (cached-toplevel-ref 2 foo car)
> (box-ref 3 1)
> (mov 0 3)
> (tail-call 1 2)
> (end-program)))
> ... ... ... ... ... ... ... ... ... ... $1 = #<rtl-program dcec90 609bc0>
> scheme@(guile-user)> ($1 'hello)
>
> The expected result is
> $2 = hello
>
> What I actually get is,
>
> Program received signal SIGABRT, Aborted.
> 0x00007ffff7440425 in raise () from /lib/x86_64-linux-gnu/libc.so.6
>
> The full backtrace is below. The interesting part is that it seems to be
> tripping the check at libguile/vm-engine.c:1868, which checks whether an
> object is a variable before doing a box-ref on it. When I look at it in
> GDB, it seems that whatever is at register 1 does not satisfy
> scm_variable_p, although I'm not very experienced with debugging Guile.
> However, I am somewhat surprised at this, because I have used boxes and
> box-ref before in the past with no trouble.
>
> Another surprising thing is that if I open Guile, do some other things for
> a while, and then run this code, the problem sometimes doesn't appear. That
> is especially disturbing.
>
> Does anyone have any idea where the issue is or how I should find it?
>
> Thanks,
> Noah
>
> Here's the backtrace:
>
> (gdb) bt
> #0 0x00007ffff7440425 in raise ()
> from /lib/x86_64-linux-gnu/libc.so.6
> #1 0x00007ffff7443b8b in abort ()
> from /lib/x86_64-linux-gnu/libc.so.6
> #2 0x00007ffff7b30986 in rtl_vm_debug_engine (vm=0x6a6860,
> program=0xdcec90, argv=0x6a9548, nargs_=1) at vm-engine.c:1868
> #3 0x00007ffff7b1aaf1 in vm_debug_engine (vm=0x6a6860,
> program=0xdcec90, argv=0x7fffffffd028, nargs=1) at vm-engine.c:419
> #4 0x00007ffff7b38f6c in scm_c_vm_run (vm=0x6a6860,
> program=0x75dbe0, argv=0x7fffffffd028, nargs=1) at vm.c:791
> #5 0x00007ffff7a5bff3 in scm_primitive_eval (exp=0x7fe7f0)
> at eval.c:691
> #6 0x00007ffff7a5c0ad in scm_eval (exp=0x7fe7f0,
> module_or_state=0x7e0090) at eval.c:725
> #7 0x00007ffff7acef2d in scm_shell (argc=1, argv=0x7fffffffe478)
> at script.c:441
> #8 0x0000000000400bd0 in inner_main (closure=0x0, argc=1,
> argv=0x7fffffffe478) at guile.c:62
> #9 0x00007ffff7a82663 in invoke_main_func (body_data=0x7fffffffe350)
> at init.c:336
> #10 0x00007ffff7a563c9 in c_body (d=0x7fffffffe220)
> at continuations.c:513
> #11 0x00007ffff7afc96c in apply_catch_closure (clo=0x81b360,
> args=0x304) at throw.c:146
> #12 0x00007ffff7acf739 in apply_1 (smob=0x81b360, a=0x304)
> at smob.c:141
> #13 0x00007ffff7b05cc8 in vm_regular_engine (vm=0x6a6860,
> program=0x7443e0, argv=0x7fffffffe0c0, nargs=2)
> at vm-i-system.c:873
> #14 0x00007ffff7b38f6c in scm_c_vm_run (vm=0x6a6860,
> program=0x79d5d0, argv=0x7fffffffe0c0, nargs=4) at vm.c:791
> #15 0x00007ffff7a5b793 in scm_call_4 (proc=0x79d5d0, arg1=0x404,
> arg2=0x81b360, arg3=0x81b340, arg4=0x81b320) at eval.c:513
> #16 0x00007ffff7afc767 in scm_catch_with_pre_unwind_handler (
> key=0x404, thunk=0x81b360, handler=0x81b340,
> pre_unwind_handler=0x81b320) at throw.c:86
> #17 0x00007ffff7afca44 in scm_c_catch (tag=0x404,
> body=0x7ffff7a563a1 <c_body>, body_data=0x7fffffffe220,
> handler=0x7ffff7a563d8 <c_handler>, handler_data=0x7fffffffe220,
> pre_unwind_handler=0x7ffff7a5642c <pre_unwind_handler>,
> pre_unwind_handler_data=0x751160) at throw.c:213
> #18 0x00007ffff7a5623d in scm_i_with_continuation_barrier (
> body=0x7ffff7a563a1 <c_body>, body_data=0x7fffffffe220,
> handler=0x7ffff7a563d8 <c_handler>, handler_data=0x7fffffffe220,
> pre_unwind_handler=0x7ffff7a5642c <pre_unwind_handler>,
> pre_unwind_handler_data=0x751160) at continuations.c:451
> #19 0x00007ffff7a564c3 in scm_c_with_continuation_barrier (
> func=0x7ffff7a82613 <invoke_main_func>, data=0x7fffffffe350)
> at continuations.c:547
> #20 0x00007ffff7af97ba in with_guile_and_parent (base=0x7fffffffe290,
> base@entry=<error reading variable: value has been optimized out>,
> data=0x7fffffffe2d0,
> data@entry=<error reading variable: value has been optimized out>)
> at threads.c:907
> #21 0x00007ffff71b6f55 in GC_call_with_stack_base (
> fn=<optimized out>, arg=<optimized out>) at misc.c:1553
> #22 0x00007ffff7af9894 in scm_i_with_guile_and_parent (
> func=0x7ffff7a82613 <invoke_main_func>, data=0x7fffffffe350,
> parent=0x0) at threads.c:950
> #23 0x00007ffff7af98c0 in scm_with_guile (
> func=0x7ffff7a82613 <invoke_main_func>, data=0x7fffffffe350)
> at threads.c:956
> #24 0x00007ffff7a825f4 in scm_boot_guile (argc=1,
> argv=0x7fffffffe478, main_func=0x400bac <inner_main>, closure=0x0)
> at init.c:319
> #25 0x0000000000400c35 in main (argc=1, argv=0x7fffffffe478)
> at guile.c:81
>
>
>
[-- Attachment #2: Type: text/html, Size: 6576 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread
* bug#14141: Abort in RTL VM
2013-04-04 19:44 ` Noah Lavine
@ 2013-04-05 17:29 ` Noah Lavine
2013-04-06 14:49 ` Stefan Israelsson Tampe
0 siblings, 1 reply; 5+ messages in thread
From: Noah Lavine @ 2013-04-05 17:29 UTC (permalink / raw)
To: Noah Lavine; +Cc: 14141
[-- Attachment #1: Type: text/plain, Size: 6435 bytes --]
Hello,
Just a quick update - it seems to be related to the order of the
reserve-locals and bind-rest calls. If I reverse those, the problem goes
away. However, I still don't know why this happens, and why the problem
doesn't happen when the variables aren't boxed.
I think there might be something weird going on in bind-rest when the
argument is zero. It has a loop like this:
while (nargs-- > dst) { ... }.
When dst is zero, doesn't nargs end up getting set to -1? (Which, since
it's unsigned, is really 2^32 - 1.) That might make any later instructions
that use nargs (like reserve-locals) do odd things.
Noah
On Thu, Apr 4, 2013 at 3:44 PM, Noah Lavine <noah.b.lavine@gmail.com> wrote:
> Oh, I forgot to mention one important fact. I *do* get the expected result
> if I eliminate the stuff with boxes. This works fine:
>
>
> scheme@(guile-user)> (assemble-program '((begin-program foo)
> (assert-nargs-ge 0)
> (reserve-locals 4)
> (bind-rest 0)
> (cache-current-module! 2 foo)
> (cached-toplevel-ref 2 foo car)
> (tail-call 1 2)
> (end-program)))
>
> Best,
> Noah
>
> On Thu, Apr 4, 2013 at 3:22 PM, Noah Lavine <noah.b.lavine@gmail.com>wrote:
>
>> Hello,
>>
>> I'm actually testing on the wip-rtl-cps branch, but this error involves
>> code that I believe is the same on that branch and on the wip-rtl branch.
>> Try opening a new Guile and doing the following:
>>
>> scheme@(guile-user)> (use-modules (system vm rtl))
>> scheme@(guile-user)> (assemble-program '((begin-program foo)
>> (assert-nargs-ge 0)
>> (reserve-locals 4)
>> (bind-rest 0)
>> (box 1 0)
>> (cache-current-module! 2 foo)
>> (cached-toplevel-ref 2 foo car)
>> (box-ref 3 1)
>> (mov 0 3)
>> (tail-call 1 2)
>> (end-program)))
>> ... ... ... ... ... ... ... ... ... ... $1 = #<rtl-program dcec90 609bc0>
>> scheme@(guile-user)> ($1 'hello)
>>
>> The expected result is
>> $2 = hello
>>
>> What I actually get is,
>>
>> Program received signal SIGABRT, Aborted.
>> 0x00007ffff7440425 in raise () from /lib/x86_64-linux-gnu/libc.so.6
>>
>> The full backtrace is below. The interesting part is that it seems to be
>> tripping the check at libguile/vm-engine.c:1868, which checks whether an
>> object is a variable before doing a box-ref on it. When I look at it in
>> GDB, it seems that whatever is at register 1 does not satisfy
>> scm_variable_p, although I'm not very experienced with debugging Guile.
>> However, I am somewhat surprised at this, because I have used boxes and
>> box-ref before in the past with no trouble.
>>
>> Another surprising thing is that if I open Guile, do some other things
>> for a while, and then run this code, the problem sometimes doesn't appear.
>> That is especially disturbing.
>>
>> Does anyone have any idea where the issue is or how I should find it?
>>
>> Thanks,
>> Noah
>>
>> Here's the backtrace:
>>
>> (gdb) bt
>> #0 0x00007ffff7440425 in raise ()
>> from /lib/x86_64-linux-gnu/libc.so.6
>> #1 0x00007ffff7443b8b in abort ()
>> from /lib/x86_64-linux-gnu/libc.so.6
>> #2 0x00007ffff7b30986 in rtl_vm_debug_engine (vm=0x6a6860,
>> program=0xdcec90, argv=0x6a9548, nargs_=1) at vm-engine.c:1868
>> #3 0x00007ffff7b1aaf1 in vm_debug_engine (vm=0x6a6860,
>> program=0xdcec90, argv=0x7fffffffd028, nargs=1) at vm-engine.c:419
>> #4 0x00007ffff7b38f6c in scm_c_vm_run (vm=0x6a6860,
>> program=0x75dbe0, argv=0x7fffffffd028, nargs=1) at vm.c:791
>> #5 0x00007ffff7a5bff3 in scm_primitive_eval (exp=0x7fe7f0)
>> at eval.c:691
>> #6 0x00007ffff7a5c0ad in scm_eval (exp=0x7fe7f0,
>> module_or_state=0x7e0090) at eval.c:725
>> #7 0x00007ffff7acef2d in scm_shell (argc=1, argv=0x7fffffffe478)
>> at script.c:441
>> #8 0x0000000000400bd0 in inner_main (closure=0x0, argc=1,
>> argv=0x7fffffffe478) at guile.c:62
>> #9 0x00007ffff7a82663 in invoke_main_func (body_data=0x7fffffffe350)
>> at init.c:336
>> #10 0x00007ffff7a563c9 in c_body (d=0x7fffffffe220)
>> at continuations.c:513
>> #11 0x00007ffff7afc96c in apply_catch_closure (clo=0x81b360,
>> args=0x304) at throw.c:146
>> #12 0x00007ffff7acf739 in apply_1 (smob=0x81b360, a=0x304)
>> at smob.c:141
>> #13 0x00007ffff7b05cc8 in vm_regular_engine (vm=0x6a6860,
>> program=0x7443e0, argv=0x7fffffffe0c0, nargs=2)
>> at vm-i-system.c:873
>> #14 0x00007ffff7b38f6c in scm_c_vm_run (vm=0x6a6860,
>> program=0x79d5d0, argv=0x7fffffffe0c0, nargs=4) at vm.c:791
>> #15 0x00007ffff7a5b793 in scm_call_4 (proc=0x79d5d0, arg1=0x404,
>> arg2=0x81b360, arg3=0x81b340, arg4=0x81b320) at eval.c:513
>> #16 0x00007ffff7afc767 in scm_catch_with_pre_unwind_handler (
>> key=0x404, thunk=0x81b360, handler=0x81b340,
>> pre_unwind_handler=0x81b320) at throw.c:86
>> #17 0x00007ffff7afca44 in scm_c_catch (tag=0x404,
>> body=0x7ffff7a563a1 <c_body>, body_data=0x7fffffffe220,
>> handler=0x7ffff7a563d8 <c_handler>, handler_data=0x7fffffffe220,
>> pre_unwind_handler=0x7ffff7a5642c <pre_unwind_handler>,
>> pre_unwind_handler_data=0x751160) at throw.c:213
>> #18 0x00007ffff7a5623d in scm_i_with_continuation_barrier (
>> body=0x7ffff7a563a1 <c_body>, body_data=0x7fffffffe220,
>> handler=0x7ffff7a563d8 <c_handler>, handler_data=0x7fffffffe220,
>> pre_unwind_handler=0x7ffff7a5642c <pre_unwind_handler>,
>> pre_unwind_handler_data=0x751160) at continuations.c:451
>> #19 0x00007ffff7a564c3 in scm_c_with_continuation_barrier (
>> func=0x7ffff7a82613 <invoke_main_func>, data=0x7fffffffe350)
>> at continuations.c:547
>> #20 0x00007ffff7af97ba in with_guile_and_parent (base=0x7fffffffe290,
>> base@entry=<error reading variable: value has been optimized out>,
>> data=0x7fffffffe2d0,
>> data@entry=<error reading variable: value has been optimized out>)
>> at threads.c:907
>> #21 0x00007ffff71b6f55 in GC_call_with_stack_base (
>> fn=<optimized out>, arg=<optimized out>) at misc.c:1553
>> #22 0x00007ffff7af9894 in scm_i_with_guile_and_parent (
>> func=0x7ffff7a82613 <invoke_main_func>, data=0x7fffffffe350,
>> parent=0x0) at threads.c:950
>> #23 0x00007ffff7af98c0 in scm_with_guile (
>> func=0x7ffff7a82613 <invoke_main_func>, data=0x7fffffffe350)
>> at threads.c:956
>> #24 0x00007ffff7a825f4 in scm_boot_guile (argc=1,
>> argv=0x7fffffffe478, main_func=0x400bac <inner_main>, closure=0x0)
>> at init.c:319
>> #25 0x0000000000400c35 in main (argc=1, argv=0x7fffffffe478)
>> at guile.c:81
>>
>>
>>
>
[-- Attachment #2: Type: text/html, Size: 7820 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread
* bug#14141: Abort in RTL VM
2013-04-05 17:29 ` Noah Lavine
@ 2013-04-06 14:49 ` Stefan Israelsson Tampe
2013-04-12 17:50 ` Noah Lavine
0 siblings, 1 reply; 5+ messages in thread
From: Stefan Israelsson Tampe @ 2013-04-06 14:49 UTC (permalink / raw)
To: Noah Lavine; +Cc: 14141
Yeah, you really found the problem.
I would put a if(nargs) to guard the while just to make it more robust.
/Stefan
On Fri, Apr 5, 2013 at 7:29 PM, Noah Lavine <noah.b.lavine@gmail.com> wrote:
> Hello,
>
> Just a quick update - it seems to be related to the order of the
> reserve-locals and bind-rest calls. If I reverse those, the problem goes
> away. However, I still don't know why this happens, and why the problem
> doesn't happen when the variables aren't boxed.
>
> I think there might be something weird going on in bind-rest when the
> argument is zero. It has a loop like this:
>
> while (nargs-- > dst) { ... }.
>
> When dst is zero, doesn't nargs end up getting set to -1? (Which, since it's
> unsigned, is really 2^32 - 1.) That might make any later instructions that
> use nargs (like reserve-locals) do odd things.
>
> Noah
>
>
> On Thu, Apr 4, 2013 at 3:44 PM, Noah Lavine <noah.b.lavine@gmail.com> wrote:
>>
>> Oh, I forgot to mention one important fact. I *do* get the expected result
>> if I eliminate the stuff with boxes. This works fine:
>>
>>
>> scheme@(guile-user)> (assemble-program '((begin-program foo)
>> (assert-nargs-ge 0)
>> (reserve-locals 4)
>> (bind-rest 0)
>> (cache-current-module! 2 foo)
>> (cached-toplevel-ref 2 foo car)
>> (tail-call 1 2)
>> (end-program)))
>>
>> Best,
>> Noah
>>
>> On Thu, Apr 4, 2013 at 3:22 PM, Noah Lavine <noah.b.lavine@gmail.com>
>> wrote:
>>>
>>> Hello,
>>>
>>> I'm actually testing on the wip-rtl-cps branch, but this error involves
>>> code that I believe is the same on that branch and on the wip-rtl branch.
>>> Try opening a new Guile and doing the following:
>>>
>>> scheme@(guile-user)> (use-modules (system vm rtl))
>>> scheme@(guile-user)> (assemble-program '((begin-program foo)
>>> (assert-nargs-ge 0)
>>> (reserve-locals 4)
>>> (bind-rest 0)
>>> (box 1 0)
>>> (cache-current-module! 2 foo)
>>> (cached-toplevel-ref 2 foo car)
>>> (box-ref 3 1)
>>> (mov 0 3)
>>> (tail-call 1 2)
>>> (end-program)))
>>> ... ... ... ... ... ... ... ... ... ... $1 = #<rtl-program dcec90 609bc0>
>>> scheme@(guile-user)> ($1 'hello)
>>>
>>> The expected result is
>>> $2 = hello
>>>
>>> What I actually get is,
>>>
>>> Program received signal SIGABRT, Aborted.
>>> 0x00007ffff7440425 in raise () from /lib/x86_64-linux-gnu/libc.so.6
>>>
>>> The full backtrace is below. The interesting part is that it seems to be
>>> tripping the check at libguile/vm-engine.c:1868, which checks whether an
>>> object is a variable before doing a box-ref on it. When I look at it in GDB,
>>> it seems that whatever is at register 1 does not satisfy scm_variable_p,
>>> although I'm not very experienced with debugging Guile. However, I am
>>> somewhat surprised at this, because I have used boxes and box-ref before in
>>> the past with no trouble.
>>>
>>> Another surprising thing is that if I open Guile, do some other things
>>> for a while, and then run this code, the problem sometimes doesn't appear.
>>> That is especially disturbing.
>>>
>>> Does anyone have any idea where the issue is or how I should find it?
>>>
>>> Thanks,
>>> Noah
>>>
>>> Here's the backtrace:
>>>
>>> (gdb) bt
>>> #0 0x00007ffff7440425 in raise ()
>>> from /lib/x86_64-linux-gnu/libc.so.6
>>> #1 0x00007ffff7443b8b in abort ()
>>> from /lib/x86_64-linux-gnu/libc.so.6
>>> #2 0x00007ffff7b30986 in rtl_vm_debug_engine (vm=0x6a6860,
>>> program=0xdcec90, argv=0x6a9548, nargs_=1) at vm-engine.c:1868
>>> #3 0x00007ffff7b1aaf1 in vm_debug_engine (vm=0x6a6860,
>>> program=0xdcec90, argv=0x7fffffffd028, nargs=1) at vm-engine.c:419
>>> #4 0x00007ffff7b38f6c in scm_c_vm_run (vm=0x6a6860,
>>> program=0x75dbe0, argv=0x7fffffffd028, nargs=1) at vm.c:791
>>> #5 0x00007ffff7a5bff3 in scm_primitive_eval (exp=0x7fe7f0)
>>> at eval.c:691
>>> #6 0x00007ffff7a5c0ad in scm_eval (exp=0x7fe7f0,
>>> module_or_state=0x7e0090) at eval.c:725
>>> #7 0x00007ffff7acef2d in scm_shell (argc=1, argv=0x7fffffffe478)
>>> at script.c:441
>>> #8 0x0000000000400bd0 in inner_main (closure=0x0, argc=1,
>>> argv=0x7fffffffe478) at guile.c:62
>>> #9 0x00007ffff7a82663 in invoke_main_func (body_data=0x7fffffffe350)
>>> at init.c:336
>>> #10 0x00007ffff7a563c9 in c_body (d=0x7fffffffe220)
>>> at continuations.c:513
>>> #11 0x00007ffff7afc96c in apply_catch_closure (clo=0x81b360,
>>> args=0x304) at throw.c:146
>>> #12 0x00007ffff7acf739 in apply_1 (smob=0x81b360, a=0x304)
>>> at smob.c:141
>>> #13 0x00007ffff7b05cc8 in vm_regular_engine (vm=0x6a6860,
>>> program=0x7443e0, argv=0x7fffffffe0c0, nargs=2)
>>> at vm-i-system.c:873
>>> #14 0x00007ffff7b38f6c in scm_c_vm_run (vm=0x6a6860,
>>> program=0x79d5d0, argv=0x7fffffffe0c0, nargs=4) at vm.c:791
>>> #15 0x00007ffff7a5b793 in scm_call_4 (proc=0x79d5d0, arg1=0x404,
>>> arg2=0x81b360, arg3=0x81b340, arg4=0x81b320) at eval.c:513
>>> #16 0x00007ffff7afc767 in scm_catch_with_pre_unwind_handler (
>>> key=0x404, thunk=0x81b360, handler=0x81b340,
>>> pre_unwind_handler=0x81b320) at throw.c:86
>>> #17 0x00007ffff7afca44 in scm_c_catch (tag=0x404,
>>> body=0x7ffff7a563a1 <c_body>, body_data=0x7fffffffe220,
>>> handler=0x7ffff7a563d8 <c_handler>, handler_data=0x7fffffffe220,
>>> pre_unwind_handler=0x7ffff7a5642c <pre_unwind_handler>,
>>> pre_unwind_handler_data=0x751160) at throw.c:213
>>> #18 0x00007ffff7a5623d in scm_i_with_continuation_barrier (
>>> body=0x7ffff7a563a1 <c_body>, body_data=0x7fffffffe220,
>>> handler=0x7ffff7a563d8 <c_handler>, handler_data=0x7fffffffe220,
>>> pre_unwind_handler=0x7ffff7a5642c <pre_unwind_handler>,
>>> pre_unwind_handler_data=0x751160) at continuations.c:451
>>> #19 0x00007ffff7a564c3 in scm_c_with_continuation_barrier (
>>> func=0x7ffff7a82613 <invoke_main_func>, data=0x7fffffffe350)
>>> at continuations.c:547
>>> #20 0x00007ffff7af97ba in with_guile_and_parent (base=0x7fffffffe290,
>>> base@entry=<error reading variable: value has been optimized out>,
>>> data=0x7fffffffe2d0,
>>> data@entry=<error reading variable: value has been optimized out>)
>>> at threads.c:907
>>> #21 0x00007ffff71b6f55 in GC_call_with_stack_base (
>>> fn=<optimized out>, arg=<optimized out>) at misc.c:1553
>>> #22 0x00007ffff7af9894 in scm_i_with_guile_and_parent (
>>> func=0x7ffff7a82613 <invoke_main_func>, data=0x7fffffffe350,
>>> parent=0x0) at threads.c:950
>>> #23 0x00007ffff7af98c0 in scm_with_guile (
>>> func=0x7ffff7a82613 <invoke_main_func>, data=0x7fffffffe350)
>>> at threads.c:956
>>> #24 0x00007ffff7a825f4 in scm_boot_guile (argc=1,
>>> argv=0x7fffffffe478, main_func=0x400bac <inner_main>, closure=0x0)
>>> at init.c:319
>>> #25 0x0000000000400c35 in main (argc=1, argv=0x7fffffffe478)
>>> at guile.c:81
>>>
>>>
>>
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* bug#14141: Abort in RTL VM
2013-04-06 14:49 ` Stefan Israelsson Tampe
@ 2013-04-12 17:50 ` Noah Lavine
0 siblings, 0 replies; 5+ messages in thread
From: Noah Lavine @ 2013-04-12 17:50 UTC (permalink / raw)
To: Stefan Israelsson Tampe; +Cc: 14141
[-- Attachment #1: Type: text/plain, Size: 8239 bytes --]
Case closed! (At least for now.)
Although the bug in reserve-locals is real (you can check with the
debugger), the program never actually got far enough for that to affect
anything. Instead, here's what happened:
- reserve-locals worked fine, reserving space for 4 local variables
- bind-rest shrunk the stack back down, leaving enough space for only one
local variable
- the call to toplevel-ref invoked the old (non-RTL) VM, through line 479
of modules.c.
- the old VM put its initial frame on the stack right after the stack
pointer - but since the stack pointer had been decremented by bind-rest,
that overwrote the 4 local variables in the RTL function. In particular, it
overwrote the variable that held the box (fp[1], for the record).
- after the old VM returned, the new VM continued, tried to use the
incorrect fp[1] value, and aborted.
So, mystery solved. Coming soon: patches to fix the bug you hit if you try
to do reserve-locals after bind-rest ...
Noah
On Sat, Apr 6, 2013 at 10:49 AM, Stefan Israelsson Tampe <
stefan.itampe@gmail.com> wrote:
> Yeah, you really found the problem.
>
> I would put a if(nargs) to guard the while just to make it more robust.
>
> /Stefan
>
>
> On Fri, Apr 5, 2013 at 7:29 PM, Noah Lavine <noah.b.lavine@gmail.com>
> wrote:
> > Hello,
> >
> > Just a quick update - it seems to be related to the order of the
> > reserve-locals and bind-rest calls. If I reverse those, the problem goes
> > away. However, I still don't know why this happens, and why the problem
> > doesn't happen when the variables aren't boxed.
> >
> > I think there might be something weird going on in bind-rest when the
> > argument is zero. It has a loop like this:
> >
> > while (nargs-- > dst) { ... }.
> >
> > When dst is zero, doesn't nargs end up getting set to -1? (Which, since
> it's
> > unsigned, is really 2^32 - 1.) That might make any later instructions
> that
> > use nargs (like reserve-locals) do odd things.
> >
> > Noah
> >
> >
> > On Thu, Apr 4, 2013 at 3:44 PM, Noah Lavine <noah.b.lavine@gmail.com>
> wrote:
> >>
> >> Oh, I forgot to mention one important fact. I *do* get the expected
> result
> >> if I eliminate the stuff with boxes. This works fine:
> >>
> >>
> >> scheme@(guile-user)> (assemble-program '((begin-program foo)
> >> (assert-nargs-ge 0)
> >> (reserve-locals 4)
> >> (bind-rest 0)
> >> (cache-current-module! 2 foo)
> >> (cached-toplevel-ref 2 foo car)
> >> (tail-call 1 2)
> >> (end-program)))
> >>
> >> Best,
> >> Noah
> >>
> >> On Thu, Apr 4, 2013 at 3:22 PM, Noah Lavine <noah.b.lavine@gmail.com>
> >> wrote:
> >>>
> >>> Hello,
> >>>
> >>> I'm actually testing on the wip-rtl-cps branch, but this error involves
> >>> code that I believe is the same on that branch and on the wip-rtl
> branch.
> >>> Try opening a new Guile and doing the following:
> >>>
> >>> scheme@(guile-user)> (use-modules (system vm rtl))
> >>> scheme@(guile-user)> (assemble-program '((begin-program foo)
> >>> (assert-nargs-ge 0)
> >>> (reserve-locals 4)
> >>> (bind-rest 0)
> >>> (box 1 0)
> >>> (cache-current-module! 2 foo)
> >>> (cached-toplevel-ref 2 foo car)
> >>> (box-ref 3 1)
> >>> (mov 0 3)
> >>> (tail-call 1 2)
> >>> (end-program)))
> >>> ... ... ... ... ... ... ... ... ... ... $1 = #<rtl-program dcec90
> 609bc0>
> >>> scheme@(guile-user)> ($1 'hello)
> >>>
> >>> The expected result is
> >>> $2 = hello
> >>>
> >>> What I actually get is,
> >>>
> >>> Program received signal SIGABRT, Aborted.
> >>> 0x00007ffff7440425 in raise () from /lib/x86_64-linux-gnu/libc.so.6
> >>>
> >>> The full backtrace is below. The interesting part is that it seems to
> be
> >>> tripping the check at libguile/vm-engine.c:1868, which checks whether
> an
> >>> object is a variable before doing a box-ref on it. When I look at it
> in GDB,
> >>> it seems that whatever is at register 1 does not satisfy
> scm_variable_p,
> >>> although I'm not very experienced with debugging Guile. However, I am
> >>> somewhat surprised at this, because I have used boxes and box-ref
> before in
> >>> the past with no trouble.
> >>>
> >>> Another surprising thing is that if I open Guile, do some other things
> >>> for a while, and then run this code, the problem sometimes doesn't
> appear.
> >>> That is especially disturbing.
> >>>
> >>> Does anyone have any idea where the issue is or how I should find it?
> >>>
> >>> Thanks,
> >>> Noah
> >>>
> >>> Here's the backtrace:
> >>>
> >>> (gdb) bt
> >>> #0 0x00007ffff7440425 in raise ()
> >>> from /lib/x86_64-linux-gnu/libc.so.6
> >>> #1 0x00007ffff7443b8b in abort ()
> >>> from /lib/x86_64-linux-gnu/libc.so.6
> >>> #2 0x00007ffff7b30986 in rtl_vm_debug_engine (vm=0x6a6860,
> >>> program=0xdcec90, argv=0x6a9548, nargs_=1) at vm-engine.c:1868
> >>> #3 0x00007ffff7b1aaf1 in vm_debug_engine (vm=0x6a6860,
> >>> program=0xdcec90, argv=0x7fffffffd028, nargs=1) at vm-engine.c:419
> >>> #4 0x00007ffff7b38f6c in scm_c_vm_run (vm=0x6a6860,
> >>> program=0x75dbe0, argv=0x7fffffffd028, nargs=1) at vm.c:791
> >>> #5 0x00007ffff7a5bff3 in scm_primitive_eval (exp=0x7fe7f0)
> >>> at eval.c:691
> >>> #6 0x00007ffff7a5c0ad in scm_eval (exp=0x7fe7f0,
> >>> module_or_state=0x7e0090) at eval.c:725
> >>> #7 0x00007ffff7acef2d in scm_shell (argc=1, argv=0x7fffffffe478)
> >>> at script.c:441
> >>> #8 0x0000000000400bd0 in inner_main (closure=0x0, argc=1,
> >>> argv=0x7fffffffe478) at guile.c:62
> >>> #9 0x00007ffff7a82663 in invoke_main_func (body_data=0x7fffffffe350)
> >>> at init.c:336
> >>> #10 0x00007ffff7a563c9 in c_body (d=0x7fffffffe220)
> >>> at continuations.c:513
> >>> #11 0x00007ffff7afc96c in apply_catch_closure (clo=0x81b360,
> >>> args=0x304) at throw.c:146
> >>> #12 0x00007ffff7acf739 in apply_1 (smob=0x81b360, a=0x304)
> >>> at smob.c:141
> >>> #13 0x00007ffff7b05cc8 in vm_regular_engine (vm=0x6a6860,
> >>> program=0x7443e0, argv=0x7fffffffe0c0, nargs=2)
> >>> at vm-i-system.c:873
> >>> #14 0x00007ffff7b38f6c in scm_c_vm_run (vm=0x6a6860,
> >>> program=0x79d5d0, argv=0x7fffffffe0c0, nargs=4) at vm.c:791
> >>> #15 0x00007ffff7a5b793 in scm_call_4 (proc=0x79d5d0, arg1=0x404,
> >>> arg2=0x81b360, arg3=0x81b340, arg4=0x81b320) at eval.c:513
> >>> #16 0x00007ffff7afc767 in scm_catch_with_pre_unwind_handler (
> >>> key=0x404, thunk=0x81b360, handler=0x81b340,
> >>> pre_unwind_handler=0x81b320) at throw.c:86
> >>> #17 0x00007ffff7afca44 in scm_c_catch (tag=0x404,
> >>> body=0x7ffff7a563a1 <c_body>, body_data=0x7fffffffe220,
> >>> handler=0x7ffff7a563d8 <c_handler>, handler_data=0x7fffffffe220,
> >>> pre_unwind_handler=0x7ffff7a5642c <pre_unwind_handler>,
> >>> pre_unwind_handler_data=0x751160) at throw.c:213
> >>> #18 0x00007ffff7a5623d in scm_i_with_continuation_barrier (
> >>> body=0x7ffff7a563a1 <c_body>, body_data=0x7fffffffe220,
> >>> handler=0x7ffff7a563d8 <c_handler>, handler_data=0x7fffffffe220,
> >>> pre_unwind_handler=0x7ffff7a5642c <pre_unwind_handler>,
> >>> pre_unwind_handler_data=0x751160) at continuations.c:451
> >>> #19 0x00007ffff7a564c3 in scm_c_with_continuation_barrier (
> >>> func=0x7ffff7a82613 <invoke_main_func>, data=0x7fffffffe350)
> >>> at continuations.c:547
> >>> #20 0x00007ffff7af97ba in with_guile_and_parent (base=0x7fffffffe290,
> >>> base@entry=<error reading variable: value has been optimized out>,
> >>> data=0x7fffffffe2d0,
> >>> data@entry=<error reading variable: value has been optimized out>)
> >>> at threads.c:907
> >>> #21 0x00007ffff71b6f55 in GC_call_with_stack_base (
> >>> fn=<optimized out>, arg=<optimized out>) at misc.c:1553
> >>> #22 0x00007ffff7af9894 in scm_i_with_guile_and_parent (
> >>> func=0x7ffff7a82613 <invoke_main_func>, data=0x7fffffffe350,
> >>> parent=0x0) at threads.c:950
> >>> #23 0x00007ffff7af98c0 in scm_with_guile (
> >>> func=0x7ffff7a82613 <invoke_main_func>, data=0x7fffffffe350)
> >>> at threads.c:956
> >>> #24 0x00007ffff7a825f4 in scm_boot_guile (argc=1,
> >>> argv=0x7fffffffe478, main_func=0x400bac <inner_main>, closure=0x0)
> >>> at init.c:319
> >>> #25 0x0000000000400c35 in main (argc=1, argv=0x7fffffffe478)
> >>> at guile.c:81
> >>>
> >>>
> >>
> >
>
[-- Attachment #2: Type: text/html, Size: 10836 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2013-04-12 17:50 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-04-04 19:22 bug#14141: Abort in RTL VM Noah Lavine
2013-04-04 19:44 ` Noah Lavine
2013-04-05 17:29 ` Noah Lavine
2013-04-06 14:49 ` Stefan Israelsson Tampe
2013-04-12 17:50 ` Noah Lavine
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).