unofficial mirror of bug-guile@gnu.org 
* bug#14141: Abort in RTL VM
@ 2013-04-04 19:22 Noah Lavine
  2013-04-04 19:44 ` Noah Lavine
  0 siblings, 1 reply; 5+ messages in thread
From: Noah Lavine @ 2013-04-04 19:22 UTC (permalink / raw)
  To: 14141

[-- Attachment #1: Type: text/plain, Size: 4922 bytes --]

Hello,

I'm actually testing on the wip-rtl-cps branch, but this error involves
code that I believe is the same on that branch and on the wip-rtl branch.
Try opening a new Guile and doing the following:

scheme@(guile-user)> (use-modules (system vm rtl))
scheme@(guile-user)> (assemble-program '((begin-program foo)
 (assert-nargs-ge 0)
 (reserve-locals 4)
 (bind-rest 0)
 (box 1 0)
 (cache-current-module! 2 foo)
 (cached-toplevel-ref 2 foo car)
 (box-ref 3 1)
 (mov 0 3)
 (tail-call 1 2)
 (end-program)))
... ... ... ... ... ... ... ... ... ... $1 = #<rtl-program dcec90 609bc0>
scheme@(guile-user)> ($1 'hello)

The expected result is:
$2 = hello

What I actually get is:

Program received signal SIGABRT, Aborted.
0x00007ffff7440425 in raise () from /lib/x86_64-linux-gnu/libc.so.6

The full backtrace is below. The interesting part is that it seems to be
tripping the check at libguile/vm-engine.c:1868, which checks whether an
object is a variable before doing a box-ref on it. When I look at it in
GDB, it seems that whatever is in register 1 does not satisfy
scm_variable_p, although I'm not very experienced with debugging Guile.
This surprises me, because I have used boxes and box-ref before with no
trouble.
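
For readers unfamiliar with that kind of check, here is a minimal sketch in
C of a guard that aborts when the operand is not a variable. The types and
names (value, is_variable, box_ref, demo_box_ref) are hypothetical and are
not taken from libguile; the real check works on SCM values.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical tagged value; libguile's SCM representation differs. */
typedef enum { TAG_INT, TAG_VARIABLE } tag;

typedef struct value {
  tag t;
  struct value *ref;   /* contents, meaningful only for TAG_VARIABLE */
  long n;              /* payload, meaningful only for TAG_INT */
} value;

/* Stand-in for a predicate such as scm_variable_p. */
int is_variable(const value *v)
{
  return v != NULL && v->t == TAG_VARIABLE;
}

/* Dereference a box, aborting if the operand is not a variable.
   abort() raises SIGABRT, which is the signal seen in the report. */
value *box_ref(value *v)
{
  if (!is_variable(v))
    abort();
  return v->ref;
}

/* Usage: a box around x yields x back. */
void demo_box_ref(void)
{
  value x = { TAG_INT, NULL, 7 };
  value b = { TAG_VARIABLE, &x, 0 };
  assert(box_ref(&b) == &x);
}
```

If the slot that should hold the box has been overwritten with something
else, a guard of this shape takes the abort() path.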

Another surprising thing is that if I open Guile, do some other things for
a while, and then run this code, the problem sometimes doesn't appear. That
is especially disturbing.

Does anyone have any idea where the issue is or how I should find it?

Thanks,
Noah

Here's the backtrace:

(gdb) bt
#0  0x00007ffff7440425 in raise ()
   from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff7443b8b in abort ()
   from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007ffff7b30986 in rtl_vm_debug_engine (vm=0x6a6860,
    program=0xdcec90, argv=0x6a9548, nargs_=1) at vm-engine.c:1868
#3  0x00007ffff7b1aaf1 in vm_debug_engine (vm=0x6a6860,
    program=0xdcec90, argv=0x7fffffffd028, nargs=1) at vm-engine.c:419
#4  0x00007ffff7b38f6c in scm_c_vm_run (vm=0x6a6860,
    program=0x75dbe0, argv=0x7fffffffd028, nargs=1) at vm.c:791
#5  0x00007ffff7a5bff3 in scm_primitive_eval (exp=0x7fe7f0)
    at eval.c:691
#6  0x00007ffff7a5c0ad in scm_eval (exp=0x7fe7f0,
    module_or_state=0x7e0090) at eval.c:725
#7  0x00007ffff7acef2d in scm_shell (argc=1, argv=0x7fffffffe478)
    at script.c:441
#8  0x0000000000400bd0 in inner_main (closure=0x0, argc=1,
    argv=0x7fffffffe478) at guile.c:62
#9  0x00007ffff7a82663 in invoke_main_func (body_data=0x7fffffffe350)
    at init.c:336
#10 0x00007ffff7a563c9 in c_body (d=0x7fffffffe220)
    at continuations.c:513
#11 0x00007ffff7afc96c in apply_catch_closure (clo=0x81b360,
    args=0x304) at throw.c:146
#12 0x00007ffff7acf739 in apply_1 (smob=0x81b360, a=0x304)
    at smob.c:141
#13 0x00007ffff7b05cc8 in vm_regular_engine (vm=0x6a6860,
    program=0x7443e0, argv=0x7fffffffe0c0, nargs=2)
    at vm-i-system.c:873
#14 0x00007ffff7b38f6c in scm_c_vm_run (vm=0x6a6860,
    program=0x79d5d0, argv=0x7fffffffe0c0, nargs=4) at vm.c:791
#15 0x00007ffff7a5b793 in scm_call_4 (proc=0x79d5d0, arg1=0x404,
    arg2=0x81b360, arg3=0x81b340, arg4=0x81b320) at eval.c:513
#16 0x00007ffff7afc767 in scm_catch_with_pre_unwind_handler (
    key=0x404, thunk=0x81b360, handler=0x81b340,
    pre_unwind_handler=0x81b320) at throw.c:86
#17 0x00007ffff7afca44 in scm_c_catch (tag=0x404,
    body=0x7ffff7a563a1 <c_body>, body_data=0x7fffffffe220,
    handler=0x7ffff7a563d8 <c_handler>, handler_data=0x7fffffffe220,
    pre_unwind_handler=0x7ffff7a5642c <pre_unwind_handler>,
    pre_unwind_handler_data=0x751160) at throw.c:213
#18 0x00007ffff7a5623d in scm_i_with_continuation_barrier (
    body=0x7ffff7a563a1 <c_body>, body_data=0x7fffffffe220,
    handler=0x7ffff7a563d8 <c_handler>, handler_data=0x7fffffffe220,
    pre_unwind_handler=0x7ffff7a5642c <pre_unwind_handler>,
    pre_unwind_handler_data=0x751160) at continuations.c:451
#19 0x00007ffff7a564c3 in scm_c_with_continuation_barrier (
    func=0x7ffff7a82613 <invoke_main_func>, data=0x7fffffffe350)
    at continuations.c:547
#20 0x00007ffff7af97ba in with_guile_and_parent (base=0x7fffffffe290,
    base@entry=<error reading variable: value has been optimized out>,
data=0x7fffffffe2d0,
    data@entry=<error reading variable: value has been optimized out>)
    at threads.c:907
#21 0x00007ffff71b6f55 in GC_call_with_stack_base (
    fn=<optimized out>, arg=<optimized out>) at misc.c:1553
#22 0x00007ffff7af9894 in scm_i_with_guile_and_parent (
    func=0x7ffff7a82613 <invoke_main_func>, data=0x7fffffffe350,
    parent=0x0) at threads.c:950
#23 0x00007ffff7af98c0 in scm_with_guile (
    func=0x7ffff7a82613 <invoke_main_func>, data=0x7fffffffe350)
    at threads.c:956
#24 0x00007ffff7a825f4 in scm_boot_guile (argc=1,
    argv=0x7fffffffe478, main_func=0x400bac <inner_main>, closure=0x0)
    at init.c:319
#25 0x0000000000400c35 in main (argc=1, argv=0x7fffffffe478)
    at guile.c:81


* bug#14141: Abort in RTL VM
  2013-04-04 19:22 bug#14141: Abort in RTL VM Noah Lavine
@ 2013-04-04 19:44 ` Noah Lavine
  2013-04-05 17:29   ` Noah Lavine
  0 siblings, 1 reply; 5+ messages in thread
From: Noah Lavine @ 2013-04-04 19:44 UTC (permalink / raw)
  To: 14141

[-- Attachment #1: Type: text/plain, Size: 5585 bytes --]

Oh, I forgot to mention one important fact. I *do* get the expected result
if I remove the box instructions. This works fine:

scheme@(guile-user)> (assemble-program '((begin-program foo)
 (assert-nargs-ge 0)
 (reserve-locals 4)
 (bind-rest 0)
 (cache-current-module! 2 foo)
 (cached-toplevel-ref 2 foo car)
 (tail-call 1 2)
 (end-program)))

Best,
Noah



* bug#14141: Abort in RTL VM
  2013-04-04 19:44 ` Noah Lavine
@ 2013-04-05 17:29   ` Noah Lavine
  2013-04-06 14:49     ` Stefan Israelsson Tampe
  0 siblings, 1 reply; 5+ messages in thread
From: Noah Lavine @ 2013-04-05 17:29 UTC (permalink / raw)
  To: Noah Lavine; +Cc: 14141

[-- Attachment #1: Type: text/plain, Size: 6435 bytes --]

Hello,

Just a quick update: it seems to be related to the order of the
reserve-locals and bind-rest instructions. If I reverse those, the problem
goes away. However, I still don't know why this happens, or why the problem
doesn't happen when the variables aren't boxed.

I think there might be something weird going on in bind-rest when the
argument is zero. It has a loop like this:

while (nargs-- > dst) { ... }.

When dst is zero, doesn't nargs end up getting set to -1? (Which, since
it's unsigned, is really 2^32 - 1.) That might make any later instructions
that use nargs (like reserve-locals) do odd things.
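
A small C sketch of that wraparound, assuming nargs is a 32-bit unsigned
integer as suggested above. The function names are made up for illustration
and the loop body is left empty; only the loop condition is taken from the
snippet quoted in this message.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the quoted loop shape, not Guile's code.
   Returns the value nargs holds once the loop has exited. */
uint32_t nargs_after_loop(uint32_t nargs, uint32_t dst)
{
  while (nargs-- > dst)
    {
      /* the real instruction would collect one argument here */
    }
  /* The post-decrement also runs on the final, failing test, so nargs
     ends up one below the value that failed the comparison. */
  return nargs;
}

/* Usage: with dst == 0 the counter wraps around to 2^32 - 1. */
void demo_wrap(void)
{
  assert(nargs_after_loop(1, 0) == UINT32_MAX);
  assert(nargs_after_loop(3, 1) == 0);
}
```

Unsigned arithmetic in C is defined to wrap modulo 2^N, so this is not
undefined behaviour, just a very large count left behind in nargs.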

Noah




* bug#14141: Abort in RTL VM
  2013-04-05 17:29   ` Noah Lavine
@ 2013-04-06 14:49     ` Stefan Israelsson Tampe
  2013-04-12 17:50       ` Noah Lavine
  0 siblings, 1 reply; 5+ messages in thread
From: Stefan Israelsson Tampe @ 2013-04-06 14:49 UTC (permalink / raw)
  To: Noah Lavine; +Cc: 14141

Yeah, you really found the problem.

I would put an if (nargs) guard around the while loop, just to make it more
robust.
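
As a sketch of one way to keep the counter from wrapping, assuming the loop
shape quoted earlier in the thread: compare first and decrement inside the
body. As far as that quoted shape goes, an if (nargs) test on its own would
still let the final post-decrement run when nargs starts out nonzero and dst
is zero. The function name collect_rest is hypothetical and the body is a
placeholder.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical rewrite of the loop: nargs is only decremented when the
   comparison has already succeeded, so it stops at dst and never wraps. */
uint32_t collect_rest(uint32_t nargs, uint32_t dst)
{
  while (nargs > dst)
    {
      nargs--;
      /* collect argument number nargs into the rest list here */
    }
  return nargs;
}

/* Usage: the counter ends at dst, or stays put if it began below dst. */
void demo_no_wrap(void)
{
  assert(collect_rest(1, 0) == 0);
  assert(collect_rest(3, 1) == 1);
  assert(collect_rest(0, 2) == 0);
}
```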

/Stefan




* bug#14141: Abort in RTL VM
  2013-04-06 14:49     ` Stefan Israelsson Tampe
@ 2013-04-12 17:50       ` Noah Lavine
  0 siblings, 0 replies; 5+ messages in thread
From: Noah Lavine @ 2013-04-12 17:50 UTC (permalink / raw)
  To: Stefan Israelsson Tampe; +Cc: 14141

[-- Attachment #1: Type: text/plain, Size: 8239 bytes --]

Case closed! (At least for now.)

Although the bug in reserve-locals is real (you can check with the
debugger), the program never actually got far enough for that to affect
anything. Instead, here's what happened:

 - reserve-locals worked fine, reserving space for 4 local variables
 - bind-rest shrank the stack back down, leaving enough space for only one
   local variable
 - the call to cached-toplevel-ref invoked the old (non-RTL) VM, through
   line 479 of modules.c
 - the old VM put its initial frame on the stack right after the stack
   pointer, but since the stack pointer had been decremented by bind-rest,
   that overwrote the 4 local variables in the RTL function. In particular,
   it overwrote the variable that held the box (fp[1], for the record)
 - after the old VM returned, the new VM continued, tried to use the
   incorrect fp[1] value, and aborted
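
The sequence above can be modelled with a toy stack in C. This is only a
sketch: toy_vm, reserve_locals, bind_rest, nested_call and run are
hypothetical names, the slot layout is invented, and it models only how the
stack pointer moves, not the nargs issue or Guile's real frame format.

```c
#include <assert.h>
#include <stddef.h>

enum { STACK_SIZE = 16, BOX = 42, JUNK = -1 };

typedef struct {
  int slots[STACK_SIZE];
  size_t fp;   /* index of the current frame's local 0 */
  size_t sp;   /* index of the last slot considered in use */
} toy_vm;

/* Model of reserve-locals: make room for n locals starting at fp. */
void reserve_locals(toy_vm *vm, size_t n)
{
  vm->sp = vm->fp + n - 1;
}

/* Model of bind-rest as described above: afterwards only locals
   0..dst are counted as live, so sp is pulled back down. */
void bind_rest(toy_vm *vm, size_t dst)
{
  vm->sp = vm->fp + dst;
}

/* Model of a nested VM invocation: the callee writes its frame into
   the slots right after sp and then returns. */
void nested_call(toy_vm *vm, size_t frame_size)
{
  size_t i;
  assert(vm->sp + frame_size < STACK_SIZE);
  for (i = 1; i <= frame_size; i++)
    vm->slots[vm->sp + i] = JUNK;
}

/* Run the instruction sequence in either order and return what is
   left in local 1 (the slot that should hold the box). */
int run(int reserve_first)
{
  toy_vm vm = { {0}, 2, 2 };
  if (reserve_first)
    {
      reserve_locals(&vm, 4);
      bind_rest(&vm, 0);
    }
  else
    {
      bind_rest(&vm, 0);
      reserve_locals(&vm, 4);
    }
  vm.slots[vm.fp + 1] = BOX;   /* stands in for (box 1 0) */
  nested_call(&vm, 3);         /* the lookup that re-enters the VM */
  return vm.slots[vm.fp + 1];  /* what box-ref would then read */
}
```

In this model run(1), which uses the order from the bug report, loses the
box, while run(0), with the two instructions swapped, keeps it.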

So, mystery solved. Coming soon: patches to fix the bug you hit if you try
to do reserve-locals after bind-rest ...

Noah




Thread overview: 5+ messages
2013-04-04 19:22 bug#14141: Abort in RTL VM Noah Lavine
2013-04-04 19:44 ` Noah Lavine
2013-04-05 17:29   ` Noah Lavine
2013-04-06 14:49     ` Stefan Israelsson Tampe
2013-04-12 17:50       ` Noah Lavine
