* bug#14141: Abort in RTL VM
From: Noah Lavine @ 2013-04-04 19:22 UTC
To: 14141

Hello,

I'm actually testing on the wip-rtl-cps branch, but this error involves
code that I believe is the same on that branch and on the wip-rtl branch.
Try opening a new Guile and doing the following:

scheme@(guile-user)> (use-modules (system vm rtl))
scheme@(guile-user)> (assemble-program '((begin-program foo)
                          (assert-nargs-ge 0)
                          (reserve-locals 4)
                          (bind-rest 0)
                          (box 1 0)
                          (cache-current-module! 2 foo)
                          (cached-toplevel-ref 2 foo car)
                          (box-ref 3 1)
                          (mov 0 3)
                          (tail-call 1 2)
                          (end-program)))
... ... ... ... ... ... ... ... ... ... $1 = #<rtl-program dcec90 609bc0>
scheme@(guile-user)> ($1 'hello)

The expected result is
$2 = hello

What I actually get is,

Program received signal SIGABRT, Aborted.
0x00007ffff7440425 in raise () from /lib/x86_64-linux-gnu/libc.so.6

The full backtrace is below. The interesting part is that it seems to be
tripping the check at libguile/vm-engine.c:1868, which checks whether an
object is a variable before doing a box-ref on it. When I look at it in
GDB, it seems that whatever is at register 1 does not satisfy
scm_variable_p, although I'm not very experienced with debugging Guile.
However, I am somewhat surprised at this, because I have used boxes and
box-ref before in the past with no trouble.

Another surprising thing is that if I open Guile, do some other things for
a while, and then run this code, the problem sometimes doesn't appear. That
is especially disturbing.

Does anyone have any idea where the issue is or how I should find it?

Thanks,
Noah

Here's the backtrace:

(gdb) bt
#0  0x00007ffff7440425 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff7443b8b in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007ffff7b30986 in rtl_vm_debug_engine (vm=0x6a6860, program=0xdcec90, argv=0x6a9548, nargs_=1) at vm-engine.c:1868
#3  0x00007ffff7b1aaf1 in vm_debug_engine (vm=0x6a6860, program=0xdcec90, argv=0x7fffffffd028, nargs=1) at vm-engine.c:419
#4  0x00007ffff7b38f6c in scm_c_vm_run (vm=0x6a6860, program=0x75dbe0, argv=0x7fffffffd028, nargs=1) at vm.c:791
#5  0x00007ffff7a5bff3 in scm_primitive_eval (exp=0x7fe7f0) at eval.c:691
#6  0x00007ffff7a5c0ad in scm_eval (exp=0x7fe7f0, module_or_state=0x7e0090) at eval.c:725
#7  0x00007ffff7acef2d in scm_shell (argc=1, argv=0x7fffffffe478) at script.c:441
#8  0x0000000000400bd0 in inner_main (closure=0x0, argc=1, argv=0x7fffffffe478) at guile.c:62
#9  0x00007ffff7a82663 in invoke_main_func (body_data=0x7fffffffe350) at init.c:336
#10 0x00007ffff7a563c9 in c_body (d=0x7fffffffe220) at continuations.c:513
#11 0x00007ffff7afc96c in apply_catch_closure (clo=0x81b360, args=0x304) at throw.c:146
#12 0x00007ffff7acf739 in apply_1 (smob=0x81b360, a=0x304) at smob.c:141
#13 0x00007ffff7b05cc8 in vm_regular_engine (vm=0x6a6860, program=0x7443e0, argv=0x7fffffffe0c0, nargs=2) at vm-i-system.c:873
#14 0x00007ffff7b38f6c in scm_c_vm_run (vm=0x6a6860, program=0x79d5d0, argv=0x7fffffffe0c0, nargs=4) at vm.c:791
#15 0x00007ffff7a5b793 in scm_call_4 (proc=0x79d5d0, arg1=0x404, arg2=0x81b360, arg3=0x81b340, arg4=0x81b320) at eval.c:513
#16 0x00007ffff7afc767 in scm_catch_with_pre_unwind_handler (key=0x404, thunk=0x81b360, handler=0x81b340, pre_unwind_handler=0x81b320) at throw.c:86
#17 0x00007ffff7afca44 in scm_c_catch (tag=0x404, body=0x7ffff7a563a1 <c_body>, body_data=0x7fffffffe220, handler=0x7ffff7a563d8 <c_handler>, handler_data=0x7fffffffe220, pre_unwind_handler=0x7ffff7a5642c <pre_unwind_handler>, pre_unwind_handler_data=0x751160) at throw.c:213
#18 0x00007ffff7a5623d in scm_i_with_continuation_barrier (body=0x7ffff7a563a1 <c_body>, body_data=0x7fffffffe220, handler=0x7ffff7a563d8 <c_handler>, handler_data=0x7fffffffe220, pre_unwind_handler=0x7ffff7a5642c <pre_unwind_handler>, pre_unwind_handler_data=0x751160) at continuations.c:451
#19 0x00007ffff7a564c3 in scm_c_with_continuation_barrier (func=0x7ffff7a82613 <invoke_main_func>, data=0x7fffffffe350) at continuations.c:547
#20 0x00007ffff7af97ba in with_guile_and_parent (base=0x7fffffffe290, base@entry=<error reading variable: value has been optimized out>, data=0x7fffffffe2d0, data@entry=<error reading variable: value has been optimized out>) at threads.c:907
#21 0x00007ffff71b6f55 in GC_call_with_stack_base (fn=<optimized out>, arg=<optimized out>) at misc.c:1553
#22 0x00007ffff7af9894 in scm_i_with_guile_and_parent (func=0x7ffff7a82613 <invoke_main_func>, data=0x7fffffffe350, parent=0x0) at threads.c:950
#23 0x00007ffff7af98c0 in scm_with_guile (func=0x7ffff7a82613 <invoke_main_func>, data=0x7fffffffe350) at threads.c:956
#24 0x00007ffff7a825f4 in scm_boot_guile (argc=1, argv=0x7fffffffe478, main_func=0x400bac <inner_main>, closure=0x0) at init.c:319
#25 0x0000000000400c35 in main (argc=1, argv=0x7fffffffe478) at guile.c:81

* bug#14141: Abort in RTL VM
From: Noah Lavine @ 2013-04-04 19:44 UTC
To: 14141

Oh, I forgot to mention one important fact. I *do* get the expected result
if I eliminate the stuff with boxes. This works fine:

scheme@(guile-user)> (assemble-program '((begin-program foo)
                          (assert-nargs-ge 0)
                          (reserve-locals 4)
                          (bind-rest 0)
                          (cache-current-module! 2 foo)
                          (cached-toplevel-ref 2 foo car)
                          (tail-call 1 2)
                          (end-program)))

Best,
Noah

* bug#14141: Abort in RTL VM
From: Noah Lavine @ 2013-04-05 17:29 UTC
To: Noah Lavine; +Cc: 14141

Hello,

Just a quick update - it seems to be related to the order of the
reserve-locals and bind-rest calls. If I reverse those, the problem goes
away. However, I still don't know why this happens, and why the problem
doesn't happen when the variables aren't boxed.

I think there might be something weird going on in bind-rest when the
argument is zero. It has a loop like this:

  while (nargs-- > dst) { ... }.

When dst is zero, doesn't nargs end up getting set to -1? (Which, since it's
unsigned, is really 2^32 - 1.) That might make any later instructions that
use nargs (like reserve-locals) do odd things.

Noah

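The wraparound suspected in the message above can be reproduced in isolation. The following is a minimal sketch, assuming nargs is a 32-bit unsigned counter as the message suggests; `nargs_after_loop` and `demo_wraparound` are hypothetical names, and the loop body is elided because only the loop condition matters here. It is not the code from vm-engine.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the loop shape quoted in the message.
   Returns the value left in nargs once the loop has exited.  */
uint32_t
nargs_after_loop (uint32_t nargs, uint32_t dst)
{
  while (nargs-- > dst)
    {
      /* body elided: the real instruction would collect an argument here */
    }
  return nargs;
}

/* The post-decrement also runs on the final, failing comparison, so the
   counter always ends one below dst.  With dst == 0 that wraps around.  */
void
demo_wraparound (void)
{
  assert (nargs_after_loop (3, 1) == 0);           /* ends at dst - 1 */
  assert (nargs_after_loop (1, 0) == UINT32_MAX);  /* 0 - 1 wraps */
  assert (nargs_after_loop (0, 0) == UINT32_MAX);  /* wraps with no arguments too */
}
```

Unsigned arithmetic in C is defined to wrap modulo 2^32 for a 32-bit type, so the result is well defined, just far larger than any real argument count.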
* bug#14141: Abort in RTL VM
From: Stefan Israelsson Tampe @ 2013-04-06 14:49 UTC
To: Noah Lavine; +Cc: 14141

Yeah, you really found the problem.

I would put an if (nargs) to guard the while, just to make it more robust.

/Stefan

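As a rough illustration of the suggested guard, the sketch below compares two hypothetical variants of the loop. Both function names are invented for this example, and the first is only one possible reading of "an if (nargs) to guard the while"; neither is taken from a Guile patch. Under that reading, the guard covers the case of zero arguments, but a call with at least one argument and dst == 0 still leaves the counter wrapped, whereas decrementing inside the body leaves it at dst.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed reading of the suggestion: skip the loop when nargs is zero.  */
uint32_t
nargs_with_guard (uint32_t nargs, uint32_t dst)
{
  if (nargs)
    while (nargs-- > dst)
      {
        /* body elided */
      }
  return nargs;
}

/* Alternative sketch: test first, decrement only inside the body.  The
   body still sees the same nargs values as in the original loop.  */
uint32_t
nargs_decrement_in_body (uint32_t nargs, uint32_t dst)
{
  while (nargs > dst)
    {
      nargs--;
      /* body elided */
    }
  return nargs;
}

void
demo_guard (void)
{
  assert (nargs_with_guard (0, 0) == 0);            /* guarded case */
  assert (nargs_with_guard (1, 0) == UINT32_MAX);   /* still wraps */
  assert (nargs_decrement_in_body (1, 0) == 0);     /* stops at dst */
  assert (nargs_decrement_in_body (3, 1) == 1);
}
```

Whether the final value of nargs matters depends on what the following instructions read, which this sketch does not model.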
* bug#14141: Abort in RTL VM
From: Noah Lavine @ 2013-04-12 17:50 UTC
To: Stefan Israelsson Tampe; +Cc: 14141

Case closed! (At least for now.)

Although the bug in reserve-locals is real (you can check with the
debugger), the program never actually got far enough for that to affect
anything. Instead, here's what happened:

- reserve-locals worked fine, reserving space for 4 local variables
- bind-rest shrank the stack back down, leaving enough space for only one
  local variable
- the call to toplevel-ref invoked the old (non-RTL) VM, through line 479
  of modules.c.
- the old VM put its initial frame on the stack right after the stack
  pointer
- but since the stack pointer had been decremented by bind-rest, that
  overwrote the 4 local variables in the RTL function. In particular, it
  overwrote the variable that held the box (fp[1], for the record).
- after the old VM returned, the new VM continued, tried to use the
  incorrect fp[1] value, and aborted.

So, mystery solved. Coming soon: patches to fix the bug you hit if you try
to do reserve-locals after bind-rest ...

Noah

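The sequence of events in the final message can be sketched with a toy stack model. Everything below is hypothetical: `struct toy_vm`, the slot values, and the three operations are simplified stand-ins named after the instructions in the thread, not Guile's actual frame layout or VM code. The model only assumes what the message states: bind-rest leaves the stack pointer covering a single local, and a nested call writes its frame immediately after the stack pointer.

```c
#include <assert.h>
#include <stddef.h>

enum { STACK_SLOTS = 16, BOX = 0xB0, JUNK = 0xDEAD };

struct toy_vm
{
  int stack[STACK_SLOTS];
  size_t fp;   /* index of local 0 of the current frame */
  size_t sp;   /* index of the last slot considered live */
};

/* reserve-locals n: make slots fp[0] .. fp[n-1] live.  */
void
reserve_locals (struct toy_vm *vm, size_t n)
{
  assert (n > 0 && vm->fp + n <= STACK_SLOTS);
  vm->sp = vm->fp + n - 1;
}

/* bind-rest dst: afterwards only slots up to fp[dst] are live.  */
void
bind_rest (struct toy_vm *vm, size_t dst)
{
  vm->sp = vm->fp + dst;
}

/* A nested call writes its frame into the slots right after sp.  */
void
nested_call (struct toy_vm *vm, size_t frame_size)
{
  assert (vm->sp + frame_size < STACK_SLOTS);
  for (size_t i = 1; i <= frame_size; i++)
    vm->stack[vm->sp + i] = JUNK;
}

/* Runs the prologue in either order, stores a box in fp[1], makes a
   nested call, and returns what fp[1] holds afterwards.  */
int
fp1_after_call (int reserve_first)
{
  struct toy_vm vm = { { 0 }, 2, 2 };

  if (reserve_first)
    {
      reserve_locals (&vm, 4);   /* sp = fp + 3 */
      bind_rest (&vm, 0);        /* sp = fp + 0: fp[1..3] no longer covered */
    }
  else
    {
      bind_rest (&vm, 0);
      reserve_locals (&vm, 4);   /* sp = fp + 3 stays in effect */
    }

  vm.stack[vm.fp + 1] = BOX;     /* box 1 0 */
  nested_call (&vm, 3);          /* stand-in for the toplevel-ref call */
  return vm.stack[vm.fp + 1];
}
```

In this model `fp1_after_call (1)` returns `JUNK`, matching the overwritten fp[1] described above, while `fp1_after_call (0)` returns `BOX`, consistent with the earlier observation that reversing reserve-locals and bind-rest makes the abort go away.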
Thread overview: 5+ messages
2013-04-04 19:22 bug#14141: Abort in RTL VM Noah Lavine
2013-04-04 19:44 ` Noah Lavine
2013-04-05 17:29   ` Noah Lavine
2013-04-06 14:49     ` Stefan Israelsson Tampe
2013-04-12 17:50       ` Noah Lavine