Hi,

Branching now works and a big enough subset of the VM is translatable for
some interesting benchmarks to be done.

So by skipping the goto structure, the win is maybe 3-4x for simple
numerical loops. I expect these loops to gain another factor of 2 when
wip-rtl is translated in the same way. The reason is that the overhead
mainly consists of the instructions that move things to and from the cache,
and rtl seems to decrease the number of such operations. To measure these
numbers I've been incrementing fixnums and walking through some lists of
size 10000.
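
As a rough illustration of where this kind of overhead comes from, here is a
minimal C sketch of a counting loop run two ways. It is not the actual VM or
the generated native code: the opcode names and both functions are
hypothetical, the dispatch uses a plain switch rather than the goto
structure, and the operand stack only loosely stands in for the moves to and
from the cache mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for a toy stack VM; not Guile's real instruction set. */
enum op { OP_PUSH_LOCAL, OP_ADD1, OP_POP_LOCAL, OP_LT_JUMP, OP_HALT };

/* Interpreted version: every step dispatches on an opcode and moves the
   counter between a local slot and the operand stack. */
long count_interpreted(long limit)
{
    static const enum op code[] = {
        OP_PUSH_LOCAL,  /* 0: push the local onto the stack   */
        OP_ADD1,        /* 1: add 1 to the top of the stack   */
        OP_POP_LOCAL,   /* 2: pop the stack into the local    */
        OP_LT_JUMP,     /* 3: if local < limit, jump to 0     */
        OP_HALT         /* 4: return the local                */
    };
    long stack[4];
    size_t sp = 0;      /* number of values on the stack */
    size_t ip = 0;      /* index of the next instruction */
    long local = 0;

    for (;;) {
        switch (code[ip++]) {
        case OP_PUSH_LOCAL:
            assert(sp < 4);
            stack[sp++] = local;
            break;
        case OP_ADD1:
            assert(sp > 0);
            stack[sp - 1] += 1;
            break;
        case OP_POP_LOCAL:
            assert(sp > 0);
            local = stack[--sp];
            break;
        case OP_LT_JUMP:
            if (local < limit)
                ip = 0;
            break;
        case OP_HALT:
            return local;
        }
    }
}

/* Direct version: the same loop with no dispatch and no stack traffic,
   roughly what translating the instruction sequence away aims for. */
long count_direct(long limit)
{
    long local = 0;
    do {
        local += 1;
    } while (local < limit);
    return local;
}
```

Both functions compute the same count for a given limit; timing them against
each other only gives a feel for dispatch and data-movement cost in this toy
setup and says nothing about the real numbers.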

One thing to note about this code is that it piggybacks on the C stack
and does not work with its own.
I bet that is not optimal, but that's what I did, and it should mean that
it's fast to switch to C code from the native-compiled or JIT-compiled code.

Have fun!
/Stefan