* vm status update, a.k.a. yowsers batman it's february already
From: Andy Wingo @ 2009-02-02 20:28 UTC
  To: guile-devel

Greets greets,

An update from the wilds of the vm branch is overdue. So here we go:

 * Opcodes are now numbered statically in the source. This should make
   it easier to maintain bytecode compatibility in the future.

 * The VM now has two new languages:

   - Bytecode, which is just object code as u8vectors; and

   - Assembly, which is between bytecode and GLIL.

   The differences may be seen thusly (hand-pretty-printed):

     scheme@(guile-user)> (compile '(car '(a . b)) #:to 'glil)
     #<glil (program 0 0 0 0 ()
                 (const (a . b))
                 (call car 1)
                 (call return 0))>
     scheme@(guile-user)> (compile '(car '(a . b)) #:to 'assembly)
     (load-program 0 0 0 0 () 13 #f
        (load-symbol "a")
        (load-symbol "b")
        (cons)
        (car)
        (return))
     scheme@(guile-user)> (compile '(car '(a . b)) #:to 'bytecode)
     #u8(0 0 0 0 13 0 0 0 0 0 0 0 63 0 0 1 97 63 0 0 1 98 90 91 48)
     scheme@(guile-user)> (compile '(car '(a . b)) #:to 'objcode)
     #<objcode b728e450>

 * As you can see, the bytecode header is quite long -- it's 12 bytes
   before we get to the meat of the program. (That's 4 for arity, 4 for
   length, and 4 for meta-length -- more on that in a minute). But this
   is OK, because normally this is read-only code mmapped directly from
   disk.
 
 * Originally, when loading programs with meta-data (such as source
   information), we had to load all of that metadata up along with the
   program -- symbols, vectors, conses, etc. So then we hid that loading
   behind a thunk, so we just had to cons up a thunk -- but still that
   was 8 words (4 for the program and 4 for the object code).

   So instead now we just stick the meta-thunk after the main program
   text, and load it only when objcode-meta (or program-meta) is called.
   Voici source information without cost! I stole this trick from the
   Self compiler.

 * Just as we have a tower of compilers (and thus languages), we now have a
   tower of /decompilers/. Currently I've only implemented
   value->objcode (only valid for values of type program or objcode),
   objcode->bytecode, and bytecode->assembly, but it's possible to
   implement passes decompiling all the way back to Scheme.

   Or JavaScript! That's the crazy thing: since multiple languages exist
   on top of one substrate, decompilers allow us to do language
   translation -- what Guile originally wanted to do, but as artifact
   rather than as mechanism.

 * Because we put the 4-byte lengths in the objcode directly, and mmap
   that data, bytecode is now endian-specific. Specifically, it's all
   little-endian right now. I know, I know. Worse, it's not aligned. But
   provisions are there to make it aligned and native endian.
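
To make the header layout and the little-endian lengths concrete, here is a
minimal sketch in Python (not Guile code) that picks apart the example
bytecode from the transcript above. The function names are made up for
illustration. The opcode numbers and the 3-byte big-endian symbol length are
inferred only from lining up the assembly and bytecode output, so they are
not the VM's real opcode table. Treating the four arity bytes as one byte per
field, and the meta bytes as starting right after the program text, are
likewise assumptions.

```python
import struct

# Bytes copied from the `#:to 'bytecode` transcript above.
BYTECODE = bytes([0, 0, 0, 0, 13, 0, 0, 0, 0, 0, 0, 0,
                  63, 0, 0, 1, 97, 63, 0, 0, 1, 98, 90, 91, 48])

HEADER_SIZE = 12  # 4 arity bytes + 4-byte length + 4-byte meta-length

# Opcode numbers inferred from the transcripts (an assumption, see above).
LOAD_SYMBOL = 63
SIMPLE_OPS = {90: "cons", 91: "car", 48: "return"}


def parse_header(bv):
    """Split the 12-byte header into (arity, length, meta_length)."""
    arity = tuple(bv[0:4])  # assumed: one byte per arity field
    # "<II" = two unsigned 32-bit integers, little-endian, no alignment.
    length, meta_length = struct.unpack_from("<II", bv, 4)
    return arity, length, meta_length


def meta_bytes(bv):
    """Return the bytes stored after the program text, or None if absent."""
    _, length, meta_length = parse_header(bv)
    if meta_length == 0:
        return None
    start = HEADER_SIZE + length  # assumed: meta follows the text directly
    return bv[start:start + meta_length]


def disassemble(bv):
    """Turn the program text after the header into assembly-like tuples."""
    _, length, _ = parse_header(bv)
    pos, end = HEADER_SIZE, HEADER_SIZE + length
    out = []
    while pos < end:
        op = bv[pos]
        pos += 1
        if op == LOAD_SYMBOL:
            # Inferred operand layout: 3-byte big-endian length, then the name.
            n = int.from_bytes(bv[pos:pos + 3], "big")
            pos += 3
            out.append(("load-symbol", bv[pos:pos + n].decode("ascii")))
            pos += n
        elif op in SIMPLE_OPS:
            out.append((SIMPLE_OPS[op],))
        else:
            raise ValueError(f"unknown opcode {op} at offset {pos - 1}")
    return out


print(parse_header(BYTECODE))
print(disassemble(BYTECODE))
```

For the example program the header decodes to a length of 13 and a
meta-length of 0, which matches the `13 #f` in the load-program form.
Reading the same four length bytes as big-endian would give 218103808
instead, which is why the byte order matters once the data is mmapped as-is.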

***

So, what's up?

Well, things are good. Load time is slightly faster, though it could
still get significantly faster. We cons less than the evaluator. Things
are looking good, and improvable.

I have to fix the endianness/alignment bits.

There are two main regressions. One is a simple bug: backtraces aren't
working right unless you have VM code. I think it's a simple problem
with stack cutting; I have to poke at it a bit.

Secondly, GOOPS loads *really slowly*, because of the dynamic
recompilation things that I thought were so clever. I don't know exactly
what to do yet -- profile and see, I guess. I think this is my first
priority right now.

As far as improvements go, there's a laundry list:

  * I'm going to try bytecodes being uint32's instead of uint8's. We'll
    see what the performance impacts are.

  * I'm going to see about coalescing object tables into one vector per
    compilation unit, e.g. file. This should result in faster startup
    time.

  * GOOPS needs some love. I think polymorphic inline caches are the way
    to go, and might be a first way to test out a native-code generator
    (for the cache stubs).

  * It would be nice to have syncase in by default, though perhaps we
    should leave this for R6RS.

  * Decompilers to GLIL, GHIL, and Scheme would be *sweet*.

  * I think there's something publishable in all of this language tower
    business, but I'd need a convincing second high-level language. I
    think JavaScript is the right one. We need to write a compiler to
    GHIL, and probably extend the VM slightly.

    With luck, I'd like to present this fall at the SFP -- any takers
    for help? We could present together :)

Well, that's what's on my mind for now. I'll work on updating the docs
soon; funny to have them bitrot so quickly.

Cheers,

Andy
-- 
http://wingolog.org/



