From: Andy Wingo <wingo@pobox.com>
To: guile-devel <guile-devel@gnu.org>
Subject: vm status update, a.k.a. yowsers batman it's february already
Date: Mon, 02 Feb 2009 21:28:46 +0100
Message-ID: <m3iqnsfurl.fsf@pobox.com>
Greets greets,
An update from the wilds of the vm branch is overdue. So here we go:
* Opcodes are now numbered statically in the source. This should make
it easier to maintain bytecode compatibility in the future.
* The VM now has two new languages:
- Bytecode, which is just object code as u8vectors; and
- Assembly, which is between bytecode and GLIL.
The differences may be seen thusly (hand-pretty-printed):
    scheme@(guile-user)> (compile '(car '(a . b)) #:to 'glil)
    #<glil (program 0 0 0 0 ()
             (const (a . b))
             (call car 1)
             (call return 0))>

    scheme@(guile-user)> (compile '(car '(a . b)) #:to 'assembly)
    (load-program 0 0 0 0 () 13 #f
      (load-symbol "a")
      (load-symbol "b")
      (cons)
      (car)
      (return))

    scheme@(guile-user)> (compile '(car '(a . b)) #:to 'bytecode)
    #u8(0 0 0 0 13 0 0 0 0 0 0 0 63 0 0 1 97 63 0 0 1 98 90 91 48)

    scheme@(guile-user)> (compile '(car '(a . b)) #:to 'objcode)
    #<objcode b728e450>
* As you can see, the bytecode header is quite long -- it's 12 bytes
before we get to the meat of the program. (That's 4 for arity, 4 for
length, and 4 for meta-length -- more on that in a minute). But this
is OK, because normally this is read-only code mmapped directly from
disk.
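To make the header layout concrete, here is a rough sketch in Python (not Guile code) that splits the sample bytecode above into its 12-byte header and body. It assumes the arity field is four one-byte values, which matches the four zeros in the load-program form but is not spelled out in this message; the names parse_header and HEADER_SIZE are made up for the illustration.

```python
import struct

# The bytecode printed above for (car '(a . b)).
BYTECODE = bytes([0, 0, 0, 0, 13, 0, 0, 0, 0, 0, 0, 0,
                  63, 0, 0, 1, 97, 63, 0, 0, 1, 98, 90, 91, 48])

HEADER_SIZE = 12  # 4 bytes arity + 4 bytes length + 4 bytes meta-length


def parse_header(bc):
    """Return (arity, length, meta_length) from the first 12 bytes."""
    # Assumption: the arity field is four single-byte values.
    arity = tuple(bc[0:4])
    # Both lengths are 32-bit little-endian unsigned integers.
    length, meta_length = struct.unpack_from("<II", bc, 4)
    return arity, length, meta_length


arity, length, meta_length = parse_header(BYTECODE)
# The program text starts right after the header.
body = BYTECODE[HEADER_SIZE:HEADER_SIZE + length]
print(arity, length, meta_length, len(body))
```

For the sample, the length field reads as 13, which matches the 13 in the load-program form, and the meta-length is 0 because this program carries no metadata.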
* Originally, when loading programs with meta-data (such as source
information), we had to load all of that metadata up along with the
program -- symbols, vectors, conses, etc. So then we hid that loading
behind a thunk, so we just had to cons up a thunk -- but still that
was 8 words (4 for the program and 4 for the object code).
So instead now we just stick the meta-thunk after the main program
text, and load it only when objcode-meta (or program-meta) is called.
Voici source information without cost! I stole this trick from the
Self compiler.
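The idea of keeping the metadata after the program text and touching it only on request can be sketched as follows. This is a Python toy, not the VM's implementation: the Objcode class, make_blob, and the meta_loads counter are hypothetical, and decoding a UTF-8 string stands in for running the real meta-thunk.

```python
import struct

HEADER_SIZE = 12  # 4 arity + 4 length + 4 meta-length, as described above


class Objcode:
    """Toy object code: header, program text, then optional metadata."""

    def __init__(self, blob):
        self.blob = blob
        self.length, self.meta_length = struct.unpack_from("<II", blob, 4)
        self._meta = None
        self.meta_loads = 0  # counts how often metadata was actually decoded

    @property
    def text(self):
        # The program text is available without looking at the metadata.
        return self.blob[HEADER_SIZE:HEADER_SIZE + self.length]

    def meta(self):
        """Decode the trailing metadata on first use, then cache it."""
        if self.meta_length == 0:
            return None
        if self._meta is None:
            start = HEADER_SIZE + self.length
            raw = self.blob[start:start + self.meta_length]
            # Stand-in for running the meta-thunk (assumption).
            self._meta = raw.decode("utf-8")
            self.meta_loads += 1
        return self._meta


def make_blob(text, meta):
    """Build a toy blob: zeroed arity, two little-endian lengths, text, meta."""
    return bytes(4) + struct.pack("<II", len(text), len(meta)) + text + meta


oc = Objcode(make_blob(bytes([90, 91, 48]), b"source: demo.scm:1"))
```

Constructing oc reads only the two length fields; nothing is decoded until oc.meta() is called, and repeated calls reuse the cached value.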
* Just as we have a tower of compilers (and thus languages), we now have a
tower of /decompilers/. Currently I've only implemented
value->objcode (only valid for values of type program or objcode),
objcode->bytecode, and bytecode->assembly, but it's possible to
implement passes decompiling all the way back to Scheme.
Or JavaScript! That's the crazy thing: since multiple languages exist
on top of one substrate, decompilers allow us to do language
translation -- what Guile originally wanted to do, but as artifact
rather than as mechanism.
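As an illustration of one decompiler pass and how passes chain, here is a small Python sketch of bytecode->assembly for the sample program. The opcode numbers (63, 90, 91, 48) are inferred by lining up the sample bytecode with the sample assembly, and reading the 3-byte symbol length as big-endian is likewise a guess from that single example. The PASSES table and decompile function are hypothetical names, not Guile's API.

```python
BYTECODE = bytes([0, 0, 0, 0, 13, 0, 0, 0, 0, 0, 0, 0,
                  63, 0, 0, 1, 97, 63, 0, 0, 1, 98, 90, 91, 48])

# Inferred from the sample output only; not the VM's full opcode table.
OPS = {63: "load-symbol", 90: "cons", 91: "car", 48: "return"}


def bytecode_to_assembly(bc):
    """Decode the 12-byte header, then the instructions of the body."""
    arity = list(bc[0:4])
    length = int.from_bytes(bc[4:8], "little")
    body = bc[12:12 + length]
    insns, i = [], 0
    while i < len(body):
        name = OPS[body[i]]
        i += 1
        if name == "load-symbol":
            # Assumption: 3-byte big-endian length, then the symbol's characters.
            n = int.from_bytes(body[i:i + 3], "big")
            i += 3
            insns.append((name, body[i:i + n].decode("ascii")))
            i += n
        else:
            insns.append((name,))
    return ("load-program", *arity, length, insns)


# Each language maps to (next language down the tower, pass that gets there).
PASSES = {"bytecode": ("assembly", bytecode_to_assembly)}


def decompile(value, source, target):
    """Apply passes one after another until the target language is reached."""
    while source != target:
        source, step = PASSES[source]
        value = step(value)
    return value


print(decompile(BYTECODE, "bytecode", "assembly"))
```

Adding a further pass, say assembly to GLIL, would be one more entry in the table; decompile raises KeyError if no pass leads onward from the current language.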
* Because we put the 4-byte lengths in the objcode directly, and mmap
that data, bytecode is now endian-specific. Specifically, it's all
little-endian right now. I know, I know. Worse, it's not aligned. But
provisions are there to make it aligned and native endian.
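A quick way to see why the embedded lengths make the bytecode endian-specific: the same four bytes decode to very different numbers depending on byte order. The sketch below is plain Python, not VM code, and uses the length field from the sample bytecode.

```python
import struct
import sys

# The 4-byte length field from the sample bytecode above.
length_field = bytes([13, 0, 0, 0])

as_little = struct.unpack("<I", length_field)[0]  # how the VM reads it today
as_big = struct.unpack(">I", length_field)[0]     # what a big-endian read would see
as_native = struct.unpack("=I", length_field)[0]  # depends on the host's byte order

print(sys.byteorder, as_little, as_big, as_native)
```

Read as little-endian the field is 13; read as big-endian it is 13 * 2**24, so a loader that took mmapped lengths in native order on a big-endian host would get the wrong program size.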
***
So, what's up?
Well, things are good. Load time is slightly faster, though it could still
be significantly faster. We cons less than the evaluator. Things are
looking good, and improvable.
I have to fix the endianness/alignment bits.
There are two main regressions. One is a simple bug: backtraces aren't
working right unless you have VM code. I think it's a simple problem
with stack cutting; I have to poke at it a bit.
Secondly, GOOPS loads *really slowly*, because of the dynamic
recompilation things that I thought were so clever. I don't know exactly
what to do yet -- profile and see, I guess. I think this is my first
priority right now.
As far as improvements go, there's a laundry list:
* I'm going to try making bytecodes uint32s instead of uint8s. We'll
see what the performance impact is.
* I'm going to see about coalescing object tables into one vector per
compilation unit, e.g. file. This should result in faster startup
time.
* GOOPS needs some love. I think polymorphic inline caches are the way
to go, and might be a first way to test out a native-code generator
(for the cache stubs).
* It would be nice to have syncase in by default, though perhaps we
should leave this for R6RS.
* Decompilers to GLIL, GHIL, and Scheme would be *sweet*.
* I think there's something publishable in all of this language tower
business, but I'd need a convincing second high-level language. I
think JavaScript is the right one. We need to write a compiler to
GHIL, and probably extend the VM slightly.
With luck, I'd like to present this fall at the SFP -- any takers
for help? We could present together :)
Well, that's what's on my mind for now. I'll work on updating the docs
soon; funny to have them bitrot so quickly.
Cheers,
Andy
--
http://wingolog.org/