unofficial mirror of guile-devel@gnu.org 
* thoughts on native code
@ 2012-11-10 14:41 Stefan Israelsson Tampe
  2012-11-10 22:06 ` Stefan Israelsson Tampe
  0 siblings, 1 reply; 12+ messages in thread
From: Stefan Israelsson Tampe @ 2012-11-10 14:41 UTC (permalink / raw)
  To: guile-devel

[-- Attachment #1: Type: text/plain, Size: 2523 bytes --]

Hi all,

After talking with Mark Weaver about his view on native code, I have been
pondering how best to model our needs.

I now have a framework that translates almost all of the RTL VM directly
to native code, and it shows a speed increase of about 4x compared to
running the RTL VM. I can also generate RTL code all the way from Guile
Scheme right now, so it's pretty easy to generate test cases. The problem
that Mark points out is that we need to take care not to blow the
instruction cache. This is not visible in these simple examples, but we
need larger code bases to test what actually happens. What we can note,
though, is that I expect the size of the code to blow up by a factor of
around 10 compared to the instruction feed in the RTL code.

One interesting fact is that SBCL does fairly well by basically using
native instructions as the instruction flow of its VM. For example, if it
can deduce that a + operation works on fixnums, it simply compiles that as
a function call to a general + routine, i.e. it will do a long jump to the
+ routine, do the plus, and long-jump back, essentially dispatching general
instructions like +, *, /, etc. directly. In other words, SBCL does have a
virtual machine; it just doesn't do a table lookup for the dispatch, but
uses function calls instead. If you count long jumps, this means the number
of jumps for these instructions is double that of the original table-lookup
method. But for calling functions and returning from functions the number
of long jumps is the same, and moving local variables into place and
jumping is really fast.

Anyway, this method of dispatching would mean a fairly small footprint
compared to direct assembler. Another big chunk of code that we can speed
up without too much bloat in the instruction cache is the lookup of pairs,
structs and arrays; the reason is that in many cases we can deduce so much
at compilation time that we do not need to check the type every time, but
can safely look up the needed information.

Now, is this method fast? Well, looking at the SBCL code for calculating
1 + 2 + 3 + 4 (disassembling it), I see that it does use the mechanism
above, and it manages to sum 150M terms in one second; that's quite a feat
for a VM with no JIT. The same sum with the RTL VM is 65M.

Now, SBCL's compiler is quite mature and uses native registers quite well,
which is one reason for its speed. My point, though, is that we can
efficiently model a VM with calls, using native instructions as the
instruction flow.

Regards Stefan

[-- Attachment #2: Type: text/html, Size: 2608 bytes --]


end of thread, other threads:[~2012-11-15 22:44 UTC | newest]

Thread overview: 12+ messages
2012-11-10 14:41 thoughts on native code Stefan Israelsson Tampe
2012-11-10 22:06 ` Stefan Israelsson Tampe
2012-11-10 22:49   ` Noah Lavine
2012-11-12 21:50     ` Stefan Israelsson Tampe
2012-11-15 10:19       ` Sjoerd van Leent Privé
2012-11-15 16:04         ` Stefan Israelsson Tampe
2012-11-15 16:13         ` Stefan Israelsson Tampe
2012-11-15 17:50         ` Mark H Weaver
2012-11-15 18:03           ` Stefan Israelsson Tampe
2012-11-15 20:30             ` Ludovic Courtès
2012-11-15 22:44         ` Andreas Rottmann
2012-11-11 19:28   ` Stefan Israelsson Tampe
