From: Ken Raeburn <raeburn@raeburn.org>
To: Andy Wingo <wingo@pobox.com>
Cc: guile-devel <guile-devel@gnu.org>
Subject: Re: a plan for native compilation
Date: Sat, 17 Apr 2010 22:19:36 -0400
Message-ID: <156F3B77-A4B9-4F82-9C01-7D2E49115B89@raeburn.org>
In-Reply-To: <m3zl13zm9r.fsf@pobox.com>
Good stuff, Andy!
On Apr 16, 2010, at 07:09, Andy Wingo wrote:
> Currently, Guile has a compiler to a custom virtual machine, and the
> associated toolchain: assemblers and disassemblers, stack walkers, the
> debugger, etc. One can get the source location of a particular
> instruction pointer, for example.
These are great... but if they're run-time features of Guile, they're useless when examining a core file.
It would be awesome if GDB could display this information when debugging a process, *and* when looking at a core file. (For both JIT and AOT compilation, of course.) GDB doesn't currently know about Scheme and Guile, so obviously some work would need to be done on that side. It does have some rather clunky-looking hooks for providing debug info associated with JIT compilation, which I think I mentioned in IRC a while back. Maybe the GDB developers could be persuaded to support a more direct way of supplying debug info than the current mechanism, such as a pointer to DWARF data. Teaching GDB about Scheme and Guile specifically would take cooperation from both groups.
Obviously, when looking at a core file, no helper code from the library can be executed. Perhaps less obviously, with a live process, when doing simple things like looking at symbol values, you probably don't want to execute library code if it means enabling other threads to resume executing for a while as well.
GDB 7 supports supplying Python code to pretty-print selected object types, or defining new commands. We could supply Python code for looking at SCM objects, maybe even walking the stack, if that turns out to be practical with the interfaces GDB supplies.
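To give a taste of what such a script would do: the tag arithmetic for decoding an SCM word can be prototyped outside GDB. This is a minimal sketch, assuming Guile's immediate-fixnum encoding (value shifted left two bits, low bits 0b10; check libguile/tags.h for the build in question). A real pretty-printer would operate on gdb.Value objects and dispatch on heap-object tags as well.

```python
def decode_scm(word):
    """Decode one SCM word the way a GDB pretty-printer helper might.

    Assumes Guile's fixnum encoding: SCM_I_MAKINUM(n) == (n << 2) | 2,
    so a word whose low two bits are 0b10 is an immediate integer.
    Everything else is left undecoded in this sketch.
    """
    if word & 0b11 == 0b10:
        return ("fixnum", word >> 2)
    # heap objects, characters, booleans, etc. would be dispatched
    # on their own tag bits here
    return ("unknown", word)

print(decode_scm((42 << 2) | 2))
```

Registering something like this as a GDB pretty-printer is what would make `print` on an SCM show "42" instead of a raw word, in a live process or a core file alike.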
> So, my thought is to extend procedures with an additional pointer, a
> pointer to a "native" code structure. The native code could be written
> out ahead-of-time, or compiled at runtime. But procedures would still
> have bytecode, for various purposes, for example to enable code coverage
> via the next-instruction hook, and in the JIT case, because only some
> procedures will be native-compiled.
I wondered about this, when looking briefly at what JIT compilation would need to generate given certain byte codes. Would you generate code based on the debug or non-debug versions of the instructions? What would the choice depend on? Can both bytecode evaluators be used in one process and with the same bytecode object?
What about when profiling for performance?
> We keep the same stack representation, so stack walkers and the debugger
> still work. Some local variables can be allocated into registers, but
> procedure args are still passed and returned on the stack. Though the
> procedure's arity and other metadata would be the same, the local
> variable allocations and source locations would differ, so we would need
> some additional debugger support, but we can work on that when the time
> comes.
The "call" sequence would have to work a little differently than it does now, I think. As you describe:
> All Scheme procedures, bytecode and native, will run inside the VM. If a
> bytecode procedure calls a native procedure, the machine registers are
> saved, and some machine-specific stub transfers control to the native
> code. Native code calling native code uses the same stack as the VM,
> though it has its own conventions over what registers to save; and
> native code calling bytecode prepares the Scheme stack, then restores
> the VM-saved machine registers.
Does the native code figure out if it's jumping to byte code or machine code, or does it use some transfer stub?
> AIUI the hotspot compiler actually does an SSA transformation of Java
> bytecode, then works on that. I'm not particularly interested in
> something like that; I'm more interested in something direct and fast,
> and obviously correct and understandable by our debugging
> infrastructure.
Though as you say, we can experiment later with additional changes. If there's some heavily-used dynamically-generated code, it may be worth the extra effort, but we can find that out after we've got something working.
> Anyway, just some thoughts here. I'm not going to focus on native
> compilation in the coming months, as there are other things to do, but
> this is how I think it should be done :-)
Some random thoughts of my own:
Several possible options for AOT compilation (e.g., generating C or assembly and using native tools) would involve generating native object files. It seems tempting to me to see how much we could reuse the native C/C++/Fortran/etc. approach, or do something parallel to it:
* Debug info in native representations, handled by GDB and other debuggers. Okay, this is hard if we don't go via C code as an intermediate language, and probably even if we do. But we can probably at least map PC address ranges to function names and line numbers, stuff like that. Maybe we could do the more advanced stuff one format at a time, starting with DWARF.
* Code and read-only data sections shared across processes; read-write data mapped in copy-on-write.
* Loading Guile modules via dlopen or system runtime linker means they'd be visible to debuggers.
* With some special compile-time hooks, perhaps FFI symbol references could turn into (weak?) direct symbol references, processed with native relocation handling, etc.
* Linking multiple object files together into a single "library" object that can be loaded at once; possibly with cross-file optimization.
* Even for JIT compilation, but especially for AOT compilation, optimizations should only be enabled with careful consideration of concurrent execution. E.g., if "(while (not done) ....)" is supposed to work with a second thread altering "done", you may not be able to coalesce multiple reads of a variable, even when you can prove that the current thread doesn't alter its value in between.
** Be especially careful if you want Guile to be able to create a limited sandbox in which to run untrusted code. Assume that the provider of the code will deliberately forgo mutexes and exploit race conditions, FFI pointer handling, and any other opportunity for data corruption, in order to break out of the sandbox.
* Link compiled C and Scheme parts of a package together into a single shared library object, instead of the code in one language needing to know where the object for the other language is (awkward if you're trying to relocate the whole bundle via LD_LIBRARY_PATH) and explicitly load it. (Perhaps a library initialization function could call a Guile library function to say, "if you need module (foo bar baz), it's mapped in at this address and is this big, and this much is read-only", or "here's a pointer to the struct Foo describing it, including pointers to various components". Or we could generate C symbols reflecting module names and make the library explicitly make them known to the Guile library.) If nothing else, the current .go file could be turned into a large character array....
* Can anything remotely reasonable happen when C++ code calls Scheme code which calls C++ code ... with stack-unwinding cleanup code specified in both languages, and an exception is raised? Can the cleanup code be invoked in both languages? (This applies to the bytecode interpreter as well, but the mechanism for compiled code would have to be different, as I believe C++/Ada/etc EH support typically maps PC address to handler info; I don't know how Java is handled under JIT compilation.)
* Did I mention how cool it would be to have GDB support? :-)
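The read-coalescing caveat above can be made concrete. The sketch below (Python, where the GIL happens to keep the naive loop correct) shows the behavior a compiled loop must preserve: the flag is re-read on every iteration, so a second thread's store is eventually observed. Hoisting the read out of the loop, which is legal under single-threaded assumptions, would spin forever. The names here are purely illustrative.

```python
import threading
import time

done = False

def spin_until_done():
    # The compiled form of (while (not done) ...) must re-load `done`
    # each time around the loop; caching the first read would mean
    # never terminating once another thread sets the flag.
    while not done:
        time.sleep(0.001)

t = threading.Thread(target=spin_until_done)
t.start()
time.sleep(0.01)
done = True            # a second thread alters "done"
t.join(timeout=5.0)
print("worker finished:", not t.is_alive())
```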
Looking forward to Emacs work:
Tom Tromey recently pointed out some JIT compilation work done on Emacs byte code back in 2004, with the conclusion that while some improvement is possible, the time spent in existing primitives dominates the execution time. Playing devil's advocate for a minute: Why do you think we can do better? Or was this modest improvement -- maybe a bit more for AOT compilation -- all you were expecting when you said we could run elisp faster than Emacs?
I'm hoping that AOT compilation will speed up the initial Lisp loading disproportionately though. A lot of it is just loading function definitions, executing small blobs of Lisp code (like, create this keymap, then fill in this entry, then fill in that one, then assign it to this variable; or, add this property to this symbol with this value) and -- I *think* -- not relying too heavily on the built-in subrs that we can't speed up, and not doing any display updates, stuff like that. But I'm still concerned about doing it at startup time rather than using the "unexec" mechanism Emacs currently uses to pre-initialize all the C and Lisp stuff and dump out an image that can be launched more quickly.
On my reasonably fast Mac desktop, Emacs takes about 3s to launch and load my .emacs file. During the build, pre-loading the Lisp code takes about another 3s, which would get added to the startup time without unexec. If loading native compiled files (or .go files on platforms where we don't have native compilation yet) isn't so amazingly fast as to cut that down to 2-3s, do you have any ideas how we might be able to load and save an initialized Lisp environment?
One thing that might speed up the loading of .go files is making them more compact; there seems to be a lot of string duplication in the current format. (Try running "strings module/ice-9/boot-9.go | sort | uniq -c | sort -n" to get a list of strings and the numbers of times they appear, sorted by count.)
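To put a rough number on what that duplication costs, one can total the bytes a shared string table would save. A hypothetical sketch, operating on a list of strings such as that pipeline extracts:

```python
from collections import Counter

def string_table_savings(strings):
    """Bytes saved by storing each distinct string once in a shared
    table and referring to it by index, versus storing every copy."""
    counts = Counter(strings)
    as_stored = sum(len(s) * n for s, n in counts.items())
    deduplicated = sum(len(s) for s in counts)
    return as_stored - deduplicated

# three copies of "define" collapse to one, saving 12 bytes
print(string_table_savings(["define", "define", "define", "lambda"]))
```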
I'm also pondering loading different Lisp files in two or three threads in parallel, when dependencies allow, but any manipulation of global variables has to be handled carefully, as do any load-time errors. (One thread blocks reading, while another executes already-loaded code... maybe more, to keep multiple cores busy at once.)
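A wave-based scheduler is one way to sketch that: load every file whose dependencies are already satisfied, in parallel, then repeat. The file names and dependency graph below are made up for illustration, and real load-time errors and global-variable ordering would still need the careful handling mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical dependency graph: file -> files that must load first.
DEPS = {
    "subr.el": [],
    "simple.el": ["subr.el"],
    "files.el": ["subr.el"],
    "startup.el": ["simple.el", "files.el"],
}

def parallel_load(deps, load):
    """Call load() once per file, respecting deps; files in the same
    wave have no ordering constraints and are loaded concurrently."""
    done = set()
    pending = dict(deps)
    with ThreadPoolExecutor() as pool:
        while pending:
            ready = [f for f, d in pending.items() if set(d) <= done]
            if not ready:
                raise ValueError("dependency cycle among: %s" % sorted(pending))
            list(pool.map(load, ready))   # one wave, in parallel
            done.update(ready)
            for f in ready:
                del pending[f]

order = []
parallel_load(DEPS, order.append)
print(order[0], order[-1])   # subr.el first, startup.el last
```

The middle wave (simple.el and files.el) can complete in either order; that indeterminacy is exactly why global-variable manipulation during load would need care.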
... Sorry, that's a lot of tangents to be going off onto. :-)
Ken
Thread overview: 11+ messages
2010-04-16 11:09 a plan for native compilation Andy Wingo
2010-04-16 20:47 ` No Itisnt
2010-04-17 10:21 ` Andy Wingo
2010-04-17 21:20 ` Ludovic Courtès
2010-04-16 23:15 ` Ludovic Courtès
2010-04-17 11:19 ` Andy Wingo
2010-04-18 2:19 ` Ken Raeburn [this message]
2010-04-18 11:41 ` Andy Wingo
2010-04-21 17:02 ` Ken Raeburn
2010-04-22 11:28 ` Andy Wingo
2010-04-18 20:40 ` Ludovic Courtès