From: Ken Raeburn
Subject: Re: a plan for native compilation
Date: Wed, 21 Apr 2010 13:02:37 -0400
To: Andy Wingo
Cc: guile-devel

On Apr 18, 2010, at 07:41, Andy Wingo wrote:
> Specifically, we should make it so that there is nothing you would want
> to go to a core file for. Compiling Scheme code to native code should
> never produce code that segfaults at runtime. All errors would still be
> handled by the catch/throw mechanism.

Including a segfault in compiled Scheme code, caused by an
application-supplied C procedure returning something that looks like one
of the pointer-using SCM objects but is in reality just garbage?  There
*will* be core files.

>> * Debug info in native representations, handled by GDB and other
>> debuggers. Okay, this is hard if we don't go via C code as an
>> intermediate language, and probably even if we do. But we can probably
>> at least map PC address ranges to function names and line numbers,
>> stuff like that. Maybe we could do the more advanced stuff one format
>> at a time, starting with DWARF.
>
> We should be able to do this already; given that we map bytecode address
> ranges to line numbers, and the function is on the stack still, you
> can query it for whatever you like. Adding a map when generating native
> code should be easy.

I think for best results with GDB and other debuggers, it should be
converted into whatever the native format is, DWARF or otherwise.
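For what it's worth, the bytecode-level map is already visible from
Scheme.  Roughly, as a minimal sketch -- the (system vm program)
procedure and accessor names here are from memory, so treat them as an
assumption:

  (use-modules (system vm program))

  ;; Print the bytecode-address -> source mapping recorded for a
  ;; compiled procedure.  Each entry from program-sources pairs an
  ;; instruction pointer with a file/line/column.
  (define (dump-source-map proc)
    (for-each
     (lambda (s)
       (format #t "ip ~a -> ~a:~a\n"
               (source:addr s) (source:file s) (source:line s)))
     (program-sources proc)))

Producing a native-format map would presumably mean emitting the same
entries as DWARF line-number records keyed by machine PC rather than by
bytecode offset.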
> I would actually like to switch our compiled-code on-disk format to be a
> subset of ELF, so we can have e.g. a bytecode section, a native code
> section, sections for RO and RW data, etc. But that would take a fair
> amount of thinking.

And if it's actually compatible with ELF, that would make special
handling of compiled Scheme + compiled C possible on ELF platforms but
not others, leading to two different ways of potentially building stuff
(or people supporting only ELF platforms in their packages, whether
intentionally or not; or people not bothering to use the non-portable
special handling).  Which is why I was suggesting native formats rather
than ELF specifically -- more work up front, but more uniform treatment
of platforms in the build process.

>> * With some special compile-time hooks, perhaps FFI symbol references
>> could turn into (weak?) direct symbol references, processed with
>> native relocation handling, etc.
>
> This might improve startup times (marginally?), but it wouldn't affect
> runtimes, would it?

Depending how it's done, it might improve the first reference to a
symbol very slightly.  You could (again, depending how it's done)
perhaps trigger link-time errors if a developer forgets to supply
libraries defining symbols the Scheme code knows will be required,
instead of a delayed run-time error.

>> * Even for JIT compilation, but especially for AOT compilation,
>> optimizations should only be enabled with careful consideration of
>> concurrent execution. E.g., if "(while (not done) ....)" is supposed
>> to work with a second thread altering "done", you may not be able to
>> combine multiple cases of reading the value of any variable even when
>> you can prove that the current thread doesn't alter the value in
>> between.
>
> Fortunately, Scheme programming style discourages global variables ;)
> Reminds me of "spooky action at a distance". And when they are read, it
> is always through an indirection, so we should be good.

Who said global?  It could be two procedures accessing a value in a
shared outer scope, with one of them launched in a second thread,
perhaps indirectly via a third procedure which the compiler couldn't
examine at the time to know that it would create a thread.

I'm not sure indirection helps -- unless you mean it disables that sort
of optimization.
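Something like this minimal sketch is what I have in mind (the thread
procedure is call-with-new-thread from (ice-9 threads)); nothing is
global, but the loop still has to re-fetch `done' on every iteration:

  (use-modules (ice-9 threads))

  (define (wait-for-helper)
    (let ((done #f))
      (define (helper)
        ;; ... do some work ...
        (set! done #t))
      ;; Nothing here tells the compiler that `helper' will run in
      ;; another thread and mutate `done'.
      (call-with-new-thread helper)
      (while (not done)         ; each test must see the current value
        (usleep 1000))
      'finished))

If the compiler decides this thread never assigns `done' inside the
loop and hoists the read out of it, the loop never terminates.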
> Of course. Sandboxed code of course should not have access to mutexes or
> the FFI or many other things. Though it is an interesting point, that
> resources that you provide to sandboxed code should be threadsafe, if
> the sandbox itself has threads.

Actually, I'm not sure that mutexes should be forbidden, especially if
you let the sandbox create threads.  But they should be well-protected,
bullet-proof mutexes; none of this "undefined behavior" stuff. :-)

>> * Link compiled C and Scheme parts of a package together into a single
>> shared library object, [....]
>
> This is all very hard stuff!

Maybe somewhat.  The "big char array" transformation wouldn't be that
hard, I think, though we'd clearly be going outside the bounds of what a
C99 compiler is *required* to support in terms of array size.  Slap a C
struct wrapper on it (or C++, which would give you an encoding system
for multiple names in a hierarchy, though with different character-set
limitations), and you've basically got an object file ready to be
created.  Then you just have to teach libguile how not to read files
for some modules.
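To make that concrete, here's a rough sketch of the sort of generator I
mean: dump a compiled .go file as a C byte array that can be compiled
and linked into the package's shared library.  The output layout, the
symbol naming, and whatever hook libguile would need in order to locate
such embedded modules are assumptions, not existing interfaces:

  (use-modules (ice-9 binary-ports) (rnrs bytevectors))

  ;; Write C-FILE defining `const unsigned char SYMBOL[]' (plus a
  ;; length variable) holding the raw bytes of GO-FILE.
  (define (go-file->c-array go-file c-file symbol)
    (let* ((in (open-file go-file "rb"))
           (bytes (get-bytevector-all in)))
      (close-port in)
      (call-with-output-file c-file
        (lambda (out)
          (format out "/* generated from ~a */\n" go-file)
          (format out "const unsigned char ~a[] = {\n" symbol)
          (let loop ((i 0))
            (if (< i (bytevector-length bytes))
                (begin
                  (format out " ~a," (bytevector-u8-ref bytes i))
                  (if (= 15 (modulo i 16))
                      (newline out))
                  (loop (+ i 1)))))
          (format out "\n};\nconst unsigned long ~a_len = ~a;\n"
                  symbol (bytevector-length bytes))))))

The struct (or C++) wrapper and the libguile side of the lookup would
still be needed, of course; this only covers getting the bytes into
something a C compiler can turn into an object file.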
>> * Can anything remotely reasonable happen when C++ code calls Scheme
>> code which calls C++ code ... with stack-unwinding cleanup code
>> specified in both languages, and an exception is raised? [....]
>
> I have no earthly idea :)

It only just occurred to me.  It may be worth looking at the C++ plus
Java case, and seeing whether something reasonable happens there,
especially with the GNU tools in particular.

My hunch is that we might be able to do it, but would need to compile at
least a little C++ code into the library to do it portably.  That
wouldn't be hard, as I doubt there are many platforms where you get a C
but not a C++ compiler these days, but I don't know whether the C++ ABI
work has progressed far enough that it wouldn't tie us to a specific C++
implementation (on platforms with more than one), or how much of an
issue that would be.  Worst case, we might have to run all the libguile
code through the C++ compiler, to get stack-unwinding data recorded for
EH processing; while there's a fairly large common subset of C and C++,
it would still be annoying.

It might have to be crude, too -- for example, maybe on the C++ side
we'd define a "Scheme exception" type that normally would not be caught
specially by application code (though it could be), and perhaps Scheme
wouldn't be able to catch C++ exceptions at all, just do the unwinding.

Just an idea to keep in mind....

>> Looking forward to Emacs work:
>>
>> Tom Tromey recently pointed out some JIT compilation work done on
>> Emacs byte code back in 2004, with the conclusion that while some
>> improvement is possible, the time spent in existing primitives
>> dominates the execution time. Playing devil's advocate for a minute:
>> Why do you think we can do better? Or was this modest improvement --
>> maybe a bit more for AOT compilation -- all you were expecting when
>> you said we could run elisp faster than Emacs?
>
> Better for emacs? Well I don't think we should over-sell speed, if
> that's what you're getting at.

Hey, you're the one who said, "Guile can implement Emacs Lisp better
than Emacs can." :-)  And you specifically said that Emacs using Guile
would be faster.

> Bytecode-wise, the performance will probably be the same. I suspect the
> same code in Scheme will run faster than Elisp, due to lexical scoping,
> and a richer set of bytecode primitives. But I think the goal for phase
> 1 should be "no one will notice" ;-)

The initial work, at least, wouldn't involve a rewrite of Lisp into
Scheme.  So we still need to support dynamic scoping of, well, just
about anything.

> Native-code compilation will make both Scheme and Elisp significantly
> faster -- I think 4x would be a typical improvement, though one would
> find 2x and 20x as well.

For raw Scheme data processing, perhaps.  Like I said, I'm concerned
about how much of the performance of Emacs is tied to that of the Emacs
C code (redisplay, buffer manipulation, etc.), and that part probably
wouldn't improve much if at all.  So a 4x speedup of actual Emacs Lisp
code becomes ... well, a much smaller speedup of Emacs overall.

> More broadly, though, I don't really believe in the long-term health of
> a system that relies on primitives for speed, because such a system
> necessarily restricts the expressive power of the extension language.
> There are many things you just can't do in Emacs these days -- and
> sometimes it's things as basic as "display all of the messages in my
> archived folder". Making the extension language more capable allows for
> more programs to be written inside Emacs. Eventually we will even
> migrate many of the primitives out of C, and back into Elisp or Scheme.

I'd like to see that, and I think many Emacs developers would as well.

>> On my reasonably fast Mac desktop, Emacs takes about 3s to launch and
>> load my .emacs file.
>
> How long does emacs -Q take?

Maybe about 1s less?

>> During the build, pre-loading the Lisp code takes it about another 3s,
>> which would get added to the startup time without unexec. If loading
>> native compiled files (or .go files on platforms where we don't have
>> native compilation yet) isn't so amazingly fast as to cut that down to
>> 2-3s, do you have any ideas how we might be able to load and save an
>> initialized Lisp environment?
>
> I think we'll just have to see, unfortunately. Currently our mess of .go
> files everywhere means that you get significantly different numbers
> depending on whether you're in the disk cache or not... Perhaps we can
> make a quick estimate just based on KSLOC? How many KSLOC get loaded in
> a base emacs -Q?

Not sure, I'll take a look.

>> I'm also pondering loading different Lisp files in two or three
>> threads in parallel, when dependencies allow, but any manipulation of
>> global variables has to be handled carefully, as do any load-time
>> errors. (One thread blocks reading, while another executes
>> already-loaded code... maybe more, to keep multiple cores busy at
>> once.)
>
> This is a little crazy ;-)

Only a little?

Modern machines are going more and more in the multicore direction.
Even without that, a thread often blocks waiting to read stuff off disk
while another could continue doing work.  Why should my Emacs startup
stall waiting on the disk any more than it absolutely needs to?  (Also,
POSIX has async file I/O routines now, so "prefetching" the file
contents is also an option; conceptually in the same thread, though it
could be implemented with extra threads under the covers.)

Ken