From: Ken Raeburn
Newsgroups: gmane.lisp.guile.devel
Subject: Re: guile and emacs: unexec
Date: Sun, 14 Jun 2009 01:21:37 -0400
To: Andy Wingo
Cc: guile-devel

On Jun 13, 2009, at 09:06, Andy Wingo wrote:
> Hi Ken,
>
> On Fri 12 Jun 2009 07:02, Ken Raeburn writes:
>
>> I'm glad to see the emacs-lisp work is progressing. As it happens, a
>> month or so ago I blew some of the dust off my old guile-emacs
>> project and started working on it again too. This flavor of
>> emacs+guile work aimed to replace Lisp objects in Emacs with Guile
>> objects at the lowest level (numbers, cons cells; symbols and such
>> become smobs) and then work upwards from there.
>
> Very interesting! To be clear -- the goal would be to represent as
> much of Emacs using cheap Guile structures as possible: numbers and
> cons cells and such, and represent specific Emacs objects as smobs?
> That's probably a good idea.

Yes -- for now, that includes anything I haven't converted, including
strings, symbols, vectors of objects, hash tables, etc. Many of what
are currently smobs should eventually be converted to using Guile's
versions, either directly or with some simple wrapping. They need to
stay identifiable in Lisp as the correct object types, so I can't
implement a Lisp string with, say, a Guile list containing a string
plus text-property data. In the long term perhaps some of them could
be implemented partly or fully in Scheme, but I don't want to diverge
radically while I still need to track the main Emacs code base.
(Please let me keep the illusion that replacing the fundamental object
representation, allocator and garbage collector, and compensating for
initialization problems throughout the code, isn't all that radical a
change. :-) But I'm not worrying about that much right now -- if the
representation abstraction is complete and correct, the existing Emacs
code should be able to pull out all the right data from the smobs, and
results should be indistinguishable. Well, except that integer and
floating ranges may be different, hash table ordering changes --
simple, reasonable, and well-understood differences. It's not quite
there yet.

I figure, once I've got this set of changes working correctly (i.e.,
nearly indistinguishable, no random unexplained errors or differences
in behavior), then I can tackle the next steps with more confidence
that differences observed there are due to the new changes in
progress, not semantic differences previously introduced with
further-reaching effects than I expected.

It's also kind of appealing to have something at intermediate stages
that I might be able to show off, and say "hey, this works well enough
that you can try it out; want to help me on the next steps?" (And
since I'm getting into all this now, I *would* like some help. I was
just intending to fix a few more problems before making the plea. :-)

I'm specifically *not* trying to do some of the other things that have
been discussed but aren't about running Emacs -- make buffers
independent objects that can be used outside of Emacs, stuff like
that. That can come later (or not), and I'd be glad to see it happen,
but getting Emacs running at all is a big enough project for me on my
own.

> Symbols however should probably be represented as Guile symbols, not
> smobs. I think that you will find that with a more
> compilation-centric approach, we will be able to keep more simple
> datatypes, as we compile the procedures that operate on those data
> types to appropriate code.

Eventually, yes, I think so. They should probably be one of the next
things to change, though some like vectors and strings might be
simpler. I'm also concerned about the performance impact of making
such a switch; another reason for getting something working soon is so
it's practical to look at performance questions.

>> I've updated to recent Emacs sources and Guile 1.8.6. I've gotten it
>> to a point where it seems to start up fine in tty mode, reads in
>> (and does color highlighting of) C files and directories, does some
>> other basic stuff. I'm tweaking it now to see if I can get more
>> stuff working (like Cocoa support and "make bootstrap") and do more
>> extensive testing.
>
> Very neat! That's fantastic that you were able to get it this far, I
> didn't know that was possible.

I actually had it pretty far along once or twice before (I seem to
keep reviving this every few years, and spend a lot of time updating
to newer code bases), but I think I've managed to push it a bit
further than I had it earlier. With just me working on it, depending
on the demands of my job, there tend to be large periods when no
progress gets made, and it doesn't keep up with the upstream sources;
the prospect of having to do a bunch of catch-up work just makes it
that much less appealing to get back into it. It's been moving forward
in spurts for over a decade now, very slowly. :-(

> If this is an effort that you want to pay off in the future, though,
> I would strongly suggest updating to the 1.9/2.0 series of Guile. The
> expressive range of Guile's multilingual facilities is much higher
> there, and significantly different from 1.8.

I was looking at updating, but ran into the -I ordering problem I
reported. Since that's fixed, I'll try again sometime. The
multilingual facilities aren't very important to me right now -- like
I said above, I'm mostly just switching some object representations
now, and I'm still using the Emacs code for any multilingual stuff.
Eventually that should change, but what I want of Guile right now is a
nice, simple byte array I can stick string data into. :-) Emacs 23 is
going to go out with the Emacs version of the support, and yanking out
anything made available to Lisp programmers isn't going to go over
very well. Of course, it wouldn't be very good to wind up with
duplicated work, or redundant or conflicting interfaces, either.

> OTOH, the emacs lisp support is not yet up to the level that it is
> at in 1.8, so perhaps now is not yet the time.

And, I haven't started using any of that code yet, either... that's
another big change to try at some point when everything else is
looking solid. And, I assume it expects the use of Guile symbols and
Guile strings at least?

In order to make this switch, too, the semantics really have to match
Emacs Lisp -- stuff like indirect symbols, buffer- or frame-local
bindings, etc. And all the Emacs C code needs to know how to look up
values (or function values, or property lists, or whatever) when given
Guile symbols.

And then there's the lexical binding branch work, which I haven't even
looked at yet.

>> One really big hiccup I've run into, which I've sort of sidestepped
>> for the moment: Guile is not unexec-friendly.
>>
>> There is a way to build Emacs so it doesn't use unexec, but it then
>> has to load a lot of Lisp code at run time, really killing the
>> startup performance, and I don't think it's tested all that much
>> (e.g., "make bootstrap" doesn't work even without the Guile hacks).
>> To really make this project work, I need to be able to link against
>> Guile (static is fine, and probably necessary), do a bunch of
>> Lisp/Scheme processing, write out a memory image into a new
>> executable, and later be able to run that executable.
>
> It's true that Guile doesn't do unexec currently.
> It might in the future -- obviously it will if you implement it of
> course ;)
>
> But I would ask that you reconsider your approach to making
> Guile-Emacs load quickly. There is no a priori reason that loading
> Lisp code should be slow. With Guile-compiled elisp, loading a file
> is just mapping it into memory -- the same as you have with an image.
> The loaded code needs to be run to establish definitions, but that is
> a very quick operation.

I don't think the current Lisp reader is all that slow, but it has to
load and run quite a bit of stuff, especially with the
internationalization support. Especially during a "bootstrap"
operation, when most of the stuff it loads is uncompiled Lisp source
code.

It seems to me that switching to Guile-compiled elisp for startup
would require, well, basically most of the remaining work of my
project, including switching to the Guile-based Lisp reader and
evaluator, wouldn't it? So we're looking at some non-trivial changes
here. They're desirable changes, in the long run, but taking this
route would mean no efficient startup of guile-emacs any time soon,
which in turn slows down the development cycle. The unexec support may
be useless once we get there, but right now it's a much shorter path
to something useable I can show off. (Fixing up the "interactive
scheme mode" that talks to Guile directly would be nice to show off,
too. My current one is kind of a lame hack.)

> I agree that heap saving could be slightly faster. But I think that
> Emacs should be able to load from bytecode within 100 ms or so /with
> the current Guile-VM code/ -- and even faster if we do native
> ahead-of-time compilation at some point.

I'd certainly like to get there eventually. Really, it comes down to
wanting something I can make work now, instead of a project with
minimal, uninteresting intermediate results that may or may not pay
off in another decade or so, and doesn't get anyone else interested in
helping out.
With the current state of Emacs, that means unexec is kind of needed.
It can sort of work without it, but not well -- and that's true of the
upstream Emacs code base too, but no one on that side cares very much
because unexec works for them everywhere.

I've got some political concerns here too. There has also been some
resistance, when this project has come up on the Emacs lists, to
switching away from the current Lisp evaluator for any reason, even if
Guile support is added (it's not broken, major changes involve
significant risk, don't see the benefit, etc); there's also been
support, but it's contentious. So my rather vague plan has involved
putting off even addressing that possible switch until I can show
clear advantages and no blatant drawbacks (like performance, or
correctness, or handling of out-of-memory conditions) to using Guile.
I'd rather not discuss it from a position of weakness and uncertainty;
better to have working code we can experiment with and numbers we can
point to. (But first, let's experiment and generate numbers ourselves,
and see if we need to fix bugs.) Then we can discuss our options. I
don't know how much chance there would be for getting it ready in time
for Emacs 24, but with enough help, I think Emacs 25 should be doable;
possibly even 24, who knows?

>> Any record of current threads needs to go away, and be replaced with
>> info on the new one-and-only thread in the new process; I'm building
>> without thread support for now to get around it. Any record of stack
>> regions to be scanned for SCM objects likewise needs resetting.
>> Allocated objects must *not* go away, and must continue to be
>> processed by the garbage collector, so I can't just reinitialize
>> everything. Assigned smob types must remain in effect, and for now
>> I'm ignoring the possibility that some smobs may need some kind of
>> reinitialization. Mutexes...
>> well, I don't know if they need reinitializing; POSIX is kind of
>> unclear on interactions with unexec. :-) I expect reinitializing
>> them is probably safe, even if not required in some implementations.
>
> This could be complicated if we merge in the BDW-GC branch, to use
> libgc. Note that SCM does have unexec, IIRC, we could steal parts of
> their implementation

That might work, yes... or if not, it sounds like I'd be stuck with
using an old Guile, or getting the CANNOT_DUMP option working and
suffering with the slow startup.

(And, this reminds me -- there are still some likely GC-related bugs
with scm_leave_guile/scm_enter_guile that should be fixed up. I got
them removed from the API years ago, but they're still used internally
in threads.c, down below the comment with my old email explaining the
doom they may bring upon us. Does BDW-GC scrap that code finally?
Please?)

>> Is this something that could be useful to anyone outside of Emacs?
>
> Unexec certainly could, to deliver self-contained binaries. But TBH I
> think the booting-from-compiled-files option is more maintainable. In
> any case this would be a neat hack. Have fun! :)

I agree, compiled files would work better, but I doubt we can push the
Emacs folks to move in that direction first. They're happy with unexec
for now.

>> P.S. If anyone wants to take a look at my current work,
>> http://www.mit.edu/~raeburn/guilemacs/guile-emacs.tar.bz2
>> has a snapshot from tonight.
>
> Cool! Have you considered using git, and branching from Emacs' git
> mirror? That way it is trivial to set up something other people can
> comment on, in easily-digestible patch chunks.

Yep, but I need to get proficient with it first, and haven't put in
the time yet; until then I'm using subversion in a rather clumsy
fashion (often just checkpointing untested merges, and my Emacs
sources have the CVS admin files checked in so I can update easily).
If it's something other people want to actually work on, on the other
hand, we could set up something via sourceforge or savannah or
whatever. But only if there's actually going to be additional help
coming....

Ken