From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Nala Ginrut Newsgroups: gmane.lisp.guile.devel Subject: Re: thinking out loud: wip-rtl, ELF, pages, and mmap Date: Mon, 29 Apr 2013 13:47:33 +0800 Organization: HFG Message-ID: <1367214453.12294.35.camel@Renee-desktop.suse> References: <874nevhe4a.fsf@pobox.com> NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Trace: ger.gmane.org 1367214466 10248 80.91.229.3 (29 Apr 2013 05:47:46 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Mon, 29 Apr 2013 05:47:46 +0000 (UTC) Cc: guile-devel To: Andy Wingo Original-X-From: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Mon Apr 29 07:47:50 2013 Return-path: Envelope-to: guile-devel@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1UWgwD-00081h-7u for guile-devel@m.gmane.org; Mon, 29 Apr 2013 07:47:45 +0200 Original-Received: from localhost ([::1]:36403 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UWgwC-0003CS-N5 for guile-devel@m.gmane.org; Mon, 29 Apr 2013 01:47:44 -0400 Original-Received: from eggs.gnu.org ([208.118.235.92]:41820) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UWgw7-0003CI-QY for guile-devel@gnu.org; Mon, 29 Apr 2013 01:47:41 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1UWgw6-0003xE-GE for guile-devel@gnu.org; Mon, 29 Apr 2013 01:47:39 -0400 Original-Received: from mail-pd0-f182.google.com ([209.85.192.182]:39206) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UWgw6-0003x1-8D for guile-devel@gnu.org; Mon, 29 Apr 2013 01:47:38 -0400 Original-Received: by mail-pd0-f182.google.com with SMTP id 14so1868744pdj.13 for ; Sun, 28 Apr 2013 22:47:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; 
h=x-received:message-id:subject:from:to:cc:date:in-reply-to :references:organization:content-type:x-mailer:mime-version :content-transfer-encoding; bh=bDtfEDmSt0mGK4OXKJL4SjWWhpb/BB/A6qptLQ9KjBk=; b=PUb2JgJEqv7I5m0SACQsDzh7Su9J4n2Xt6nSGlZ5NmcgSeGVhWHYfgBHI74Qd1YVKF QGAYJ1Wn8eKe/fsO/xNZbbMu+diN3Ov3FVOjDh4NELPlkhVgmJSU8DidqvDOSt+XR1wA QEkmCf6vkK5GFG1sWk2QjHaWENo+inkvLoIwM528N0mRPuPMKg9GbcZRzlOYuN3LiXHA skbAizlWyu8SFChFu3m5hx1sW2Ajry6Y6S7qkYBllNLjdonpv5DRh6Gsxl4reQLZhlC6 +td5qOj/Jh1lPCSBwDGr598xLmGdwR0hTfL5ljxQlOnKhZnU3iGg3kEJBHG8nhZwQtJN C0PA== X-Received: by 10.69.1.39 with SMTP id bd7mr14856860pbd.188.1367214457119; Sun, 28 Apr 2013 22:47:37 -0700 (PDT) Original-Received: from [192.168.100.103] ([59.40.126.204]) by mx.google.com with ESMTPSA id cq1sm22604055pbc.13.2013.04.28.22.47.34 for (version=SSLv3 cipher=RC4-SHA bits=128/128); Sun, 28 Apr 2013 22:47:36 -0700 (PDT) In-Reply-To: <874nevhe4a.fsf@pobox.com> X-Mailer: Evolution 3.4.4 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x [fuzzy] X-Received-From: 209.85.192.182 X-BeenThere: guile-devel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Developers list for Guile, the GNU extensibility library" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Original-Sender: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Xref: news.gmane.org gmane.lisp.guile.devel:16306 Archived-At:

On Wed, 2013-04-24 at 22:23 +0200, Andy Wingo wrote:
> Hi,
>
> I've been working on wip-rtl recently. The goal is to implement good
> debugging. I'll give a bit of background and then get to my problem.
>
> In master, ".go" files are written in the ELF format. ELF is nice
> because it embodies common wisdom on how to structure object files, and
> this wisdom applies to Guile fairly directly. To simplify, ELF files
> are cheap to load and useful to introspect.
> The former is achieved with
> "segments", which basically correspond to mmap'd blocks of memory. The
> latter is achieved by "sections", which describe parts of the file. The
> table of segments is usually written at the beginning of the file, to
> make loading easier, and the table of sections is usually at the end, as
> it's not usually needed at runtime. There are usually fewer segments
> than sections. You can have segments in the file that are marked as not
> being loaded into memory at runtime. Usually this is the case for
> debugging information.
>
> OK, so that's ELF. The conventional debugging format to use with ELF is
> DWARF, and it's pretty well thought out. In Guile we'll probably use
> DWARF, along with some more basic metadata in .symtab sections.
>

I'm very glad to see that ;-)
And then it would be possible to debug .go files with GDB.

> I should mention that in master, the ELF files are simple wrappers over
> 2.0-style objcode. The wip-rtl branch takes more advantage of ELF --
> for example, to allocate some constants in read-only shareable memory,
> and to statically allocate any constants that need initialization or
> relocation at runtime. ELF also has advantages when we start to do
> native compilation: native code can go in another section, for example.
>

RTL compilation seems faster, at least for boot-9.scm, though I haven't
actually measured it.
It would also be possible to have external AOT compilers in addition to
the official built-in one. Maybe that's unnecessary.

> * * *
>
> OK, so that's the thing. I recently added support for writing .symtab
> sections, and have been looking on how to load that up at runtime, for
> example when disassembling functions. To be complete, there are a few
> other common operations that would require loading debug information:
>
> * Procedure names.
> * Line/column information, for example in backtraces.
> * Arity information and argument names.
> * Local variable names and live ranges (the ,locals REPL command).
> * Generic procedure metadata.
>

I also hope the debug info records the first and last source line of each
procedure. That is easy to record at compile time. Without it, I would
have to parse the source file to find where a procedure begins and ends
before I could print its source code in the REPL/debugger.

> Anyway! How do you avoid loading this information at runtime?
>

IMO, we should add a strip command to guild. Or, the other way around,
add a --debug option to the compile command. Let users decide whether to
keep the debug info.

> The original solution I had in mind was to put them in ELF segments that
> don't get loaded. Then at runtime you would somehow map from an IP to
> an ELF object, and at that point you would lazily load the unloaded ELF
> sections.
>
> But that has a few disadvantages. One is that it's difficult to ensure
> that the lazily-loaded object is the same as the one that you originally
> loaded. We don't keep .go file descriptors open currently, and
> debugging would be a bad reason to do so.
>
> Another more serious is that this is a lot of work, actually. There's a
> constant overhead of the data about what is loaded and how to load what
> isn't, and the cross-references from the debug info to the loaded info
> is tricky.
>
> Then I realized: why am I doing all of this if the kernel has a virtual
> memory system already that does all this for me?
>
> So I have a new plan, I think. I'll change the linker to always emit
> sections and segments that correspond exactly in their on-disk layout
> and in their in-memory layout. (In ELF terms: segments are contiguous,
> with p_memsz == p_filesz.) I'll put commonly needed things at the
> beginning, and debugging info and the section table at the end. Then
> I'll just map the whole thing with PROT_READ, and set PROT_WRITE on
> those page-aligned segments that need it. (Obviously in the future,
> PROT_EXEC as well.)
>

Yeah, when we have AOT ;-P

> Then I'll just record a list of ELF objects that have been loaded.
> Simple bisection will map IP -> ELF, and from there we have the section
> table in memory (lazily paged in by the virtual memory system) and can
> find the symtab and other debug info.
>
> So that's the plan. It's a significant change, and I wondered if folks
> had some experience or reactions.
>
> Note that we have a read()-based fallback if mmap is not available.
> This strategy also makes the read-based fallback easier.
>
> Thoughts?
>
> Andy
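
For the IP -> ELF lookup described above, here is a minimal C sketch. It
assumes each loaded image is recorded as a non-overlapping [start, end)
address range in a fixed-size table kept sorted by start address. The
names loaded_elf, register_elf and find_elf are made up for illustration;
they are not Guile's actual API, and the real loader may well do this
differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one mapped .go image (not Guile's real type). */
struct loaded_elf {
  uintptr_t start;   /* first address of the mapping */
  uintptr_t end;     /* one past the last address of the mapping */
};

#define MAX_LOADED 64

/* Table of mappings, kept sorted by start address (assumption: the
   mappings never overlap, so sorting by start also sorts by end). */
struct loaded_elf loaded[MAX_LOADED];
size_t n_loaded;

/* Record a new mapping with an insertion-sort step so the table stays
   sorted.  Returns 0 on success, -1 if the table is full. */
int register_elf(uintptr_t start, size_t len)
{
  size_t i;

  assert(len > 0);
  if (n_loaded == MAX_LOADED)
    return -1;

  /* Shift larger entries up by one slot to make room. */
  i = n_loaded;
  while (i > 0 && loaded[i - 1].start > start) {
    loaded[i] = loaded[i - 1];
    i--;
  }
  loaded[i].start = start;
  loaded[i].end = start + len;
  n_loaded++;
  return 0;
}

/* Bisect the sorted table for the image whose range contains ip.
   Returns NULL if ip does not fall inside any recorded mapping. */
const struct loaded_elf *find_elf(uintptr_t ip)
{
  size_t lo = 0, hi = n_loaded;

  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if (ip < loaded[mid].start)
      hi = mid;                 /* ip lies below this image */
    else if (ip >= loaded[mid].end)
      lo = mid + 1;             /* ip lies above this image */
    else
      return &loaded[mid];      /* start <= ip < end */
  }
  return NULL;
}
```

Since the whole file is mapped, the record found this way would be enough
to reach the section table and from there the symtab; the sketch stops at
the lookup itself.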