From: Andy Wingo
Newsgroups: gmane.lisp.guile.devel
To: guile-devel
Subject: rtl metadata musings
Date: Fri, 10 May 2013 07:07:31 +0200
Message-ID: <871u9fqv70.fsf@pobox.com>

Hi,

For many days I have been hemming and hawing about how to serialize
debugging information into the new toolchain. Here's a braindump of a
new plan.

To recap, the new toolchain has the new RTL assembly embedded in ELF.
There are 6 things we need to put in the ELF file somehow:

  (1) Procedure names and bounds.
  (2) Docstrings.
  (3) Generic procedure metadata (for procedure-properties).
  (4) Arity information (see docs for program-arities).
  (5) Information about local variables for the debugger.
  (6) Line numbers.

None of these things are "on the main path" -- loading a module
shouldn't even page any of this information into memory. But it is all
useful to have, and sometimes you need to be able to access it
efficiently if it is there.

All of this data should be strippable from the .go files (which I guess
we should rename to .so files). This constraint means there should be
no link from the "main" data out to the "debugging" data -- only the
other way around. Otherwise stripping debug data could corrupt your
main program.

So those are the design constraints.

For (1) we use the standard ELF .symtab / .strtab mechanism.

For the rest I had considered encoding it all into DWARF, but I think it
can make sense to leave DWARF to handle the things that it knows best,
like (5) and (6), and to provide special support for (2), (3), and (4).
You should be able to strip these different pieces separately.

(2): For docstrings, my idea is to make a .guile.docstr section with
entries like this:

  struct guile_docstring {
    Elf_Addr pc;
    Elf_Off str;
  };

The "pc" is the rtl-program-code, and the "str" is an offset into the
.guile.docstrtab section, which is linked via the .guile.docstr
section's sh_link member. Searching for a docstring bisects the
.guile.docstr entries for (rtl-program-code prog) and then loads the
string from the table.

(3): Of course it's possible for a procedure's "documentation" property
to not be a string, and procedures can have any number of other
properties:

  (lambda () #((foo . qux) (bar . "hi") ...) 10)

Procedures with extended metadata get an entry in .guile.procprops:

  struct guile_procprops {
    Elf_Addr pc;
    Elf_Addr data;
  };

Here "data" points to an "absolute" address of the property alist, which
is part of the .data section along with any other program literal data.
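
To make the docstring lookup from (2) concrete, here is a minimal C
sketch of the bisection. It assumes the .guile.docstr entries are
sorted by pc (that is only stated explicitly for the arities section),
and the typedefs and the name find_docstring are hypothetical stand-ins
for illustration, not part of the toolchain.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; real code would take the ELF types for the
   target word size from <elf.h>. */
typedef uintptr_t Elf_Addr;
typedef size_t Elf_Off;

struct guile_docstring
{
  Elf_Addr pc;   /* the procedure's rtl-program-code address */
  Elf_Off str;   /* offset into the linked .guile.docstrtab */
};

/* Bisect ENTRIES (N records, assumed sorted by pc) for PC.  Return a
   pointer into STRTAB, or NULL if PC has no docstring entry. */
static const char *
find_docstring (const struct guile_docstring *entries, size_t n,
                const char *strtab, Elf_Addr pc)
{
  size_t lo = 0, hi = n;

  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;

      if (entries[mid].pc < pc)
        lo = mid + 1;
      else if (entries[mid].pc > pc)
        hi = mid;
      else
        return strtab + entries[mid].str;
    }

  return NULL;
}
```

The same search shape would presumably serve .guile.procprops as well,
since its entries are also keyed by pc.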
(The "data" address is absolute only with respect to the ELF image; at
runtime you have to add the base address at which the image is loaded.)

As you might know, literals like conses are statically allocated in the
ELF memory image, but if they contain links to non-immediates like
symbols or other conses, those links need to be patched up when the ELF
is loaded. In this way, generic procedure metadata does contribute to
runtime cost, because it needs relocation. But it's not that common,
not too much work, and you don't need a guile_procprops entry if you
don't have extended metadata.

(4): Arity information describes the arities of the various case-lambda
clauses that a function has. This information is used when printing a
function, to show the formals, and also when compiling, to check
arities. It would be cleaner to have the compiler emit separate
functions for the different clauses, but that's not what happens now.

Anyway the plan is for another section, .guile.arities:

  struct guile_arity {
    Elf_Addr pc;
    Elf_Off size;
    nreq;    // encodings for these not determined yet
    nopt;
    flags;   // has-keyword-args, has-rest, is-case-lambda
    Elf_Off offset;
  };

An entry describes how many required, optional, keyword, and rest
arguments a function has.

The .guile.arities section is prefixed by a length indicating how many
entries there are, then all the arity structures, sorted by pc. Note
that one arity may contain another! In particular for case-lambda
clauses you can have one arity for the whole function, then a number of
other ones for the cases.

After the arities, you have a block of offsets to another string table
to give the names and to give more information on keywords. So all in
all it looks like this:

  Elf_Off n_arity_entries;
  struct guile_arity foo_arity = { PC, SIZE, 1, 2, 0, OFFSET }
  ...
  OFFSET:
    X -> offset into associated .guile.arities_strtab for first req. arg
    Y -> offset into associated .guile.arities_strtab for first opt. arg
    Z -> offset into associated .guile.arities_strtab for second opt. arg
    offsets for next function...

As with procedure metadata, keyword arguments would have an absolute
address into the .data section, linking to the keywords literal
associated with this clause.

In this way we can share storage for formal parameters, have easy access
to arities without too much searching or consing, and also be able to
strip the arities section if needed without affecting anything else.

(5) and (6): Local variable information and line numbers can go into
.debug_info / .debug_line / .debug_str as usual with DWARF. DWARF does
well for this. Not sure if I want to try to encode arity information
into DWARF; at least in the beginning it won't be necessary, so I'll
avoid it.

OK this thought was burning my neuron this morning and I wanted to get
it out. I'll start working on it shortly.

Andy
-- 
http://wingolog.org/
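
As a rough illustration of the nested arities in (4), the C sketch
below finds the most specific arity covering a given address. Several
things here are assumptions: the encodings of nreq, nopt and flags are
explicitly undecided above, so the fixed-width fields and flag bits are
hypothetical; "size" is taken to mean that an arity covers
[pc, pc + size); and when two arities start at the same pc, the
enclosing one is assumed to sort first. A linear scan is used for
simplicity rather than a bisection.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the ELF types. */
typedef uintptr_t Elf_Addr;
typedef size_t Elf_Off;

/* Hypothetical bit assignments for the three flags named in (4). */
enum
{
  ARITY_HAS_KEYWORD_ARGS = 1 << 0,
  ARITY_HAS_REST = 1 << 1,
  ARITY_IS_CASE_LAMBDA = 1 << 2
};

struct guile_arity
{
  Elf_Addr pc;
  Elf_Off size;
  uint32_t nreq;    /* field widths are illustrative only */
  uint32_t nopt;
  uint32_t flags;
  Elf_Off offset;   /* start of this arity's block of name offsets */
};

/* Return the innermost arity whose range [pc, pc + size) contains ADDR,
   or NULL if none does.  ARITIES holds N entries sorted by pc, with an
   enclosing arity sorting before the arities nested inside it. */
static const struct guile_arity *
find_arity (const struct guile_arity *arities, size_t n, Elf_Addr addr)
{
  const struct guile_arity *best = NULL;

  /* Entries past the first pc greater than ADDR cannot contain it. */
  for (size_t i = 0; i < n && arities[i].pc <= addr; i++)
    if (addr - arities[i].pc < arities[i].size)
      best = &arities[i];   /* a later match is nested deeper */

  return best;
}
```

For a case-lambda, an address inside one clause matches both the
whole-function arity and the clause's own arity; the scan keeps the
later one, which is the clause.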