From: Jacob Bachmeyer
Newsgroups: gmane.emacs.devel
Subject: Re: Emacs contributions, C and Lisp
Date: Sat, 10 Jan 2015 17:45:02 -0600
Message-ID: <54B1B97E.9070204@gmail.com>
Reply-To: jcb62281@gmail.com
To: emacs-devel@gnu.org

I've been reading this in the list archives and as a long-time GNU/Linux user, feel the need to chime in.

Perhaps there is a better option? I seem to remember efforts to adapt Guile to run Emacs Lisp and then to port Emacs to run using Guile instead of its own runtime.
I'm not certain of the difficulty, but perhaps GCC could, over time, be moved towards an option to build as Guile extensions? I haven't looked far enough into this to know whether it is feasible or how much work would be needed, or whether I'm completely mistaken and it isn't feasible at all. Obviously, it should still be possible to build "stand-alone GCC", but having the compiler available from Guile could easily solve the issue at hand here, especially if the extension presents a Lisp-like API for the GCC internal structures. This would also address the concerns about the GCC front-end being abused to feed nonfree backends, since the tree would only be present in memory behind a GPL interface.

But this is years away at best and doesn't solve the immediate problem, which is that Emacs needs a parse tree "now". There are three options for how to get that:

(1) Write a C parser in Emacs Lisp.
(2) Get the AST from GCC.
(3) Get the AST from Clang.

Option (1) leads to an Emacs C Compiler and a fully self-hosting Emacs, which is both interesting and an insane duplication of effort. Option (2) has the advantage that it would ensure full support for GNU C, but raises the problem of actually getting the parse tree out of GCC without weakening GCC's copyleft. Option (3) has the advantage that no one will object to dumping an AST from Clang, but Clang isn't a GNU project and has incomplete support for GNU C.

A more useful question is "How can GCC most efficiently provide an AST to Emacs?" Part of the answer is that Emacs already has the complete text of every file involved. Emacs doesn't care what the name of a variable is--that's already in the buffer. Emacs cares what part of the buffer corresponds to a variable name. Dumping an AST that contains only annotations to text, referring to positions in the source files instead of actually including text in the AST, looks to me like a good middle ground.
Such an AST (which I will call a "refAST" because it contains only references to program source text) would be a significant pain to use as compiler input, since the symbol names are missing, while also being exactly the information that an editor needs. We can make it harder to use the refAST to abuse the GCC frontend by the same expedient that makes the refAST easier for Emacs to read. Emacs already has the source text, so why force it to read duplicates from the AST dump?

Further, the refAST needs to resemble the source text as closely as possible. Most of GCC's value is in the optimizer and code generators; parsing is relatively simple compared to the rest of GCC. If the refAST is dumped after optimization, it will be next to useless for editing the source. So the refAST must be dumped prior to any optimization.

My knowledge of GCC internals is lacking, but a glance at gccint suggests that Emacs needs a dump of GENERIC, which, incidentally, can "also be used in the creation of source browsers, intelligent editors, ..." (). Further reading reveals that, for better or for worse, this ship has already sailed: GCC has for some time had an option to dump its GIMPLE representation, which is probably far more useful for abusing the frontend than an AST dump.

In short, the earlier in the GCC pipeline the parse tree is dumped, the more useful the dump is for editing source and the less useful the dump is for feeding a nonfree compiler backend. Dumping references to source text, but not the text itself, simultaneously makes reading the dump into Emacs easier and feeding the dump into another backend harder.
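
To make the refAST idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `RefNode` type, the node kind names, and the half-open character offsets are hypothetical and do not correspond to any existing GCC dump format. The point it shows is that the tree carries no identifier text at all, and that names can only be recovered by someone who also holds the source buffer.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RefNode:
    """Hypothetical refAST node: a kind plus a half-open [start, end) span.

    No source text is stored; the node only points into the buffer.
    """
    kind: str
    start: int
    end: int
    children: List["RefNode"] = field(default_factory=list)


def node_text(source: str, node: RefNode) -> str:
    """Recover the text a node refers to; requires the original source."""
    return source[node.start:node.end]


def find_kind(node: RefNode, kind: str) -> List[RefNode]:
    """Collect all nodes of the given kind, in pre-order."""
    found = [node] if node.kind == kind else []
    for child in node.children:
        found.extend(find_kind(child, kind))
    return found


# The editor side already has this text in a buffer.
SOURCE = "int total = count + 1;"

# Hand-written stand-in for what a dump might contain (offsets only).
TREE = RefNode("declaration", 0, 22, [
    RefNode("type_specifier", 0, 3),
    RefNode("identifier", 4, 9),
    RefNode("binary_expression", 12, 21, [
        RefNode("identifier", 12, 17),
        RefNode("integer_literal", 20, 21),
    ]),
])

names = [node_text(SOURCE, n) for n in find_kind(TREE, "identifier")]
print(names)
```

With the buffer in hand, the identifiers come back as "total" and "count". Without it, the dump above says only that something identifier-shaped occupies characters 4 to 9, which is the property that makes it awkward as input for another backend.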
My proposal:

--Immediate:
  -- GCC option for dumping refAST for editor use
    -- parse tree is dumped as early as is feasible, definitely prior to optimization
--Near term:
  -- Emacs Lisp ported to Guile
--Longer term:
  -- GCC buildable as Guile extensions
    -- provides full access to GCC internal structures, but only to Free software
  -- Emacs ported to Guile

PS: There have been questions raised as to the use of a full syntax tree. One feature that I would find useful that would be trivially enabled by having the syntax tree would be to make M-C-t next to a binary operator swap its operand expressions. This is a contrived example, but shows a simple case:

      a * b +. c * d
C:    a * c +. b * d
Lisp: c * d +. a * b

The dot represents point. The result labeled "C" is what happens currently. The result labeled "Lisp" is what would happen if Emacs actually understood C syntax on the same level as s-expressions. A refAST would enable this as a side-effect of its other uses.
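
The operand swap from the PS can be sketched the same way. This is Python rather than Emacs Lisp, and the `RefNode` type, the kind names, and the convention that a binary expression has exactly three children (left operand, operator, right operand) are all assumptions for illustration, not anything GCC or Emacs provides today. Given offsets only, the swap is plain string slicing on the buffer.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RefNode:
    """Hypothetical refAST node: kind plus half-open [start, end) span."""
    kind: str
    start: int
    end: int
    children: List["RefNode"] = field(default_factory=list)


def binary_at_point(node: RefNode, point: int) -> Optional[RefNode]:
    """Return the innermost binary expression whose operator touches point."""
    for child in node.children:
        hit = binary_at_point(child, point)
        if hit is not None:
            return hit
    if node.kind == "binary_expression":
        _left, op, _right = node.children  # assumed child layout
        if op.start <= point <= op.end:
            return node
    return None


def swap_operands(source: str, node: RefNode) -> str:
    """Exchange the operand texts, leaving the operator and spacing alone."""
    left, _op, right = node.children
    return (source[:left.start]
            + source[right.start:right.end]
            + source[left.end:right.start]
            + source[left.start:left.end]
            + source[right.end:])


SOURCE = "a * b + c * d"

# Hand-written offsets standing in for a dump of the expression above.
TREE = RefNode("binary_expression", 0, 13, [
    RefNode("binary_expression", 0, 5, [
        RefNode("identifier", 0, 1),
        RefNode("operator", 2, 3),
        RefNode("identifier", 4, 5),
    ]),
    RefNode("operator", 6, 7),
    RefNode("binary_expression", 8, 13, [
        RefNode("identifier", 8, 9),
        RefNode("operator", 10, 11),
        RefNode("identifier", 12, 13),
    ]),
])

POINT = 7  # just after the "+", as in the example
result = swap_operands(SOURCE, binary_at_point(TREE, POINT))
print(result)
```

With point just after the "+", this produces the line labeled "Lisp" in the example. Moving point next to the first "*" would instead swap only a and b, since the innermost matching expression wins.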