From: David Engster
Newsgroups: gmane.emacs.devel
Subject: Re: Emacs contributions, C and Lisp
Date: Mon, 12 Jan 2015 21:41:41 +0100
Message-ID: <87twzvrat6.fsf@engster.org>
To: Helmut Eller
Cc: emacs-devel@gnu.org

Helmut Eller writes:
> On Mon, Jan 12 2015, David Engster wrote:
>> The first step would have been to replace our existing C++ parser with
>> the AST that is produced by GCC. The plugin would output the same LISP
>> structures that Semantic uses.
>
> I'm a bit confused because at one side you seem to say that certain
> things are not possible with plugins but at the other side you seem to
> think that plugins can dump enough information to make these things
> possible.

I'm not sure how familiar you are with CEDET. We already have the
infrastructure to parse local expressions and calculate completions
based on a database of "tags", which are structures generated from our
own C++ parser. As a first step, I wanted to replace only the parser,
meaning the part which creates the AST.
The actual "rules" of C++ are coded in Semantic (to varying degrees).

>> My work so far was mainly to investigate
>> how C++ types are actually stored in the AST. Especially the template
>> stuff is pretty weird, and documentation is sparse. Fortunately, the
>> headers are pretty well commented, but it still involves a *lot* of
>> trial and error.
>
> I can imagine that templates are complicated. I tried to implement a
> find-definition command as a GCC plugin. My first approach was to
> search the smallest subtree that contains a particular source location.
> That didn't work out because GCC doesn't record "source ranges" so it's
> difficult to know if a tree covers a particular location. Another
> problem is that identifiers are resolved early eg. "x + y" produces a
> PLUS_EXPR (with the source location pointing to the + sign) but the
> arguments are pointers to the VAR_DECLs of x and y and the source
> location of those VAR_DECLs is typically a few lines earlier.
>
> In a second attempt I made Emacs insert a custom #pragma at the place
> where we want to search for a definition; similar to the gccsense
> approach. Plugins can register pragmas and that way have access to the
> lexer. That kinda works but the problem is that pragmas are only
> allowed in certain places (eg. at the end of a statement) and Emacs has
> to guess where those places are.

Indeed, the main difficulty here is to find the correct location in the
AST when you only have line/column information. But when using CEDET,
your source file will already be parsed, so Semantic has type
information for your symbols, meaning it would already know the types
of "x" and "y". You could directly ask the GCC plugin for the
definition of that actual type (it probably wouldn't even have to call
the plugin, because Semantic has a database for types).

>> The actual "semantic" part of parsing C++ would still be handled by
>> Emacs' Semantic package. For instance, it would calculate
>> completions.
>> So obviously, those completions wouldn't match those from
>> libclang w.r.t. to accuracy, but they would be *much* better than they
>> are now, especially because the preprocessor is already handled, which
>> is currently one of Semantic's main problems. Also, type inference would
>> already be done by GCC, so you would see the resulting type from 'auto'
>> and such.
>
> Is the idea is to let GCC output some "global" information like type
> declarations to enable better "local" parsing of function bodies in
> Emacs? Or do you want to do pretty much all parsing in GCC?

Here's how Semantic currently does it: when you load a file, it first
parses only declarations and function signatures, so something like a
"shallow parse" (with depth=0). When you put your cursor in a function,
it calls the parser again and asks it to parse only the function's
body, which would be a depth=1 parse. I wanted to try to do a similar
thing with the GCC plugin; that is, by default it would do a "shallow
parse", skipping things like function bodies and outputting only their
signatures. Then later, you would pass parameters like the name of a
function for which you'd like the detailed AST, or only things like
local variables or similar.

>> My plan was also to make this plugin usable for other tools. That means,
>> it should not only output LISP structures, but alternatively also JSON
>> and possibly XML. For instance, an external tool could build a symbol
>> database for providing references. This could also serve as a starting
>> point for doing refactoring. For more complicated tasks, the plugin
>> could provide an AST matcher which you can query with certain
>> expressions.
>
> In general I think the heavy lifting should be done in GCC+plugin and
> Emacs should only do the "easy" stuff like displaying the result. But
> for performance and other reasons it might be necessary to do at least
> some parsing in Emacs too.
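
To illustrate the depth-limited dump and the alternative Lisp/JSON
output discussed above, here is a minimal Python sketch. Everything in
it is an assumption made for illustration: `Node`, `dump`, and `to_sexp`
are hypothetical names, the s-expression layout is not Semantic's actual
tag format, and nothing here touches the GCC plugin API.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical AST node (not GCC's tree type, not a Semantic tag)."""
    kind: str                      # e.g. "function" or "variable"
    name: str
    type: str = ""
    children: list = field(default_factory=list)


def dump(node, depth):
    """Return node as nested dicts, descending at most `depth` levels."""
    out = {"kind": node.kind, "name": node.name}
    if node.type:
        out["type"] = node.type
    # depth=0 keeps only the signature; depth=1 adds the direct children.
    if depth > 0 and node.children:
        out["children"] = [dump(c, depth - 1) for c in node.children]
    return out


def to_sexp(d):
    """Render a dict produced by dump() as a Lisp-style list (illustrative)."""
    parts = ['"%s"' % d["name"], d["kind"]]
    if "type" in d:
        parts.append(':type "%s"' % d["type"])
    if "children" in d:
        inner = " ".join(to_sexp(c) for c in d["children"])
        parts.append(":children (" + inner + ")")
    return "(" + " ".join(parts) + ")"


if __name__ == "__main__":
    fn = Node("function", "area", "double",
              [Node("variable", "r", "double")])
    print(to_sexp(dump(fn, 0)))     # shallow: signature only
    print(json.dumps(dump(fn, 1)))  # deeper: includes the local variable
```

The same intermediate dict feeds both output formats, which is one
possible way to keep a Lisp consumer and a JSON consumer in step.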
Again, I'm not very familiar with GCC, which is why I wanted to do as
much in Elisp as possible, meaning to re-use as much from CEDET as
possible. My primary goal was to make the C++ parser more accurate and
faster.

-David
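
For reference, the "smallest subtree that contains a particular source
location" lookup mentioned in the quoted text can be sketched as
follows, under the assumption that every node carries a start and end
position. The thread's point is that GCC did not record such ranges, so
this is only what the lookup would look like if they were available;
`RangedNode` and `smallest_enclosing` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class RangedNode:
    """Hypothetical tree node with an inclusive (line, column) range."""
    name: str
    start: tuple
    end: tuple
    children: list = field(default_factory=list)


def smallest_enclosing(node, pos):
    """Return the deepest node whose range contains pos, or None."""
    # Tuples compare lexicographically, so (line, column) pairs order
    # the same way as positions in the file.
    if not (node.start <= pos <= node.end):
        return None
    for child in node.children:
        hit = smallest_enclosing(child, pos)
        if hit is not None:
            return hit
    return node


if __name__ == "__main__":
    x = RangedNode("x", (3, 9), (3, 9))
    y = RangedNode("y", (3, 13), (3, 13))
    plus = RangedNode("plus", (3, 9), (3, 13), [x, y])
    fn = RangedNode("fn", (1, 0), (5, 1), [plus])
    print(smallest_enclosing(fn, (3, 13)).name)
```

With only a single location per node (such as the position of the `+`
sign), the containment test on the first line of the function has
nothing to compare against, which is the difficulty described above.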