From: "Robert Thorpe"
Newsgroups: gmane.emacs.help
Subject: Re: State-machine based syntax highlighting
Date: 8 Dec 2006 08:17:19 -0800
Message-ID: <1165594639.164156.286660@f1g2000cwa.googlegroups.com>
In-Reply-To: <1165567010.036350.185800@n67g2000cwd.googlegroups.com>

spamfilteraccount@gmail.com wrote:
> Tim X wrote:
> >
> > The problem with parse based analysis is that you need an in-built
> > parser for all the languages that the editor is used to develop in and
> > this is not a trivial task. I suspect some sort of plugin architecture
> > that is able to use stand-alone parsers for some language of interest
> > would probably be the way to go as it is unlikely even a small subset
> > of the languages developed within an Emacs environment can have a
> > parser developed in elisp which is readily maintained.
>
> I think too that some kind of bridge or plugin architecture is the
> answer.
>
> Lots of languages provide access to syntax trees in some form (python,
> java, etc.), so it would be much simpler to use their native
> implementation than reinventing everything in elisp.
That isn't really appropriate though.  Consider the following.

When I open a project I generally open all the files in the directory
by doing something like C-x C-f project_foo/*.c.  I also use
save-place, so point appears in each file wherever I left it last.  I
think both of these are quite common ways to use Emacs.

Doing this with normal parsing technology is difficult.  If the editor
just feeds every file into the external parser and then back into the
editor, that is a lot of work.  It would be similar to the work a
compiler does in a full rebuild.  In fact it would be less, because
parsing for font-locking involves nothing like compiler optimization
or code generation.  But it would still be a big task.  A much better
strategy is to start parsing at point in each file and parse only a
screenful at a time.  Doing this with an external parser would be very
hard.

There are other problems.  What if part of the code is incorrect?
Imagine, in C for example, that a function were written "foo (;" on
line 10.  The effect of the error would propagate far from where it
occurs; even line 300 might be treated wrongly.  The parser would have
to cope with this eventuality.

Also, in many languages parts of the meaning depend on the names used.
In C, for example, the code "(foo) (bar)" means something different if
foo is a type (a cast) than it does if foo is an ordinary identifier
such as a function name (a call).  The C compiler can cope with this
because it tracks all typedefs and identifiers, not only through the
current file but through the files pulled in with #include.  The only
way for a font-lock system based on a normal parser to deal with this
situation would be for it to read all the include files, which may not
even be present.

Compiler parsers and font-locking/navigating code have different
aims.  Compiler parsers must be fast when handling a whole file, and
they must generate accurate error messages.  Font-locking code must be
fast when starting at any arbitrary part of the code, and it must
tolerate incomplete information and errors.
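
To make the last point a bit more concrete, here is a rough sketch in Python.  It is not how Emacs font-lock works and it is nowhere near a real C scanner; the names classify_window, PAREN_NAME and known_types are all invented for the illustration.  The assumption is that the highlighter holds only a partial set of typedef names, looks only at the lines in the visible window, and settles for an honest "don't know" instead of reading every include file.

```python
import re

# Matches the ambiguous shape "(name) (": a cast if name is a type,
# a call if name is a function.  Deliberately naive, for illustration only.
PAREN_NAME = re.compile(r"\(\s*([A-Za-z_]\w*)\s*\)\s*\(")


def classify_window(lines, start, height, known_types):
    """Label each "(name) (" found in lines[start:start+height].

    known_types is a hypothetical, possibly incomplete set of typedef
    names.  Nothing outside the window is read, so a syntax error
    earlier in the file cannot change the result.
    Line numbers in the result are zero-based indexes into lines.
    """
    results = []
    for lineno in range(start, min(start + height, len(lines))):
        for match in PAREN_NAME.finditer(lines[lineno]):
            name = match.group(1)
            # Without the include files we cannot prove that an unknown
            # name is a function, so we do not claim that it is.
            kind = "cast" if name in known_types else "call-or-unknown"
            results.append((lineno, name, kind))
    return results


source = [
    "foo (;",                   # broken line, outside the window below
    "int x = (size_t) (n);",
    "int y = (helper) (n);",
]
print(classify_window(source, start=1, height=2, known_types={"size_t"}))
```

The broken first line is never looked at, and "helper" is reported as uncertain because no header was read to say otherwise.  A compiler's parser makes the opposite trade on both counts.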