unofficial mirror of help-gnu-emacs@gnu.org
From: "Robert Thorpe" <rthorpe@realworldtech.com>
Subject: Re: State-machine based syntax highlighting
Date: 8 Dec 2006 08:17:19 -0800
Message-ID: <1165594639.164156.286660@f1g2000cwa.googlegroups.com>
In-Reply-To: <1165567010.036350.185800@n67g2000cwd.googlegroups.com>

spamfilteraccount@gmail.com wrote:
> Tim X wrote:
> >
> > The problem with parse based analysis is that you need an in-built
> > parser for all the languages that the editor is used to develop in and
> > this is not a trivial task. I suspect some sort of plugin architecture
> > that is able to use stand-alone parsers for some language of interest
> > would probably be the way to go as it is unlikely even a small subset
> > of the languages developed within an Emacs environment can have a
> > parser developed in elisp which is readily maintained.
>
> I think too that some kind of bridge or plugin architecture is the
> answer.
>
> Lots of languages provide access to syntax trees in some form (python,
> java, etc.), so it would be much simpler to use their native
> implementation than reinventing everything in elisp.

That isn't really appropriate though.

Consider the following.  When I open a project I generally open all
files in the directory by doing something like C-x C-f project_foo/*.c.
I also use save-place, so point appears in each file wherever I
left it last.  I think both of these are quite common ways to use
Emacs.

Doing this with normal parsing technology is difficult.  If the editor
just feeds every file through the external parser and reads the results
back in, that is a lot of work.  It would be similar to the
work of a compiler doing a full rebuild.  In fact it would be less,
because parsing for font-locking involves nothing similar to compiler
optimization or code generation, but it would still be a big task.  A
much better strategy is to start parsing at point in each file and
parse only a screenful at a time.  Doing this with an external parser
would be very hard.
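
As a rough sketch of the screenful-at-a-time idea (in Python rather
than elisp, purely for illustration), the function below tokenizes only
the lines around point.  The names fontify_window and TOKEN_RE and the
three token classes are invented for this example, and it assumes no
construct spans more than one line, which real C does not guarantee.

```python
import re

# Hypothetical, deliberately tiny set of token classes for C-like text.
TOKEN_RE = re.compile(r"""
    (?P<comment>//.*$)
  | (?P<string>"(?:\\.|[^"\\])*")
  | (?P<keyword>\b(?:int|char|void|return|if|else|while|for|struct|typedef)\b)
""", re.VERBOSE)


def fontify_window(lines, point_line, height):
    """Tokenize only the `height` lines centred on `point_line`.

    Returns {line_number: [(start_col, end_col, token_class), ...]}.
    Lines outside the window are never looked at, so the cost depends
    on the window size and not on the length of the file.
    """
    first = max(0, point_line - height // 2)
    last = min(len(lines), first + height)
    faces = {}
    for n in range(first, last):
        faces[n] = [(m.start(), m.end(), m.lastgroup)
                    for m in TOKEN_RE.finditer(lines[n])]
    return faces
```

Calling fontify_window(lines, 500, 40) on a 1000-line buffer touches
only lines 480 to 519, however long the file is.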

There are other problems.  What if a part of the code is incorrect?
Imagine, in C for example, that a function were written "foo (;" on line
10.  The effect of the error would propagate far beyond the place where
it occurs; even line 300 might be treated wrongly.  The parser would
have to cope with this eventuality.
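
To make the propagation concrete, here is a small Python sketch that
tracks parenthesis depth in two ways.  Both functions are hypothetical
illustrations, and the resynchronisation rule (treat any line that
starts with a letter in column 0 as top level) is an assumption chosen
for the example, not a rule of C.

```python
def naive_depths(lines):
    """Paren depth at the start of each line, carried across the whole file."""
    depth, out = 0, []
    for line in lines:
        out.append(depth)
        depth += line.count("(") - line.count(")")
    return out


def resyncing_depths(lines):
    """Same bookkeeping, but reset at lines that look like top-level code."""
    depth, out = 0, []
    for line in lines:
        if line[:1].isalpha():
            depth = 0  # heuristic resynchronisation point (an assumption)
        out.append(depth)
        depth += line.count("(") - line.count(")")
    return out
```

With a stray "foo (;" in one function, naive_depths reports every
later line as being inside an open parenthesis, while resyncing_depths
recovers at the next line that begins in column 0.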

Also, in many languages parts of the meaning depend on
the names used.  In C for example the code "(foo) (bar)" is a cast of
bar to type foo if foo is a type, but a call of foo with argument bar
if foo is a function or variable.  The C compiler can cope with this
because it tracks all typedefs and identifiers through not only the
current file but also those included in it with #include.  The only way
for a font-lock system based on a normal parser to deal with this
situation would be for it to read all the include files, which may not
even be present.
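
The dependence on names can be sketched in a few lines of Python.  The
function classify and its known_types argument are invented for this
illustration; a real highlighter would need far more than one regular
expression.  The point is only that the same text gets a different
answer depending on what is known about foo, and that an honest answer
when nothing is known is "ambiguous".

```python
import re

# Matches the shape "(name) (" at the start of an expression.
CAST_OR_CALL = re.compile(r"\(\s*([A-Za-z_]\w*)\s*\)\s*\(")


def classify(expr, known_types=None):
    """Classify text shaped like "(foo) (bar)".

    known_types is the set of typedef names seen so far, or None when
    the headers that would define them have not been read.
    """
    m = CAST_OR_CALL.match(expr.strip())
    if m is None:
        return "other"
    if known_types is None:
        return "ambiguous"  # incomplete information: do not guess
    return "cast" if m.group(1) in known_types else "call"
```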

Compiler parsers and font-locking/navigating code have different
intentions.  Compiler parsers must be fast when handling a whole file,
and they must generate accurate error messages.  Font-locking code must
be fast when starting at any arbitrary part of the code, and it must
tolerate incomplete information and errors.
