From: Miles Bader <miles@gnu.org>
To: eric@siege-engine.com
Cc: Daniel Colascione <danc@merrillpress.com>,
David Engster <deng@randomsample.de>,
Daniel Colascione <danc@merrillprint.com>,
Lennart Borgman <lennart.borgman@gmail.com>,
emacs-devel@gnu.org, Stefan Monnier <monnier@IRO.UMontreal.CA>,
Steve Yegge <stevey@google.com>,
Deniz Dogan <deniz.a.m.dogan@gmail.com>, Leo <sdl.web@gmail.com>
Subject: Re: "Font-lock is limited to text matching" is a myth
Date: Wed, 12 Aug 2009 15:43:10 +0900
Message-ID: <buows59zg5d.fsf@dhlpc061.dev.necel.com>
In-Reply-To: <1250043413.6753.457.camel@projectile.siege-engine.com> (Eric M. Ludlam's message of "Tue, 11 Aug 2009 22:16:53 -0400")
"Eric M. Ludlam" <eric@siege-engine.com> writes:
> As far as how to define tables for a parsing system written in C, an
> old-school solution is to just use the flex/bison engines under the
> Emacs Lisp API. There are a lot of new parser generator systems
> though, and I don't really know what the best one might be.
>
> One of the hairier parts of the CEDET parser is the lexical analyzer.
Slightly off-topic, but I'm a huge fan of "LPeg" [1], which is a
pattern-matching library for Lua, based on Parsing Expression Grammars
(PEGs).
I've always wished for something like LPeg in elisp, and since Lua is at
heart quite lisp-like (despite the very different syntax), I think it
could work very well. Maybe it wouldn't be too hard to adapt LPeg's
core to elisp (it's licensed under the BSD license).
[There's a popular implementation technique for PEGs called "packrat
parsers", and many PEG libraries use that technique -- however
apparently packrat parsers have some serious problems in practice, so
LPeg uses a different technique. See [2] for a discussion of this, and
of the LPeg implementation in detail.]
Some nice things about LPeg:
(1) It's very fast.
(2) It's very concise; for typical usage, it's essentially like
writing a parser in yacc or whatever.
(3) It makes it trivial to insert code and hooks at any point in the
parse; not just "actions", but code that can determine how the
parsing happens. This gives a _huge_ amount of flexibility.
(4) It's very easy to "think about", despite the flexibility and
presence of arbitrary code driving parsing, because it works kind
of like a recursive descent parser, operating greedily (while
providing mechanisms for automatic backtracking when necessary).
(5) Because it's so fast and flexible, typical practice is to _not_
have a separate lexical analyzer, but just do lexical analysis in
the parser. This is easier and more convenient, and also makes it
easier to use parser information in lexical analysis (e.g., the
famous "typedef" vs. "id" issue in C parsers).
(6) It's very small -- the entire implementation (core engine and Lua
interface) is only 2000 lines of C.
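
[To make points (3) to (5) concrete, here is a rough sketch of the general PEG style in Python. It is not LPeg's implementation or its API: every name below (lit, rx, seq, choice, many, when, ref, token) is made up for illustration. It shows ordered choice and greedy repetition, whitespace handled inside the grammar rather than by a separate lexer, and a hook where arbitrary code decides whether a match succeeds, applied to the "typedef" vs. "id" case.]

```python
import re

# A pattern is a function (text, pos) -> new position, or None on failure.

def lit(s):
    """Match the literal string s."""
    return lambda text, pos: pos + len(s) if text.startswith(s, pos) else None

def rx(pattern):
    """Match a regular expression at pos (handy for token-level rules)."""
    compiled = re.compile(pattern)
    def match(text, pos):
        m = compiled.match(text, pos)
        return m.end() if m else None
    return match

def seq(*patterns):
    """Match each pattern in turn; fail if any of them fails."""
    def match(text, pos):
        for p in patterns:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return match

def choice(*patterns):
    """Ordered choice: try alternatives in order, take the first success."""
    def match(text, pos):
        for p in patterns:
            end = p(text, pos)
            if end is not None:
                return end
        return None
    return match

def many(p):
    """Greedy repetition: match p zero or more times."""
    def match(text, pos):
        while True:
            end = p(text, pos)
            if end is None or end == pos:
                return pos
            pos = end
    return match

def when(p, predicate):
    """Match p, then let arbitrary code accept or reject the matched text."""
    def match(text, pos):
        end = p(text, pos)
        if end is not None and predicate(text[pos:end]):
            return end
        return None
    return match

def ref(get):
    """Late-bound reference, so that rules can be recursive."""
    return lambda text, pos: get()(text, pos)

# Lexical analysis lives in the grammar: a token is a pattern plus
# any trailing whitespace.
ws = rx(r"[ \t]*")
def token(p):
    return seq(p, ws)

number = token(rx(r"[0-9]+"))
factor = choice(number,
                seq(token(lit("(")), ref(lambda: expr), token(lit(")"))))
term = seq(factor, many(seq(token(rx(r"[*/]")), factor)))
expr = seq(term, many(seq(token(rx(r"[+-]")), term)))

def parses(text):
    """True if the whole of text is an arithmetic expression."""
    return seq(ws, expr)(text, 0) == len(text)

# The "typedef" vs. "id" issue: the parser's own table of known type
# names decides whether an identifier counts as a type name.
typedef_names = {"size_t"}
ident = rx(r"[A-Za-z_]\w*")
type_name = when(ident, lambda name: name in typedef_names)
declaration = seq(token(type_name), token(ident), lit(";"))
```

[With these definitions, parses("1 + 2 * (3 - 4)") succeeds and parses("1 + ") does not; declaration matches "size_t n;" but rejects "foo n;" until "foo" is added to typedef_names.]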
[The standard way to use LPeg in Lua uses Lua's ability to easily
overload standard operators, giving them LPeg-specific meanings when
invoked on first-class "pattern" objects. That can't be done in elisp,
but I think a more lispy approach should be easy.]
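
[As an illustration of the two styles, here is a small Python sketch; it is not LPeg code and not a proposed elisp API. The Pattern class mimics the operator-overloading style, using the same operator meanings as LPeg (`*` for sequence, `+` for ordered choice, `^n`, spelled `**` in Python, for at least n repetitions). The compile_form function is a hypothetical "lispy" alternative that builds the same patterns from nested data, with made-up operator names "seq", "or", "plus" and "set".]

```python
import functools

class Pattern:
    """A first-class pattern; fn(text, pos) returns a new position or None."""

    def __init__(self, fn):
        self.fn = fn

    def match(self, text, pos=0):
        return self.fn(text, pos)

    def __mul__(self, other):
        # Sequence: self followed by other.
        def fn(text, pos):
            mid = self.fn(text, pos)
            return None if mid is None else other.fn(text, mid)
        return Pattern(fn)

    def __add__(self, other):
        # Ordered choice: try self first, then other.
        def fn(text, pos):
            end = self.fn(text, pos)
            return end if end is not None else other.fn(text, pos)
        return Pattern(fn)

    def __pow__(self, n):
        # Greedy repetition: at least n matches of self.
        def fn(text, pos):
            count = 0
            while True:
                end = self.fn(text, pos)
                if end is None:
                    break
                count += 1
                if end == pos:
                    break  # no progress; stop rather than loop forever
                pos = end
            return pos if count >= n else None
        return Pattern(fn)

def P(s):
    """Pattern matching the literal string s."""
    return Pattern(
        lambda text, pos: pos + len(s) if text.startswith(s, pos) else None)

def S(chars):
    """Pattern matching any single character in chars."""
    return Pattern(
        lambda text, pos:
            pos + 1 if pos < len(text) and text[pos] in chars else None)

# Operator style: digits, optionally followed by "." and more digits.
digits = S("0123456789") ** 1
number = digits * (P(".") * digits + P(""))

def compile_form(form):
    """Build a Pattern from nested tuples, in the manner of an s-expression."""
    if isinstance(form, str):
        return P(form)
    op, *args = form
    if op == "set":
        return S(args[0])
    parts = [compile_form(arg) for arg in args]
    if op == "seq":
        return functools.reduce(lambda a, b: a * b, parts)
    if op == "or":
        return functools.reduce(lambda a, b: a + b, parts)
    if op == "plus":
        return parts[0] ** 1
    raise ValueError("unknown operator: %r" % (op,))

# Data style: the same grammar written as nested data.
lispy_number = compile_form(
    ("seq", ("plus", ("set", "0123456789")),
            ("or", ("seq", ".", ("plus", ("set", "0123456789"))), "")))
```

[Both number.match("3.14") and lispy_number.match("3.14") consume the whole string. The data form needs no operator overloading, which is roughly what a lisp version would rely on.]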
[1] http://www.inf.puc-rio.br/~roberto/lpeg/lpeg.html
[2] http://www.inf.puc-rio.br/~roberto/docs/peg.pdf
-Miles
--
Zeal, n. A certain nervous disorder afflicting the young and inexperienced.