unofficial mirror of emacs-devel@gnu.org 
From: Xah Lee <xahlee@gmail.com>
To: Miles Bader <miles@gnu.org>
Cc: emacs-devel@gnu.org
Subject: Re: "Font-lock is limited to text matching" is a myth
Date: Wed, 12 Aug 2009 04:28:51 -0700
Message-ID: <aa6b5cbe0908120428n48156bc0l76c6335c2814dadf@mail.gmail.com>
In-Reply-To: <buows59zg5d.fsf@dhlpc061.dev.necel.com>

i very much second this! PEG's the next level of regex, and i expect it to
replace regex in some sense in the coming years for the whole field of text
processing.

there are currently 2 PEG implementations in elisp as far as i know:

 * http://www.emacswiki.org/cgi-bin/wiki/ParserCompiler (2008) by Mike
Mattie.

 * http://www.emacswiki.org/emacs/ParsingExpressionGrammars (2008) by Helmut
Eller.

it'd be much better if PEG were integrated from the ground up in elisp,
possibly implemented in C or built on other libs for speed. I imagine functions
that take a regex could have a version that takes a PEG.

  Xah

On Tue, Aug 11, 2009 at 11:43 PM, Miles Bader <miles@gnu.org> wrote:

> "Eric M. Ludlam" <eric@siege-engine.com> writes:
> > As far as how to define tables for a parsing system written in C, an
> > old-school solution is to just use the flex/bison engines under the
> > Emacs Lisp API.  There are a lot of new parser generator systems
> > though, and I don't really know what the best one might be.
> >
> > One of the hairier parts of the CEDET parser is the lexical analyzer.
>
> Slightly off-topic, but I'm a huge fan of "LPeg" [1], which is a
> pattern-matching library for Lua, based on Parsing Expression Grammars
> (PEGs).
>
> I've always wished for something like LPeg in elisp, and since Lua is at
> heart quite lisp-like (despite the very different syntax), I think it
> could work very well.  Maybe it wouldn't be too hard to adapt LPeg's
> core to elisp (it's licensed under the BSD license).
>
> [There's a popular implementation technique for PEGs called "packrat
> parsers", and many PEG libraries use that technique -- however
> apparently packrat parsers have some serious problems in practice, so
> LPeg uses a different technique.  See [2] for a discussion of this, and
> of the LPeg implementation in detail.]
>
> Some nice things about LPeg:
>
>  (1) It's very fast.
>
>  (2) It's very concise; for typical usage, it's essentially like
>      writing a parser in yacc or whatever.
>
>  (3) It makes it trivial to insert code and hooks at any point in the
>      parse; not just "actions", but code that can determine how the
>      parsing happens.  This gives a _huge_ amount of flexibility.
>
>  (4) It's very easy to "think about", despite the flexibility and
>      presence of arbitrary code driving parsing, because it works kind
>      of like a recursive descent parser, operating greedily (but
>      provides mechanisms to do automatic backtracking when necessary).
>
>  (5) Because it's so fast and flexible, typical practice is to _not_
>      have a separate lexical analyzer, but just do lexical analysis in
>      the parser.  This is easier and more convenient, and also makes it
>      easier to use parser information in lexical analysis (e.g., the
>      famous "typedef" vs. "id" issue in C parsers).
>
>  (6) It's very small -- the entire implementation (core engine and Lua
>      interface) is only 2000 lines of C.
>
> [The standard way to use LPeg in Lua uses Lua's ability to easily
> overload standard operators, giving them LPeg-specific meanings when
> invoked on first-class "pattern" objects.  That can't be done in elisp,
> but I think a more lispy approach should be easy.]
>
> [1] http://www.inf.puc-rio.br/~roberto/lpeg/lpeg.html
>
> [2] http://www.inf.puc-rio.br/~roberto/docs/peg.pdf
>
> -Miles
>
> --
> Zeal, n. A certain nervous disorder afflicting the young and inexperienced.
>
>
>
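
To make the "greedy, recursive-descent-like" behaviour in points (4) and (5)
above concrete, here is a minimal sketch of PEG-style combinators in Python.
It is not LPeg's API or implementation; the names lit, charset, seq, choice,
many and full_match are hypothetical, chosen only for this illustration. It
shows ordered choice (the first alternative that matches wins), greedy
repetition that never gives back input, and whitespace handled inside the
grammar rather than by a separate lexer.

```python
from typing import Callable, Optional

# A pattern takes (text, pos) and returns the end position on success,
# or None on failure. All names here are hypothetical, not LPeg's.
Pattern = Callable[[str, int], Optional[int]]

def lit(s: str) -> Pattern:
    """Match the literal string s."""
    def match(text: str, pos: int) -> Optional[int]:
        return pos + len(s) if text.startswith(s, pos) else None
    return match

def charset(chars: str) -> Pattern:
    """Match one character drawn from chars."""
    def match(text: str, pos: int) -> Optional[int]:
        return pos + 1 if pos < len(text) and text[pos] in chars else None
    return match

def seq(*pats: Pattern) -> Pattern:
    """Match each pattern in turn; fail as soon as one fails."""
    def match(text: str, pos: int) -> Optional[int]:
        for p in pats:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return match

def choice(*pats: Pattern) -> Pattern:
    """Ordered choice: try alternatives left to right from the same
    position and commit to the first one that succeeds."""
    def match(text: str, pos: int) -> Optional[int]:
        for p in pats:
            end = p(text, pos)
            if end is not None:
                return end
        return None
    return match

def many(p: Pattern) -> Pattern:
    """Greedy zero-or-more repetition; it never gives input back."""
    def match(text: str, pos: int) -> Optional[int]:
        while True:
            end = p(text, pos)
            if end is None or end == pos:
                return pos
            pos = end
    return match

def full_match(p: Pattern, text: str) -> bool:
    """True if p consumes the whole of text."""
    return p(text, 0) == len(text)

# Grammar for a comma-separated list of integers. Whitespace is skipped
# by the grammar itself, so no separate lexical analyzer is needed.
digit = charset("0123456789")
spaces = many(charset(" \t"))
number = seq(digit, many(digit), spaces)
comma = seq(lit(","), spaces)
number_list = seq(spaces, number, many(seq(comma, number)))

print(full_match(number_list, "1, 22 ,333"))
print(full_match(number_list, "1,,2"))
# Order matters: "a" is tried first and wins, so "ab" is not fully consumed.
print(full_match(choice(lit("a"), lit("ab")), "ab"))
# Greedy repetition eats every "a", leaving nothing for the final lit("a").
print(full_match(seq(many(lit("a")), lit("a")), "aaa"))
```

The last two cases are where this differs from a typical backtracking regex
engine: a regex like a|ab or a*a would still find a full match, while the PEG
operators commit to the first alternative and to the longest repetition. A
lispy elisp version could presumably express the same grammar as nested
s-expressions instead of function calls, but that is speculation on top of
what the thread says.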
