hi,

i find that logic programs together with memoization and auto cut ideoms
helps in making powerful parsers, really fun at least. I implemented a
python parser
using those and you my find some docs in subsections inside the guile-log
framework. I will try to make the parser combinators up to date in a few
weaks. To see
it in action you can glean on
https://gitorious.org/python-on-guile/python-on-guile/source/e32e72dfa7a09c4a791e49d816d52c483d12e5f6:modules/language/python/parser.scm
you my think that it will backtrack horribly but thanks to memoization it
will not and gives the AST quite quickly.

Oh, guile-log is based on C code that pretty restrictive in the kind of cpu
you can use, basically only amd64 atm. That can be changed but there is
really not much interest
in this software with more than a nicel little playground for this little
schemer.

Cheers!

On Mon, Sep 15, 2014 at 7:36 PM, Ian Grant <ian.a.n.grant@googlemail.com>
wrote:

> LALR parser generators are not the last word in parsing technology. Many
> unambiguous context-free languages are not well suited to LALR parsing. The
> C language is an exception: the C grammar from the ISO standard can be
> typed almost verbatim into a YACC/Bison file, and it will produce just the
> one well-documented shift/reduce conflict in the IF ELSE statement, which
> is fortuitously resolved the right way. But this is achieved at some cost,
> because the C grammar does not actually constrain the parsed statements to
> those which are well-formed. In addition to the grammar, the language
> definition includes a slew of extra conditions which need to be checked.
> These checks must be coded into a post-parsing phase, because they are not
> things that the LALR tables can encode.
>
> Important examples of grammars which are not LALR parsable include ASN.1
> and Standard ML. When Claudio Russo added higher order functors to Moscow
> ML, he apparently wasn't able to figure out how to get the LALR parser to
> parse the grammar for the language he'd invented. So the Moscow ML parser
> produces 37 shift/reduce conflicts. I don't think anyone actually knows how
> those higher order declarations are parsed in practice.
>
> But I think the problem of parsing has actually been solved. Tom Ridge's
> "p3" combinator parsers with oracles will parse any unambiguous CFG in
> polynomial time complexity at most O^3, or possibly O^2, I need to check,
> in the length of the input. The advantage of combinator parsers is that
> they parse top-down, in a way which follows the natural structure of the
> grammar. So errors can be explained to the user (and the developer!) in
> terms of the actual derivations (parsing decisions) taken on the input,
> This is not the case with bottom-up parsers such as LALR.
>
> It would be really good to have a p3-like parser generator working in
> Guile. The only technical part is integrating the oracle with the
> combinator parsers. Ridge's example uses an Earley parser, but he points
> out that any parser generator could be used as the oracle. So perhaps LALR
> parsers would do. The grammar the oracles need to be able to parse is much
> simpler than the general CFG's the combinators work on: the oracles are
> just there to make the decisions about where to split the left-recursive
> productions. But if the LALR oracle doesn't work out, then it is easy to
> write an Early parser in scheme. See
> http://www.cavar.me/damir/charty/scheme/ which is LGPLed.
>
> See Ridge's papers at http://www.tom-ridge.com/parsing.html
>
> Again, one cold implement the recursive parser combinators in
> scheme-generated lightning. If this were done abstractly then Guile could
> generate fast machine code parsers for any host language.
>
> Ian
>
>
>