Re: Parsing - Ian Grant

unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed

From: Ian Grant <ian.a.n.grant@googlemail.com>
To: Stefan Israelsson Tampe <stefan.itampe@gmail.com>
Cc: guile-devel <guile-devel@gnu.org>
Subject: Re: Parsing
Date: Mon, 15 Sep 2014 15:59:34 -0400	[thread overview]
Message-ID: <CAKFjmdyWTMwcmGwGejEoAfUtM5=mN2heHewGRH=N18rzJPuBjg@mail.gmail.com> (raw)
In-Reply-To: <CAGua6m25p9ufeZYi_3Z9WeUZDiQ_mADh-zik=U-4SHD-5xchQA@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 3220 bytes --]

On Mon, Sep 15, 2014 at 2:29 PM, Stefan Israelsson Tampe <
stefan.itampe@gmail.com> wrote:

> i find that logic programs together with memoization and auto cut ideoms
> helps in making powerful parsers, really fun at least.
>

You should definitely look at Tom's paper then, the "director's cut" is
here:

   http://www.tom-ridge.com/doc/ridge14sle_extended.pdf

He explicitly mentions the analogy with cut, and the parsers are memoised.
He's a theorem prover by trade, and he started this by producing
machine-checked formal proofs in HOL4 of the soundness and completeness of
combinator parsers,

I implemented a python parser
> using those [...] you my think that it will backtrack horribly but thanks
> to memoization it will not and gives the AST quite quickly.
>

No, I believe you! And you should see what you can do with ambiguous
grammars. Ridge gives some crazy Ackermann-like numbers. The parser's cache
can be used as a compact representation of the otherwise hyper-exponential
size of the ASTs an ambiguous grammar generates.

"For example, for input length 19,there are 441152315040444150 parse trees,
but as with most exponential behaviours it is not feasible to actually
compute all these parse trees. Now let us consider the following parser,
which is identical except that it computes (in all possible ways) the
length of the input parsed [...] Naively we might expect that this also
exhibits exponential behaviour, since presumably the parse trees must all
be generated, and the actions applied. This expectation is wrong. Running
this example parser on an input of size 19 returns in 0.02 seconds with a
single result 19. For an input of size 100, this parser returns a single
result 100 in 5 seconds, and over a range of inputs this parser exhibits
polynomial behaviour rather than exponential behaviour"

This amused me, because I implemented the earlier (pre-oracle) parser he
described in Standard ML, and I did something which I thought later was a
bit stupid: I executed all the semantic actions _during the parse_ and
cached the semantic results in the "memo-pad" along with the syntactic
derivation. It wasn't so slow, because of the memoization. But I am
wondering now whether there is not some application to proof-search that
one could use this technique on. A parse is a derivation of a kind, so with
the right kind of grammar it could be a derivation of some sort of proof,
and if that were an ambiguous grammar then it could explore
non-deterministic relations, like a Prolog interpreter does.

I am talking out of my hat again, but maybe someone who has a clearer idea
of these things will see a spark amongst all the smoke. But I hope no-one
would spend a year or something looking for a way to prove P=NP because of
what I'd said. (Unless they actually proved P=NP, which would be very cool!)

Oh, guile-log is based on C code that pretty restrictive in the kind of cpu
> you can use, basically only amd64 atm. That can be changed but there is
> really not much interest
> in this software with more than a nicel little playground for this little
> schemer.
>

Is it a lot of C? Maybe you could replace that with lightning assembler,
and then it would run on a lot of hardware?

Ian

[-- Attachment #2: Type: text/html, Size: 4178 bytes --]

next prev parent reply	other threads:[~2014-09-15 19:59 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-09-15 17:36 Parsing Ian Grant
2014-09-15 18:29 ` Parsing Stefan Israelsson Tampe
2014-09-15 19:59   ` Ian Grant [this message]
2014-09-15 20:20     ` Parsing Ian Grant
2014-09-15 20:38     ` Parsing Stefan Israelsson Tampe
2014-09-17  0:10       ` Parsing Ian Grant
2014-09-15 20:45     ` Parsing Ian Grant
2014-10-09 23:20 ` Parsing Mark H Weaver

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/guile/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAKFjmdyWTMwcmGwGejEoAfUtM5=mN2heHewGRH=N18rzJPuBjg@mail.gmail.com' \
    --to=ian.a.n.grant@googlemail.com \
    --cc=guile-devel@gnu.org \
    --cc=stefan.itampe@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).