all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: Lynn Winebarger <owinebar@gmail.com>
To: Alexandr Karbivnichyi <ambulajan@gmail.com>
Cc: Eric Ludlam <ericludlam@gmail.com>, emacs-devel <emacs-devel@gnu.org>
Subject: Re: Using the wisent parser-generator, as it creates faster parsers
Date: Tue, 27 Dec 2022 22:22:37 -0500	[thread overview]
Message-ID: <CAM=F=bBACw+gv_yEtiUWoWDgZq9N6zM9Zh+B59UdJCxW=xpK-w@mail.gmail.com> (raw)
In-Reply-To: <27990365a2b32aaa3f76fe8f0922a8481d953ef6.camel@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 2173 bytes --]

On Mon, Dec 26, 2022, 1:13 PM Alexandr Karbivnichyi <ambulajan@gmail.com>
wrote:

> On Mon, 2022-12-26 at 08:54 -0500, Lynn Winebarger wrote:
> > On Sun, Dec 25, 2022, 11:03 PM <ambulajan@gmail.com> wrote:
> >
> > I don't agree - conflicts detected by the parser generator indicate
> > that distinct ASTs map to the same text, at least according to the
> > reduction method being used.  I like using LR derivations (including
> > ones produced by LALR) because of the bottom-up recognition of
> > grammar symbols.
>
> True for LALR parsers. Tried-and-tested system, for industrial
> compilers built on a formal standard with simple grammar that doesn't
> change often.
>
I suppose it depends on the tools you are using.  I tend to roll my own
implementation of the LALR(1) construction of DeRemer and Penello, then use
the relations they define to determine how strings that will be ambiguously
parsed by the grammar. Then that ambiguity can be resolved. It's not that
different from what you describe doing at run-time with GLR, just
performing the analysis ahead of time (on LR(0) states, aka item sets) and
encoding the disambiguation logic into the grammar itself.  The
DeRemer-Penello construction is just a control-flow analysis.  I'm not
familiar with any work that uses it to automatically construct test cases
that illustrate ambiguities in grammars, but it's surprising if there isn't
given the tools for test case generation for programs in general purpose
programming languages.
There's then the second question of how to map productions of the
disambiguated grammar back to an AST data structure that is much
"flatter".  Usually the grammar for that AST data structure is the origin
of ambiguities in the concrete syntax.

It's a problem in noncommutative algebra.  The ambiguous grammar presumably
specifies the language (strings of terminal sbols or tokens) being
recognized correctly.  A CFG is a system of equations, and a parse is a
particular solution to those equations for a given element in that
language.  It's just a matter of finding an equivalent system of equations
such that every solution is unique.  At least, that formulation helps me.

Lynn

[-- Attachment #2: Type: text/html, Size: 2843 bytes --]

      reply	other threads:[~2022-12-28  3:22 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-26  4:02 Using the wisent parser-generator, as it creates faster parsers ambulajan
2022-12-26 13:54 ` Lynn Winebarger
2022-12-26 18:13   ` Alexandr Karbivnichyi
2022-12-28  3:22     ` Lynn Winebarger [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAM=F=bBACw+gv_yEtiUWoWDgZq9N6zM9Zh+B59UdJCxW=xpK-w@mail.gmail.com' \
    --to=owinebar@gmail.com \
    --cc=ambulajan@gmail.com \
    --cc=emacs-devel@gnu.org \
    --cc=ericludlam@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.