unofficial mirror of emacs-devel@gnu.org 
From: Stephen Leake <stephen_leake@stephe-leake.org>
To: haj@posteo.de (Harald Jörg)
Cc: Stefan Monnier <monnier@iro.umontreal.ca>, emacs-devel@gnu.org
Subject: Re: Handling extensions of programming languages
Date: Tue, 30 Mar 2021 11:41:11 -0700
Message-ID: <86ft0c4h88.fsf@stephe-leake.org>
In-Reply-To: <87blbc33tm.fsf@hajtower> ("Harald Jörg"'s message of "Sun, 21 Mar 2021 16:48:53 +0100")

haj@posteo.de (Harald Jörg) writes:

>> For indentation, it's fundamentally harder (for the same reason that
>> combining two LALR grammars doesn't necessarily give you an LALR
>> grammar), so it will have to be done in a somewhat ad-hoc way.
>
> Indeed.  Indentation needs more "context".

The GNU ELPA package 'wisi' provides a way to declare indentation in the
grammar as actions; that provides all the context needed.

The wisi parsers also have excellent error correction, so the grammar
actions operate on a complete syntax tree (or fail utterly when the
input is really bad).

I have not tried to use wisi for Perl; it works for Ada and Java.

This does not address your issue of extending a language with new
syntax; as far as wisi is concerned, that is a new language, and needs
an entirely new grammar file. This is true for any LR parser.
It may not be true for a packrat parser, although the base parser would
have to provide hooks in each nonterminal parsing routine.
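
To make the packrat idea concrete, here is a minimal sketch in Python of
a memoizing recursive-descent parser in which every nonterminal is
looked up in a table, so an extension can add alternatives without
regenerating anything. This is not wisi code; the names Grammar, define,
extend, and the toy 'print'/'say' statements are all made up for
illustration.

```python
import re


class Grammar:
    """Toy packrat grammar: nonterminal name -> ordered list of alternatives."""

    def __init__(self):
        self.rules = {}

    def define(self, name, alternative):
        # Base grammar: alternatives are tried in the order they are defined.
        self.rules.setdefault(name, []).append(alternative)

    def extend(self, name, alternative):
        # The hook: an extension alternative is tried before the base ones.
        self.rules.setdefault(name, []).insert(0, alternative)

    def parse(self, start, text):
        memo = {}  # (nonterminal, position) -> result; this is the packrat part

        def apply(name, pos):
            key = (name, pos)
            if key not in memo:
                memo[key] = None  # makes left recursion fail instead of looping
                for alternative in self.rules[name]:
                    result = alternative(apply, text, pos)
                    if result is not None:
                        memo[key] = result
                        break
            return memo[key]

        result = apply(start, 0)
        if result is None or result[1] != len(text):
            return None
        return result[0]


def token(pattern):
    """Return a matcher that skips whitespace, then matches PATTERN."""
    regex = re.compile(r"\s*(" + pattern + r")")

    def match(text, pos):
        m = regex.match(text, pos)
        return (m.group(1), m.end()) if m else None

    return match


NUMBER = token(r"\d+")
SEMI = token(r";")


def number_expr(apply, text, pos):
    # expr <- NUMBER
    n = NUMBER(text, pos)
    return (int(n[0]), n[1]) if n else None


def keyword_stmt(keyword):
    # stmt <- KEYWORD expr ';'
    kw = token(keyword + r"\b")

    def alternative(apply, text, pos):
        k = kw(text, pos)
        if k is None:
            return None
        e = apply("expr", k[1])  # nested nonterminals go through the table
        if e is None:
            return None
        s = SEMI(text, e[1])
        if s is None:
            return None
        return ((keyword, e[0]), s[1])

    return alternative


def base_grammar():
    g = Grammar()
    g.define("expr", number_expr)
    g.define("stmt", keyword_stmt("print"))
    return g


base = base_grammar()
extended = base_grammar()
extended.extend("stmt", keyword_stmt("say"))  # hypothetical language extension
```

With this, base.parse("stmt", "say 42;") fails while
extended.parse("stmt", "say 42;") succeeds, and both accept
"print 42;". An LR parser has no equivalent of extend: its tables are
computed from the whole grammar at generation time.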

In wisi, it might be possible to extend the grammar file syntax with
something like:

#base_grammar <grammar file>

but it would still generate separate parsers for the base and extended
languages.
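
As a rough sketch of what such a directive could mean, assuming it is
resolved by plain textual inclusion before the parser generator runs
(wisi does not implement this; the rule syntax and file names below are
invented for the example):

```python
def expand_base_grammar(name, read_file, _seen=()):
    """Return the lines of grammar NAME, with each '#base_grammar' line
    replaced by the recursively expanded lines of the file it names.

    READ_FILE is any callable mapping a file name to its text; the
    directive and its semantics are hypothetical.
    """
    if name in _seen:
        chain = " -> ".join(_seen + (name,))
        raise ValueError("circular #base_grammar chain: " + chain)
    lines = []
    for line in read_file(name).splitlines():
        if line.startswith("#base_grammar"):
            base_name = line[len("#base_grammar"):].strip()
            lines.extend(
                expand_base_grammar(base_name, read_file, _seen + (name,))
            )
        else:
            lines.append(line)
    return lines


# In-memory stand-ins for two grammar files (made-up rule syntax).
files = {
    "base.grammar": "stmt : PRINT expr SEMI\nexpr : NUMBER",
    "extended.grammar": "#base_grammar base.grammar\nstmt : SAY expr SEMI",
}

base_rules = expand_base_grammar("base.grammar", files.__getitem__)
extended_rules = expand_base_grammar("extended.grammar", files.__getitem__)
```

Here extended_rules is the base rule list plus one new production. The
generator would still be run once on each expanded list, which is why
the result is two separate parsers rather than one parser with an
add-on.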

As long as the extended language is a superset of the base language, it
mostly doesn't hurt to always use the extended language parser. The
ada-mode parser implements a language that is an extension of standard
Ada 2012; that reduces conflicts and simplifies specifying indentation.

One downside of using an extended parser: it will not report syntax
errors for extended syntax in a file that is not supposed to contain
any. For ada-mode this is not a significant problem; the extensions
allow things that no Ada programmer would write even by mistake, and the
real compiler catches them soon enough.

> And as for indentation...  I'd say the code in both modes needs to catch
> up with current perl before we consider extensions.  Maybe they could
> share functions or regular expressions how to find the beginning of a
> function, or how to identify closing braces which terminate a statement:
> The specification for this logic comes from Perl and should be the same
> for both modes.

The reason I started the wisi package and WisiToken parser generator was
to migrate ada-mode away from ad-hoc code to grammar-based code, to
support Ada 2012. To work well, the parser needs to be error-correcting.
SMIE is inherently more error-tolerant than an LR parser without error
correction, but I doubt it's good enough for indentation.

-- 
-- Stephe



