all messages for Emacs-related lists mirrored at yhetil.org
From: Lynn Winebarger <owinebar@gmail.com>
To: Yuan Fu <casouri@gmail.com>
Cc: "Augustin Chéneau (BTuin)" <btuin@mailo.com>, emacs-devel@gnu.org
Subject: Re: Questions about tree-sitter
Date: Wed, 6 Sep 2023 12:11:24 -0400
Message-ID: <CAM=F=bAwuhmzysqQUVYUuMDo1mq=K2O4BiZm-pOh+LYjJF774A@mail.gmail.com>
In-Reply-To: <C3EFD02D-F02F-4BE8-A6F4-A2506A9EFC90@gmail.com>

On Wed, Aug 30, 2023 at 3:03 AM Yuan Fu <casouri@gmail.com> wrote:
> > On Aug 29, 2023, at 2:26 PM, Augustin Chéneau (BTuin) <btuin@mailo.com> wrote:
> > I have a few questions about tree-sitter.
> >
> > I'm currently developing a grammar for GNU Bison alongside a tree-sitter
> > major mode, it's a work in progress.  The grammar is here:
> > <https://gitlab.com/btuin2/tree-sitter-bison>, still incomplete but so
> > far able to parse simple files, and the major mode prototype is
> > attached to this message.
> >
> > So, the questions:
> >
> > 1. Is there a way to reload a grammar?
> >
> > Emacs is pretty nice as a playground for testing grammars, but once a
> > grammar is loaded, it won't be loaded again until Emacs restarts (as far
> > as I know).
> > Is it possible to reload a grammar after modifying it?
>
> No, and it’s probably not easy to implement either, since unloading the grammar would require Emacs to purge or invalidate all the nodes, queries, and parsers using that grammar.

From reviewing some generated "parser.c" files and some of the
available documentation, it appears that parser.c basically provides
two things: a lexing function that follows a fixed protocol for
producing and consuming a standard lexer state data structure, and an
LR(1) parse table suitable for GLR parsing (i.e. one that allows
ambiguous actions).  These, together with the definitions of the
tokens and grammar symbols, are bundled into a language structure
that is passed to the tree-sitter library.
LALR(1) tables are essentially simplified/compressed LR(1) tables, and
Emacs has code to calculate such tables directly in elisp.
Therefore, given functionality to translate elisp data into the raw C
structures, we should be able to create language data structures
dynamically and pass them to the tree-sitter library to create a
parser.
To avoid going through a C compiler entirely, we would also need a
table-driven lexer framework in place of the generated lexer in the C
file.
The other novel features of tree-sitter parsers appear to be
implemented in the parser runtime, not in the table calculation.
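
To make the table-driven lexer idea concrete, here is a minimal sketch.
It is written in Python purely for illustration (not elisp or C), and
every name in it (char_class, TRANSITIONS, ACCEPTING, tokenize) is
hypothetical.  It does not follow tree-sitter's actual lexer protocol;
it only shows that the lexer's behaviour can live entirely in data
tables that are built at run time, with a small generic driver doing
longest-match scanning over them.

```python
# Hypothetical sketch: a lexer whose behaviour is defined by data tables
# rather than by generated code.  Not tree-sitter's real lexer interface.

def char_class(ch):
    """Map a character to a coarse class used as a table key."""
    if ch.isdigit():
        return "digit"
    if ch.isalpha() or ch == "_":
        return "alpha"
    if ch.isspace():
        return "space"
    return ch  # punctuation is its own class


# DFA transitions: (state, character class) -> next state.
TRANSITIONS = {
    ("start", "digit"): "number",
    ("start", "alpha"): "ident",
    ("start", "space"): "ws",
    ("start", "+"): "plus",
    ("number", "digit"): "number",
    ("ident", "alpha"): "ident",
    ("ident", "digit"): "ident",
    ("ws", "space"): "ws",
}

# Accepting states and the token kind they produce (None = skip).
ACCEPTING = {"number": "NUMBER", "ident": "IDENT", "plus": "PLUS", "ws": None}


def tokenize(text):
    """Generic driver: longest-match scan over the tables above."""
    tokens = []
    pos = 0
    while pos < len(text):
        state = "start"
        i = pos
        last_accept = None
        while i < len(text):
            nxt = TRANSITIONS.get((state, char_class(text[i])))
            if nxt is None:
                break
            state = nxt
            i += 1
            if state in ACCEPTING:
                # Remember the longest accepting prefix seen so far.
                last_accept = (i, ACCEPTING[state])
        if last_accept is None:
            raise ValueError(
                f"unexpected character {text[pos]!r} at offset {pos}")
        end, kind = last_accept
        if kind is not None:
            tokens.append((kind, text[pos:end]))
        pos = end
    return tokens


if __name__ == "__main__":
    print(tokenize("x1 + 42"))
```

Because the driver never changes, swapping in a new grammar's lexer
would only mean replacing the two tables, which is the property that
would let a grammar be rebuilt without a C compiler.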

I've implemented LALR(1) parser generators two or three times in the
last couple of decades, so this might be a fun project for me while I
am unambiguously able to contribute to GNU Emacs.

Regards,
Lynn


