unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
From: Lynn Winebarger <owinebar@gmail.com>
To: Stephen Leake <stephen_leake@stephe-leake.org>
Cc: Gregory Heytings <gregory@heytings.org>,
	Philip Kaludercic <philipk@posteo.net>,
	 Yuan Fu <casouri@gmail.com>,
	Stefan Monnier <monnier@iro.umontreal.ca>,
	 Eli Zaretskii <eliz@gnu.org>, Dmitry Gutov <dgutov@yandex.ru>,
	Tim Cross <theophilusx@gmail.com>,
	emacs-devel <emacs-devel@gnu.org>
Subject: Re: [SPAM UNSURE] Re: Tree-sitter introduction documentation
Date: Thu, 29 Dec 2022 17:37:45 -0500	[thread overview]
Message-ID: <CAM=F=bAVpdcZV7uhTRkhih17SOea6JG+oegjCkZPy0LesRae=A@mail.gmail.com> (raw)
In-Reply-To: <86sfgxq3ph.fsf@stephe-leake.org>

[-- Attachment #1: Type: text/plain, Size: 2837 bytes --]

On Thu, Dec 29, 2022, 4:50 PM Stephen Leake <stephen_leake@stephe-leake.org>
wrote:

> Lynn Winebarger <owinebar@gmail.com> writes:
>
> > On Thu, Dec 29, 2022 at 10:28 AM Gregory Heytings <gregory@heytings.org>
> wrote:
> >> > Do you know how strong the dependency on node is?  As I said before,
> it
> >> > seems that it is possible to evaluate the grammar files that use the
> DSL
> >> > using something like quickjs as well
> >> >
> >>
> >> That's not possible, no, at least not without a lot of complications
> that
> >> do not seem worth the price, compared to installing Node.js.  And note
> >> that even if that were feasible, it would only solve the first half of
> the
> >> problem: to transform a grammar.js file into its corresponding parser.c
> >> file, you also need the tree-sitter command line program.
> >
> > Maybe a better question is - is it possible to adapt the semantic
> > parser generators (or others in emacs) to create the ".c" files for
> > use with libtreesitter?
>
> This is possible in principle; I've thought about doing it with the
> wisitoken parser generator.
>
> However, the format/struture/details of the output is not documented,
> and may change in future tree-sitter releases.
>


True, *but* each parser has as an "ABI version" constant encoded into it,
and the source states the library is supposed to be backwards compatible
(currently 14, the the grammar.c files I saw are at 13).  So if we get
something that complies with version 13, it should be fine.

I've looked over a couple of the parser.c files, and they appear to be
pretty standard implementations of.LR stack automata. The CLI tool supports
ambiguous grammars, but if you start from one an existing LR-type of parser
generator (bison or wisent, say) can handle, then that piece should be
relatively straightforward.
The novel feature appears to be the automatic derivation of a "flattened"
AST data structure, which the library uses to construct ASTs for a given
action.

I did come up with my own calculation of a flattened AST structure a year
or two ago, though I wasn't thrilled with some of the results I got -
automatically picking a symbol to use to "break" recursive inclusion by
reverting back to a pointer was tricky.  I believe I went with a heuristic
of converting the symbol that had the most incoming edges to a reference in
each place it appeared as a field, then repeated until there were no data
structures that were required to physically contain themselves.  I used
that rule because it seemed like it would minimize the number of symbols
not allowed to appear as fields, thus maximizing the flattening effect (too
well, in fact - I had to introduce some constraints to keep some of the
nonlinear structure).

So it might be interesting to go through tree-sitters' algorithm to see
what they came up with.

Lynn

[-- Attachment #2: Type: text/html, Size: 3790 bytes --]

  reply	other threads:[~2022-12-29 22:37 UTC|newest]

Thread overview: 133+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-16 14:47 Tree-sitter introduction documentation Perry Smith
2022-12-16 15:06 ` Eli Zaretskii
2022-12-16 15:24   ` João Távora
2022-12-16 15:36     ` Perry Smith
2022-12-16 15:43       ` João Távora
2022-12-16 17:56       ` Philip Kaludercic
2022-12-16 15:38     ` Eli Zaretskii
2022-12-16 15:48       ` João Távora
2022-12-16 15:53         ` Perry Smith
2022-12-16 16:02           ` João Távora
2022-12-18  9:59             ` Eli Zaretskii
2022-12-18 14:07               ` Perry Smith
2022-12-18 17:18                 ` Eli Zaretskii
2022-12-16 16:34         ` Eli Zaretskii
2022-12-17  0:03           ` Tim Cross
2022-12-17  8:42             ` Eli Zaretskii
2022-12-17 10:40               ` João Távora
2022-12-17 11:00                 ` Eli Zaretskii
2022-12-18  0:40               ` Tim Cross
2022-12-16 16:01       ` Manuel Giraud
2022-12-16 16:40         ` Eli Zaretskii
2022-12-16 16:47           ` Perry Smith
2022-12-16 17:21             ` Eli Zaretskii
2022-12-16 15:53     ` Manuel Giraud
2022-12-16 15:56       ` João Távora
2022-12-16 16:39       ` Eli Zaretskii
2022-12-16 17:15         ` Manuel Giraud
2022-12-16 17:23           ` Eli Zaretskii
2022-12-16 20:22             ` Ken Brown
2022-12-17  4:06               ` Tim Cross
2022-12-17 15:42                 ` Stefan Monnier
2022-12-17 17:41                   ` T.V Raman
2022-12-26 22:42                   ` Dmitry Gutov
2022-12-27 12:11                     ` Eli Zaretskii
2022-12-27 12:43                       ` Dmitry Gutov
2022-12-27 13:38                         ` Eli Zaretskii
2022-12-27 14:11                           ` Dmitry Gutov
2022-12-27 14:32                             ` Eli Zaretskii
2022-12-27 16:36                               ` Stefan Monnier
2022-12-27 16:44                                 ` Philip Kaludercic
2022-12-27 17:16                                   ` Eli Zaretskii
2022-12-27 17:20                                     ` Philip Kaludercic
2022-12-27 18:06                                       ` Eli Zaretskii
2022-12-27 17:33                                     ` Stefan Monnier
2022-12-30 11:06                                   ` Yuan Fu
2022-12-30 11:25                                     ` Philip Kaludercic
2022-12-30 11:54                                       ` tomas
2022-12-30 11:59                                         ` Philip Kaludercic
2022-12-30 12:27                                           ` tomas
2022-12-30 12:45                                             ` Philip Kaludercic
2022-12-30 14:26                                             ` Dmitry Gutov
2022-12-30 23:33                                       ` Yuan Fu
2022-12-30 15:31                                     ` Eli Zaretskii
2022-12-30 15:54                                       ` Philip Kaludercic
2022-12-30 16:17                                         ` Eli Zaretskii
2022-12-31  0:06                                         ` Yuan Fu
2022-12-31  0:12                                           ` Philip Kaludercic
2023-01-01  1:18                                             ` Yuan Fu
2023-01-02 19:10                                               ` [SPAM UNSURE] " Stephen Leake
2022-12-31  0:03                                       ` Yuan Fu
2022-12-31  0:25                                         ` Stefan Monnier
2023-01-01  1:16                                           ` Yuan Fu
2023-01-01  6:39                                             ` Eli Zaretskii
2023-01-02  0:31                                               ` Yuan Fu
2023-01-02  0:40                                                 ` Stefan Monnier
2023-01-03  6:58                                                   ` Yuan Fu
2023-01-02  3:34                                                 ` Eli Zaretskii
2022-12-31  9:24                                         ` Eli Zaretskii
2022-12-31 22:14                                           ` Yuan Fu
2023-01-01  1:12                                             ` Yuan Fu
2022-12-31  0:44                                       ` Gregory Heytings
2023-01-03  4:08                                       ` Richard Stallman
2023-01-03 12:14                                         ` Eli Zaretskii
2023-01-01  3:03                                     ` Richard Stallman
2023-01-01  6:54                                       ` Eli Zaretskii
2023-01-01 19:14                                         ` Gregory Heytings
2023-01-01 20:11                                           ` Eli Zaretskii
2023-01-03  4:06                                         ` Richard Stallman
2023-01-03 12:06                                           ` Eli Zaretskii
2022-12-27 17:10                                 ` Eli Zaretskii
2022-12-27 17:31                                   ` Stefan Monnier
2022-12-27 18:08                                     ` Eli Zaretskii
2022-12-27 18:44                                       ` Stefan Monnier
2022-12-27 20:06                                         ` Philip Kaludercic
2022-12-27 21:13                                           ` Stefan Monnier
2022-12-28  2:52                                             ` Yuan Fu
2022-12-28 13:10                                               ` Gregory Heytings
2022-12-28 13:38                                               ` Lynn Winebarger
2022-12-28 14:41                                                 ` Danny Freeman
2022-12-29 11:14                                               ` Philip Kaludercic
2022-12-29 15:27                                                 ` Gregory Heytings
2022-12-29 15:40                                                   ` Lynn Winebarger
2022-12-29 21:50                                                     ` [SPAM UNSURE] " Stephen Leake
2022-12-29 22:37                                                       ` Lynn Winebarger [this message]
2022-12-30 14:10                                                       ` Lynn Winebarger
2022-12-30 16:25                                                         ` Targeting libtreesitter from wisent and other parser generators for emacs Lynn Winebarger
2022-12-31  8:25                                                           ` Eli Zaretskii
2022-12-31 13:07                                                             ` Lynn Winebarger
2022-12-29 15:45                                                   ` Tree-sitter introduction documentation Philip Kaludercic
2022-12-29 17:00                                                     ` Gregory Heytings
2022-12-29 17:12                                                       ` Philip Kaludercic
2022-12-29 17:31                                                         ` Gregory Heytings
2022-12-29 18:12                                                           ` Philip Kaludercic
2022-12-29 18:28                                                             ` Eli Zaretskii
2022-12-29 18:44                                                               ` Stefan Monnier
2022-12-29 19:34                                                                 ` Eli Zaretskii
2022-12-29 19:48                                                                   ` Stefan Monnier
2022-12-29 19:59                                                                     ` Eli Zaretskii
2022-12-29 18:32                                                             ` Stefan Monnier
2022-12-29 16:32                                                   ` Eli Zaretskii
2022-12-29 16:53                                                     ` Philip Kaludercic
2022-12-29 16:59                                                       ` Eli Zaretskii
2022-12-29 17:01                                                         ` Philip Kaludercic
2022-12-29 17:03                                                       ` Stefan Monnier
2022-12-29 17:12                                                         ` Gregory Heytings
2022-12-29 17:13                                                         ` Philip Kaludercic
2022-12-29 17:04                                                     ` Gregory Heytings
2022-12-30  1:01                                                 ` Gregory Heytings
2022-12-30 11:00                                                   ` Philip Kaludercic
2022-12-30 12:07                                                     ` Gregory Heytings
2022-12-30 13:10                                                       ` Philip Kaludercic
2022-12-30 15:23                                                         ` Gregory Heytings
2022-12-28 12:56                                         ` Gregory Heytings
2022-12-28 14:41                                           ` Stefan Monnier
2022-12-27 19:53                                       ` Dmitry Gutov
2023-01-01  3:03                                         ` Richard Stallman
2022-12-27 13:51                         ` tomas
2022-12-27 15:58                         ` Stefan Monnier
2022-12-16 17:23   ` Perry Smith
2022-12-16 17:31     ` Eli Zaretskii
2022-12-16 19:08       ` Perry Smith
2022-12-16 19:37         ` Eli Zaretskii
2022-12-16 20:05           ` Perry Smith

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAM=F=bAVpdcZV7uhTRkhih17SOea6JG+oegjCkZPy0LesRae=A@mail.gmail.com' \
    --to=owinebar@gmail.com \
    --cc=casouri@gmail.com \
    --cc=dgutov@yandex.ru \
    --cc=eliz@gnu.org \
    --cc=emacs-devel@gnu.org \
    --cc=gregory@heytings.org \
    --cc=monnier@iro.umontreal.ca \
    --cc=philipk@posteo.net \
    --cc=stephen_leake@stephe-leake.org \
    --cc=theophilusx@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).