From: "Björn Bidar" <bjorn.bidar@thaodan.de>
To: Lynn Winebarger <owinebar@gmail.com>
Cc: Daniel Colascione <dancol@dancol.org>,
Philip Kaludercic <philipk@posteo.net>,
emacs-devel <emacs-devel@gnu.org>, Eli Zaretskii <eliz@gnu.org>,
Richard Stallman <rms@gnu.org>,
manphiz@gmail.com
Subject: Re: Tree-sitter maturity
Date: Sun, 05 Jan 2025 01:21:23 +0200
Message-ID: <23250.1399282896$1736032971@news.gmane.org>
In-Reply-To: <CAM=F=bDDHvY0qh19kc4fyu8pSfHW95wE-nenSZgCbrMbH6k6+w@mail.gmail.com> (Lynn Winebarger's message of "Sat, 4 Jan 2025 11:15:22 -0500")
Lynn Winebarger <owinebar@gmail.com> writes:
> On Wed, Jan 1, 2025 at 3:23 PM Björn Bidar <bjorn.bidar@thaodan.de> wrote:
>> Lynn Winebarger <owinebar@gmail.com> writes:
>> >> Tree sitter, as wonderful as it is, strikes me as a bit of a Rube
>> >> Goldberg machine architecturally: JS *and* Rust *and* C? Really? :-)
>> >
>> > They evidently decided to use JSON and a simple schema to specify the
>> > concrete grammar, instead of creating a DSL for the purpose.
>> > Javascript is just a convenient way for embedding code into JSON the
>> > same way LISP programmers use lisp to generate S-expressions. Once
>> > you have the JSON format generated, javascript is not used.
>> >
>> > The rest of the project is really composed of orthogonal components,
>> > the GLR grammar compiler (written in Rust) and the run-time GLR
>> > parsing engine, written in C. The grammar compiler produces the
>> > parsing tables in the form of C source code that is compiled together
>> > with the library for a single library per grammar, but the C library
>> > does not actually require the parsing tables to be statically known at
>> > compile-time, at least the last I looked, unless some really obscure
>> > dependence. The procedural interface to the parser just takes a
>> > pointer to the parser table data structure at run-time.
>> >
>> > Since GLR grammars are basically arbitrary (ambiguous) LR(1) grammars,
>> > the parser run-time has to implement a fairly sophisticated algorithm
>> > (graph-stacks) to be efficient. Having implemented the LALR parser
>> > generator at least 3 times in the last couple of decades (just for my
>> > own use), generating the parse tables looks like a lot simpler (and
>> > well-understood) problem to solve than the GLR run-time. More
>> > importantly, the efficiency of the grammar compiler is not all that
>> > critical compared to the run-time.
>> >
>>
>> Additional alternatives instead of Node are already a good alternative.
>> Using WASM as the output format also does not sound bad assuming there
>> is some abstraction from the tree-sitter library side.
>
> I'm not sure why WASM would be interesting. AFAICT, it's just another
> set of bindings to the C library, maybe with the tables compiled into
> WASM binary module (or whatever the correct term should be - I'm not a
> WASM expert). In any case, AFAIK Emacs has no particular capability
> for using WASM files as dynamic libraries in general. Maybe if Emacs
> itself was compiled to WASM, in which case I suppose the function for
> dynamically loading libraries would implicitly load such modules.
>
> OTOH, the generated WASM bindings might provide an example of using
> the tree-sitter DLL with the in-memory parse table structure not
> embedded in the tree-sitter DLL. Is that what you meant?
Maybe I misunderstood, but my assumption was that the newer WASM parsers
would be less prone to breakage. But if it's just about compiling the
same generated code to WASM, then I don't see the benefit either.
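
To make the point about run-time tables concrete, here is a minimal
sketch in Python. It is not tree-sitter's code or API: it uses a plain
deterministic LR driver rather than GLR, and the names `Table`, `parse`,
`PARENS` and `SINGLE` are made up for illustration. All it is meant to
show is that the engine can receive the parse tables as ordinary data at
run time, so one engine serves any number of grammars:

```python
from dataclasses import dataclass


@dataclass
class Table:
    """Hypothetical container for hand-written LR tables (illustration only)."""
    action: dict  # (state, terminal) -> ("shift", state) | ("reduce", rule) | ("accept", None)
    goto: dict    # (state, nonterminal) -> state
    rules: tuple  # rule index -> (left-hand side, length of right-hand side)


def parse(table, tokens):
    """Generic LR driver: the grammar lives entirely in `table`."""
    stack = [0]
    tokens = list(tokens) + ["$"]  # "$" marks end of input
    pos = 0
    while True:
        act = table.action.get((stack[-1], tokens[pos]))
        if act is None:
            return False  # no table entry: syntax error
        kind, arg = act
        if kind == "shift":
            stack.append(arg)
            pos += 1
        elif kind == "reduce":
            lhs, length = table.rules[arg]
            del stack[len(stack) - length:]  # pop one state per RHS symbol
            stack.append(table.goto[(stack[-1], lhs)])
        else:
            return True  # accept


# Tables for the grammar  S -> ( S ) | x
PARENS = Table(
    action={
        (0, "("): ("shift", 2), (0, "x"): ("shift", 3),
        (1, "$"): ("accept", None),
        (2, "("): ("shift", 2), (2, "x"): ("shift", 3),
        (3, ")"): ("reduce", 1), (3, "$"): ("reduce", 1),
        (4, ")"): ("shift", 5),
        (5, ")"): ("reduce", 0), (5, "$"): ("reduce", 0),
    },
    goto={(0, "S"): 1, (2, "S"): 4},
    rules=(("S", 3), ("S", 1)),
)

# Tables for the grammar  S -> x
SINGLE = Table(
    action={(0, "x"): ("shift", 1), (1, "$"): ("reduce", 0), (2, "$"): ("accept", None)},
    goto={(0, "S"): 2},
    rules=(("S", 1),),
)
```

Calling `parse(PARENS, "((x))")` and `parse(SINGLE, "x")` runs the same
driver over two different grammars. Whether the tables come from a
shared library, a WASM module or a data file does not matter to the
driver.
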
>> > I agree, a generic grammar capturing the structures of most
>> > programming languages would be useful. It is definitely possible to
>> > extract the syntactic/semantic concepts from C++ and Python to create
>> > such a grammar, if you are willing to allow nested grammars
>> > appropriately delimited. For example, a constructor context would
>> > delimit an expression in a data language that is embedded in a
>> > constructor context that may itself have delimited value contexts
>> > where the functional/procedural grammar may appear, ad infinitum. The
>> > procedural and data grammars are distinct but mutually recursive.
>> > That would be if the form appeared in an rvalue-context. For l-value
>> > expressions, the same constructor delimiting syntax can become a
>> > binding form, at least, with subexpressions of binding forms also
>> > being binding forms. As long as the scanner is dynamically set
>> > according to the grammar context (and recognizes/signals the closing
>> > delimiter), the grammar can be made non-ambiguous because a given
>> > character will produce context-appropriate terminal symbols.
>>
>> What kind of scanner are you referring to? Something that works like a
>> binding generator but for AST?
>
> Aside from being useful for generic templating purposes, such a
> generic grammar would be of use for the purpose Daniel described, i.e.
> a layer of abstraction usable for almost any modern language, even in
> polyglot texts.
>
This is exactly what I was wondering too. Some languages embed others
into themselves or are hybrids. Good examples would be Python inside a
template, and QML, which is markup but also JavaScript depending on the
context. A more flexible grammar system would help here.
Kinda like reinventing Semantic again...
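
As a rough illustration of a scanner that is set according to the
grammar context, here is a small Python sketch for a made-up template
syntax where `{{ ... }}` delimits an embedded expression. The function
name `scan`, the token names and the delimiter are assumptions chosen
for the example, not anything tree-sitter or Emacs provides. The same
character (say `+`) becomes part of a TEXT token outside the delimiters
and an OP token inside them:

```python
import re

# Tokens recognised only inside {{ ... }} (hypothetical expression language).
EXPR_TOKEN = re.compile(
    r"\s*(?:(?P<CLOSE>\}\})|(?P<NAME>[A-Za-z_]\w*)|(?P<NUM>\d+)|(?P<OP>[-+*/().]))"
)


def scan(source):
    """Return a list of (kind, lexeme) pairs, switching mode at the delimiters."""
    tokens = []
    mode = "text"
    pos = 0
    while pos < len(source):
        if mode == "text":
            # Outer language: everything up to the next "{{" is opaque text.
            end = source.find("{{", pos)
            if end == -1:
                end = len(source)
            if end > pos:
                tokens.append(("TEXT", source[pos:end]))
            if end < len(source):
                tokens.append(("OPEN", "{{"))
                mode = "expr"
                end += 2
            pos = end
        else:
            # Embedded language: fine-grained tokens until the closing "}}".
            match = EXPR_TOKEN.match(source, pos)
            if match is None:
                raise SyntaxError(f"unexpected input at offset {pos}")
            kind = match.lastgroup
            tokens.append((kind, match.group(kind)))
            if kind == "CLOSE":
                mode = "text"
            pos = match.end()
    if mode == "expr":
        raise SyntaxError("unterminated {{")
    return tokens
```

For example, `scan("a + b = {{ a + b }}!")` yields one TEXT token for
`a + b = `, then OPEN, NAME, OP, NAME, CLOSE, and a final TEXT token for
`!`. A real polyglot setup would hand each region to its own grammar
instead of a single regular expression.
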
>> > As for vendoring, I just doubt you will get much buy-in in this forum.
>> > There are corporate-type free/open-source software projects that
>> > prioritize uniformity in build environments and limiting the scope of
>> > bugs that can arise from the build process/dependencies that vendor at
>> > the drop of the hat. Then there are "classic" free software projects
>> > that have amalgamated the work of many individual contributors, and
>> > those contributors often prioritize control of the software running on
>> > their systems for whatever reason (but eliminating non-free software
>> > is definitely one of them), and they often can/will contribute patches
>> > for that purpose. The second camp *hates* vendoring because it
>> > subverts their control of their computational resources. At least,
>> > that's the dichotomy I see. There are probably finer points I'm
>> > missing or mischaracterizing.
>>
>> From my point of view as a distribution packager there are several
>> reasons why vendoring can be bad, or why in some contexts keeping
>> vendored copies is the better decision.
>>
>> But in this context it complicates the build process as now each grammar
>> has to be built for Emacs in addition to other editors.
>> The Emacs package now pulls in more build dependencies at build time
>> which complicates the build process as the dependency list grows.
>>
>> Besides bundled dependencies are not allowed unless there's no way to
>> avoid them. It is not about control or anything.
>
> That sounds like something I would interpret as control. Distro
> creators/maintainers are prime candidates for wanting to maintain
> control of the build/run-time environment, as they are responsible for
> everything they bundle working together. Perhaps "control of their
> computational resources" is more specific than I intended in my
> previous posting.
>
Yeah, you are right.