From: Björn Bidar
Newsgroups: gmane.emacs.devel
Subject: Re: Tree-sitter maturity
Date: Mon, 30 Dec 2024 01:29:25 +0200
Message-ID: <47960.1628541545$1735515025@news.gmane.org>
References: <1ed88fca-788a-fe9f-b6c8-edb2f49751c9@mavit.org.uk>
 <67428b3d.c80a0220.2f3036.adbdSMTPIN_ADDED_BROKEN@mx.google.com>
 <86ldwdm7xg.fsf@gnu.org>
 <6765355b.c80a0220.1a6b24.3117SMTPIN_ADDED_BROKEN@mx.google.com>
 <00554790-CACA-4233-8846-9E091CF1F7AA@gmail.com>
 <86msgl2red.fsf@gnu.org>
 <87o710sr7y.fsf@debian-hx90.lan>
 <8734i9tmze.fsf@posteo.net>
 <86plldwb7w.fsf@gnu.org>
 <87ttapryxr.fsf@posteo.net>
 <0883EB00-3BB2-4BC8-95D1-45F4497C0526@dancol.org>
 <87msge8bv8.fsf@dancol.org>
In-Reply-To: <87msge8bv8.fsf@dancol.org> (Daniel Colascione's message of
 "Sun, 29 Dec 2024 15:36:59 -0500")
To: Daniel Colascione
Cc: Lynn Winebarger, Philip Kaludercic, emacs-devel, Eli Zaretskii,
 Richard Stallman, manphiz@gmail.com

Daniel Colascione writes:

> Lynn Winebarger writes:
>
>> On Fri, Dec 27, 2024, 9:25 AM Daniel Colascione wrote:
>>
>>>
>>> It's a shame there's no way to write TS grammars in plain elisp. I figure
>>> vendoring both the source and the generated code would be best, as it'd
>>> allow building Emacs anywhere but still make it convenient on systems with
>>> needed tools (JS runtime, Rust, etc.) to update and modify the grammar. As
>>> with any scheme involving checking in generated outputs, the source and
>>> output can get out of sync, but I think there are build time guardrails we
>>> can build to make sure it doesn't happen.
>>>
>>
>> I looked into this last year. The tree-sitter library provides a parsing
>> engine that references a fairly standard LR type parsing table in binary
>> form.
>> I got stuck in adding a generic primitive functionality for reading
>> and writing arbitrary binary data structures based on a data description
>> DSL, since I wouldn't want to tie the interpreter core to the data
>> structures of an external, dynamically-loadable library. But, I wasn't
>> sure such an extension would be accepted into emacs, as I am not an expert
>> on the possible security implications.
>>
>> Other than that, emacs already has the code for calculating (LA)LR parsing
>> tables in the semantic packages. The tree-sitter grammar compiler may have
>> additional logic for providing multiple starting symbols, but the parsing
>> engine should still function with a classic parsing table.
>
> Thanks. Such an approach would let us treat tree-sitter grammars a lot
> more like font-lock-keywords, and I think for some modes, that'd be a
> good option. (Of course, SHTDI.)
>
> Tree sitter, as wonderful as it is, strikes me as a bit of a Rube
> Goldberg machine architecturally: JS *and* Rust *and* C? Really? :-)

I was wondering the same. How the hell?
There has been some talk of supporting a more lightweight JavaScript
interpreter as an alternative, but it hasn't gone anywhere, apparently
for compatibility reasons.

I don't know how Node could be a dependency for these. Grammars mostly
have no dependencies, except that some depend on other grammars at the
source level; the C++ grammar, for example, requires the C grammar.

> Do you happen to know whether the subset of Rust that gccrs recognizes
> is sufficient to compile the tree sitter grammar compiler? If so, we
> could in principle combine gccrs with a bare-bones embedded JS
> interpreter like https://duckjs.org/ to produce a mechanism that would
> let us customize and rebuild tree sitter grammars as easily as we do
> elisp files, even on obscure platforms like DJGPP.

I don't know for certain, but it doesn't look that way judging from
their latest reports:
- https://raw.githubusercontent.com/Rust-GCC/Reporting/refs/heads/main/2024/2024-12-09-report.org
- https://rust-gcc.github.io/2024/12/02/2024-11-monthly-report.html

Really strange that GCCrs doesn't use sourceware.org.