From mboxrd@z Thu Jan 1 00:00:00 1970
From: Björn Bidar
Newsgroups: gmane.emacs.devel
Subject: Re: Tree-sitter maturity
Date: Mon, 23 Dec 2024 02:43:09 +0200
Message-ID: <30474.9437769473$1734914665@news.gmane.org>
References: <1ed88fca-788a-fe9f-b6c8-edb2f49751c9@mavit.org.uk>
 <67428b3d.c80a0220.2f3036.adbdSMTPIN_ADDED_BROKEN@mx.google.com>
 <86ldwdm7xg.fsf@gnu.org>
 <6765355b.c80a0220.1a6b24.3117SMTPIN_ADDED_BROKEN@mx.google.com>
 <00554790-CACA-4233-8846-9E091CF1F7AA@gmail.com>
In-Reply-To: <00554790-CACA-4233-8846-9E091CF1F7AA@gmail.com> (Yuan Fu's
 message of "Fri, 20 Dec 2024 01:29:14 -0800")
To: Yuan Fu
Cc: Eli Zaretskii, Peter Oliver, Stefan Kangas, emacs-devel@gnu.org

Yuan Fu writes:

>> On Dec 20, 2024, at 1:13 AM, Björn Bidar wrote:
>>
>> Yuan Fu writes:
>>
>>>> On Dec 18, 2024, at 5:34 AM, Eli Zaretskii wrote:
>>>>
>>>>> From: Yuan Fu
>>>>> Date: Tue, 17 Dec 2024 14:11:51 -0800
>>>>> Cc: Peter Oliver,
>>>>>  Stefan Kangas,
>>>>>  Emacs Devel,
>>>>>  Eli Zaretskii
>>>>>
>>>>>>> It’s also worth noting that Tree-sitter itself is somewhat
>>>>>> immature; the developers say that until it reaches version 1.0, we
>>>>>> should be wary of potentially unannounced incompatible changes
>>>>>> (although they are trying harder to avoid this, over time).
>>>>>>
>>>>>>
>>>>>> [1] https://build.opensuse.org/package/show/editors/tree-sitter
>>>>>
>>>>> I wonder if we can formalize a way for tree-sitter major modes to
>>>>> state the compatible version of language grammar it uses. Maybe a
>>>>> package.el cookies, or a variable that set, or even just comments
>>>>> in the beginning of the file.
>>>>>
>>>>> Many major modes already adds entries to treesit-language-source-alist, that could be a good option too.
>>>>>
>>>>> I especially want built-in major modes to give a version, so that
>>>>> packagers can package Emacs with the right version of tree-sitter
>>>>> grammar. I know Eli has problems with pinning a grammar version for
>>>>> builtin modes before, but I wonder what’s he’s stance now?
>>>>
>>>> What's changed?
>>>
>>> People are starting to package tree-sitter and tree-sitter
>>> grammars. If Emacs can be packaged with the right grammars, then
>>> tree-sitter modes will work out-of-the-box.
>>
>> Please don't. That would require nodejs to build Emacs bundled with
>> these grammars. These grammar packages are also not just used with
>> Emacs.
>>
>> Grammars are very easy to package once the infrastructure to reuse the
>> packaging automation in the package manager is there. Don't try to
>> reinvent that IMHO. If you must generated and build the parser implement
>> a bindings.gyp parser so you can automate the compilation process
>> independently of the grammar.
>
> There might be some misunderstanding. We don’t want to build the
> grammars as part of building Emacs. Ideally building the grammars are
> the package managers job. We just want to list the versions of
> grammars that are known to work with the major modes, so packagers
> have an easier time to package Emacs with the right version of
> grammars.

Ah, OK, now I understand. I don't think that would work.

>>
>> For reference here's my implementation of it in python:
>> https://build.opensuse.org/projects/editors:tree-sitter/packages/tree-sitter/files/tree-sitter-target.py?expand=1
>>
>>>>
>>>> Many language grammars don't make official releases and thus don't
>>>> have versions. Moreover, AFAIK there's no API to determine the
>>>> version of the grammar library we load. So how can we manage such
>>>> version-pinning in a way that (a) is up-to-date, and (b) doesn't
>>>> preclude people from using a grammar library due to false negatives?
>>>
>>> I’m talking about a softer pin. We’re basically providing a “known to
>>> work” version. This way packagers can package Emacs with a
>>> known-to-work version of grammar, so the builtin modes work
>>> out-of-the-box. This doesn’t prevent people from using a newer version
>>> and sending us a bug report, and we still try our best to make the
>>> major modes work with the newest grammar.
>>>
>>> If the grammar doesn’t have an explicit version, then we can just use a commit hash. I believe all the packaging systems support that?
>>
>> That doesn't make sense as the versions numbers are arbitrary, e.g. not
>> always does the version number relate the changes to grammar but also to
>> the in-tree dependencies in the repository packaging the
>> language-grammar bindings which have nothing todo with the parser.
>
> Sure, let’s call it snapshot then. I just want to make sure when
> packagers package Emacs with tree-sitter grammars, the grammar works
> with Emacs’s major mode.

The point was that, no matter what you call it, the development of
grammars is more or less in flux. Maybe there are some more mature
grammars, but those should be the minority.
But let's just assume for a second that it would be possible to freeze
or recommend the supported grammar versions. The development of grammars
is too fast for that, especially for builtin modes.

>>
>> What matters much more is the tree-sitter version which is more related
>> to Emacs itself rather than the particular version of the grammar.
>
> The tree-sitter library version is up to the packagers right? As long as it satisfies Emacs’ requirements and is compatible with the bundled grammars.

Do you mean bundled or recommended grammars? Bundled grammars would
again be grammars included within the Emacs sources, which is a
different thing from what you were saying further above.

Yes, the tree-sitter version is up to the packager or, respectively,
the distribution. The only issue that existed in that regard was that
tree-sitter once broke the ABI without bumping the sover, but that's
fixed now, or was fixed once Emacs was correctly rebuilt after a
dependency of it changed.
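
To make the "known to work" soft pin discussed above a bit more
concrete, here is a minimal sketch in Python of how a packager-side
check could treat such a list as advisory only. Everything in it is an
assumption for illustration: the table format, the names SoftPin,
KNOWN_TO_WORK and check_grammar, and the revision strings are
hypothetical and are not taken from Emacs or from any grammar
repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SoftPin:
    """A recommended (not enforced) grammar snapshot for one major mode."""
    grammar: str
    revision: str  # release tag or commit hash; placeholder values below


# Hypothetical table. Neither the format nor the revisions come from Emacs.
KNOWN_TO_WORK = {
    "c-ts-mode": SoftPin("tree-sitter-c", "example-tag-1"),
    "python-ts-mode": SoftPin("tree-sitter-python", "0123abc"),
}


def check_grammar(mode: str, installed_revision: str) -> str:
    """Classify an installed grammar revision without ever rejecting it.

    Returns "known-to-work" when the revision matches the soft pin,
    "untested" when it differs (still usable, a bug report is welcome),
    and "unknown" when the mode has no recommendation at all.
    """
    pin = KNOWN_TO_WORK.get(mode)
    if pin is None:
        return "unknown"
    if installed_revision == pin.revision:
        return "known-to-work"
    return "untested"


if __name__ == "__main__":
    for mode, revision in [("c-ts-mode", "example-tag-1"),
                           ("python-ts-mode", "fedcba9"),
                           ("rust-ts-mode", "abc1234")]:
        print(mode, revision, check_grammar(mode, revision))
```

The point of the three-way result is that a mismatch is never an error,
so a fast-moving grammar is not blocked by a stale recommendation; the
list only tells packagers which snapshot was last seen to work.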