From: Yuan Fu
Newsgroups: gmane.emacs.devel
Subject: Re: Tree-sitter maturity
Date: Mon, 23 Dec 2024 17:20:28 -0800
To: Björn Bidar
Cc: Eli Zaretskii, Peter Oliver, Stefan Kangas, emacs-devel@gnu.org

> On Dec 22, 2024, at 4:43 PM, Björn Bidar wrote:
>
> Yuan Fu writes:
>
>>> On Dec 20, 2024, at 1:13 AM, Björn Bidar wrote:
>>>
>>> Yuan Fu writes:
>>>
>>>>> On Dec 18, 2024, at 5:34 AM, Eli Zaretskii wrote:
>>>>>
>>>>>> From: Yuan Fu
>>>>>> Date: Tue, 17 Dec 2024 14:11:51 -0800
>>>>>> Cc: Peter Oliver, Stefan Kangas, Emacs Devel, Eli Zaretskii
>>>>>>
>>>>>>> It’s also worth noting that Tree-sitter itself is somewhat
>>>>>>> immature; the developers say that until it reaches version 1.0, we
>>>>>>> should be wary of potentially unannounced incompatible changes
>>>>>>> (although they are trying harder to avoid this, over time).
>>>>>>>
>>>>>>>
>>>>>>> [1] https://build.opensuse.org/package/show/editors/tree-sitter
>>>>>>
>>>>>> I wonder if we can formalize a way for tree-sitter major modes to
>>>>>> state the compatible version of the language grammar they use. Maybe
>>>>>> a package.el cookie, or a variable that is set, or even just comments
>>>>>> at the beginning of the file.
>>>>>>
>>>>>> Many major modes already add entries to
>>>>>> treesit-language-source-alist; that could be a good option too.
>>>>>>
>>>>>> I especially want built-in major modes to give a version, so that
>>>>>> packagers can package Emacs with the right version of the tree-sitter
>>>>>> grammar. I know Eli had problems with pinning a grammar version for
>>>>>> builtin modes before, but I wonder what his stance is now?
>>>>>
>>>>> What's changed?
>>>>
>>>> People are starting to package tree-sitter and tree-sitter
>>>> grammars.
>>>> If Emacs can be packaged with the right grammars, then
>>>> tree-sitter modes will work out-of-the-box.
>>>
>>> Please don't. That would require nodejs to build Emacs bundled with
>>> these grammars. These grammar packages are also not just used with
>>> Emacs.
>>>
>>> Grammars are very easy to package once the infrastructure to reuse the
>>> packaging automation in the package manager is there. Don't try to
>>> reinvent that IMHO. If you must generate and build the parser, implement
>>> a bindings.gyp parser so you can automate the compilation process
>>> independently of the grammar.
>>
>> There might be some misunderstanding. We don’t want to build the
>> grammars as part of building Emacs. Ideally, building the grammars is
>> the package manager's job. We just want to list the versions of
>> grammars that are known to work with the major modes, so packagers
>> have an easier time packaging Emacs with the right version of
>> grammars.
>
> Ah ok now I understand. I don't think that would work.
>
>>>
>>> For reference here's my implementation of it in python:
>>> https://build.opensuse.org/projects/editors:tree-sitter/packages/tree-sitter/files/tree-sitter-target.py?expand=1
>>>
>>>>>
>>>>> Many language grammars don't make official releases and thus don't
>>>>> have versions. Moreover, AFAIK there's no API to determine the
>>>>> version of the grammar library we load. So how can we manage such
>>>>> version-pinning in a way that (a) is up-to-date, and (b) doesn't
>>>>> preclude people from using a grammar library due to false negatives?
>>>>
>>>> I’m talking about a softer pin. We’re basically providing a “known to
>>>> work” version. This way packagers can package Emacs with a
>>>> known-to-work version of the grammar, so the builtin modes work
>>>> out-of-the-box.
>>>> This doesn’t prevent people from using a newer version
>>>> and sending us a bug report, and we still try our best to make the
>>>> major modes work with the newest grammar.
>>>>
>>>> If the grammar doesn’t have an explicit version, then we can just use
>>>> a commit hash. I believe all the packaging systems support that?
>>>
>>> That doesn't make sense, as the version numbers are arbitrary: the
>>> version number does not always reflect changes to the grammar, but also
>>> changes to the in-tree dependencies in the repository packaging the
>>> language-grammar bindings, which have nothing to do with the parser.
>>
>> Sure, let’s call it a snapshot then. I just want to make sure that when
>> packagers package Emacs with tree-sitter grammars, the grammar works
>> with Emacs’s major mode.
>
> The point was that no matter what you call it, the development of
> grammars is more or less fluid. Maybe there are some more mature
> grammars, but those should be the minority.
> But let's just assume for a second it would be possible to freeze or
> recommend the supported grammar versions. The development of grammars is
> too fast for that, especially for builtin modes.
>
>>>
>>> What matters much more is the tree-sitter version, which is more related
>>> to Emacs itself than the particular version of the grammar.
>>
>> The tree-sitter library version is up to the packagers, right? As long
>> as it satisfies Emacs’ requirements and is compatible with the bundled
>> grammars.
>
> Do you mean bundled or recommended grammars? Bundled grammars would
> again be grammars included within the Emacs sources, which is a
> different thing from what you were saying further above.

Recommended. So packagers control the version of both the tree-sitter library and the grammars. Emacs will recommend a version or commit hash for each grammar, and packagers will provide Emacs with grammars that work with the builtin major modes.
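
As a rough sketch of what such a recommendation could look like (not an agreed-upon format): entries in treesit-language-source-alist already accept a revision after the URL, so a known-to-work snapshot could be recorded there. The revision strings below are placeholders rather than tested revisions, and the two languages are chosen only for illustration.

```elisp
;; Sketch only: the revision strings are placeholders, not revisions
;; that anyone has verified against the built-in major modes.
(require 'treesit)

(dolist (entry
         '((c "https://github.com/tree-sitter/tree-sitter-c"
              "KNOWN-TO-WORK-TAG-OR-BRANCH")
           (python "https://github.com/tree-sitter/tree-sitter-python"
                   "KNOWN-TO-WORK-TAG-OR-BRANCH")))
  ;; Each entry has the form (LANG URL REVISION ...); the documented
  ;; REVISION field is a Git tag or branch name.
  (add-to-list 'treesit-language-source-alist entry))
```

Packagers could read such entries to pick the grammar snapshot to ship. Whether a bare commit hash is accepted in the REVISION field likely depends on the Emacs version, so grammars without tags might need extra support.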
>
> Yes, the tree-sitter version is up to the packager or, respectively, the
> distribution.
> The only issue that existed in that regard was that tree-sitter once
> broke the ABI without bumping the sover, but that's fixed now, or was
> fixed when Emacs was correctly rebuilt once a dependency of it changed.

Yeah, hopefully they’ll be more careful in the future.

Yuan