From: Yuan Fu
Newsgroups: gmane.emacs.devel
Subject: Re: Tree-sitter maturity
Date: Sun, 29 Dec 2024 16:56:51 -0800
To: Björn Bidar
Cc: Daniel Colascione, Eli Zaretskii, emacs-devel@gnu.org

> On Dec 29, 2024, at 11:06 AM, Björn Bidar wrote:
>
> Daniel Colascione writes:
>
>> On December 29, 2024 11:02:47 AM EST, "Björn Bidar" wrote:
>>> Daniel Colascione writes:
>>>
>>>> On December 29, 2024 10:05:26 AM EST, "Björn Bidar" wrote:
>>>>> Daniel Colascione writes:
>>>>>
>>>>>> On December 27, 2024 9:59:14 AM EST, Eli Zaretskii wrote:
>>>>>>>> Date: Fri, 27 Dec 2024 08:46:06 -0500
>>>>>>>> From: Daniel Colascione
>>>>>>>> CC: rms@gnu.org, manphiz@gmail.com
>>>>>>>>
>>>>>>>>>> It might take a while for that to happen, which is why I still
>>>>>>>>>> believe it would be better if tree-sitter major modes would
>>>>>>>>>> populate `treesit-language-source-alist' on their own, and point
>>>>>>>>>> to the specific checkouts that the major mode developer tested
>>>>>>>>>> their implementation against.
>>>>>>>>>
>>>>>>>>> We could have done that, but there's no way we could keep the
>>>>>>>>> value of treesit-language-source-alist up-to-date, because the
>>>>>>>>> grammar libraries put out new versions much more frequently than
>>>>>>>>> Emacs releases, especially if you consider libraries that have no
>>>>>>>>> official versions at all (in which case we can only point to some
>>>>>>>>> revision in their repository).
>>>>>>>>>
>>>>>>>>> The question that bothers me is how useful is it to have
>>>>>>>>> treesit-language-source-alist that is outdated? What do we expect
>>>>>>>>> the users to do with such an outdated value?
>>>>>>>>>
>>>>>>>>
>>>>>>>> Why not just vendor all the grammars with the Emacs modes that
>>>>>>>> use them?
>>>>>>>
>>>>>>> We'd need to ask their developers to agree to this.
>>>>>>
>>>>>> Why? They're free software. For copyright assignment? Seems like an
>>>>>> exception would make sense here.
>>>>>>
>>>>>>> Other than that, I don't see how is that different from pointing
>>>>>>> to a specific version of each grammar: both will be outdated a
>>>>>>> short time after we point to the version or release Emacs with
>>>>>>> that version.
>>>>>>>
>>>>>>> So why do you think this is better?
>>>>>>
>>>>>> Vendoring enables building a full featured Emacs without a network
>>>>>> connection and guarantees build reproducibility in perpetuity.
>>>>>
>>>>> Did you think of the long term consequences?
>>>>>
>>>>> The embedded dependencies would have to be maintained first by Emacs
>>>>> and later by packagers.
>>>>>
>>>>> All the infrastructure around syncing of grammars is time spent that
>>>>> could be spent on more long term efforts such as stabilizing the
>>>>> tree-sitter based modes to not break as easily on grammar changes or
>>>>> to improve tree-sitter itself.
>>>>
>>>> I've vendored plenty of things. Works fine in practice. Big programs
>>>> like Firefox vendor the world too, and they work fine. It's really not
>>>> that much work. It eliminates entire classes of problem. It's going to
>>>> take more time to deal with the problems of taking a dependency and
>>>> the headaches of not having a stable interface than it would to set up
>>>> a few git subtrees or submodules and invoke their build system from
>>>> that of Emacs.
>>>
>>> Big programs like Firefox vendor the world only for packagers to have
>>> to revert those.
>>
>> These packagers are wrong, FWIW. Unbundling is needless and often
>> introduces bugs.
>
> It introduces bugs in software with generally unstable APIs/ABIs or
> software using unfinished/unreleased versions of software. Partially the
> problem we are facing here right now.
>
> The rat tail of issues this can entail can be long.
>
> FWIW You called over half of the Unix community wrong where bundled
> dependencies are frowned upon or forbidden.
>
>> In the mobile world, popular OSes seldom provide
>> libraries. Apps are expected to bundle their dependencies. The sky
>> doesn't fall. In fact, the mobile app ecosystem is healthier and more
>> secure than the desktop one precisely because it isn't burdened by
>> ideas that no longer make sense in a modern technical context,
>> e.g. that apps should casually share libraries.
>
> I work on mobile operating systems, what you describe is a double-edged
> sword. Application sizes increase and the party to blame for
> security issues moves from the OS to the application developer. Mobile
> OSes have had plenty of issues from this practice, notably for
> example the Log4j debacle.
>
> Most mobile operating systems provide their own set of available
> libraries, apps are not expected to bundle dependencies unless they are
> not available for that OS.
>
> Part of the issue is that library dependencies are moving faster than
> many operating systems can or with stable APIs. The end result of such a
> let's-bundle-everything approach is that you have to use the exact chain
> of dependencies to build a piece of software, which is awesome if you
> like the fire and forget approach of software development.
>
>
> To me what you write reads like mobile operating systems = JavaScript/AI
> developers.
>
>>> Vendoring only works long enough until the dependencies
>>> you have vendored are not out of date.
>>
>> It doesn't matter whether the dependency is out of date so long as
>> it's in sync with code that interacts with it. It's even worse when
>> the dependency doesn't make any compatibility guarantees. IMHO, the
>> only reasonable way to consume a dependency with an unstable interface
>> is to bundle or hash-lock or outright vendor it.
>
> If you have software which has short life cycles this can work but I
> don't think this works for Emacs.
> Further, bundled dependencies require patching the software bundling the
> dependency to fix bugs and security issues. Bugfixes are not really an
> issue with grammars but with libtree-sitter, which Emacs depends on.
> Putting bundled grammars and libtree-sitter in this equation makes it
> harder to maintain the Emacs package since I have to watch to not break
> the embedded grammars when updating tree-sitter or its dependencies.
>
>>> It is something which only works
>>> in projects that control most of their dependency chain and/or have a
>>> fire and forget approach of software development.
>>>
>>>> It's not even the precise mechanics: pulling down a grammar by hash is
>>>> tantamount to just checking in the grammar, but with more moving
>>>> parts. You still pair one to one the grammar and the Lisp code meant
>>>> to use it so you don't end up chasing down weird compatibility
>>>> issues. IMHO, since they're tightly coupled anyway, we might as well
>>>> distribute them together.
>>>>
>>>> As for changing TS grammars not to break: why do you think that would
>>>> be feasible? So far TS grammar authors haven't felt particularly
>>>> obligated to maintain compatibility.
>>>
>>> I don't know exactly to be honest but I don't think we are alone with
>>> this issue. If we are we should check out how it is handled in other
>>> editors.
>>
>> You're right that this is a problem everyone should be hitting.
>
> Maybe there's a way around it. The only way is to reach out to other
> projects using the library or upstream.

Nvim vendors grammars. It also has a “database” repo that pins grammars
(https://github.com/nvim-treesitter/nvim-treesitter); that repo also
provides a command to install grammars.

IIUC, Pulsar (the community-supported Atom successor) vendors grammars
too, because that’s what Atom did originally.

Helix, I think, only provides a command to install grammars (plus its
database pinning grammar versions).

For context, Emacs also has a command to install grammars, but we
provide neither a database nor version pinning.

Nvim and Pulsar have plans to move towards using wasm grammars.
Tree-sitter recently gained the ability to compile grammars into wasm
object files and load wasm grammars. Pulsar is built on Electron, so it
naturally has access to wasm; Nvim is adding a wasm runtime as an
optional dependency.

Yuan