From: Dmitry Gutov
Newsgroups: gmane.emacs.devel
Subject: Re: Tree-sitter introduction documentation
Date: Tue, 27 Dec 2022 16:11:22 +0200
Message-ID: <71cfe4e8-3bb8-b0a6-9be5-8c0a6d92cfab@yandex.ru>
In-Reply-To: <83pmc50xxc.fsf@gnu.org>
To: Eli Zaretskii
Cc: monnier@iro.umontreal.ca, theophilusx@gmail.com, emacs-devel@gnu.org

On 27/12/2022 15:38, Eli Zaretskii wrote:
>> Date: Tue, 27 Dec 2022 14:43:06 +0200
>> Cc: monnier@iro.umontreal.ca, theophilusx@gmail.com, emacs-devel@gnu.org
>> From: Dmitry Gutov
>>
>>> WDYT about what we have in NEWS about this?
>>
>> Those instructions seem to be written foremost with distro maintainers
>> in mind.
>
> Or users who build and install Emacs by themselves.
>
> Frankly, I don't see why we, the upstream project, need to worry about
> anyone else. It isn't our job. That's what distros are there for.

Previously, all one needed for a language support mode was to download
and load an .el file.
As we drift toward the idea of using externally-maintained grammars, and
the "native" modes become less useful (possibly deprecated in 5-10), it
seems like it will become more of our responsibility to streamline the
process.

>> Definitely better to have them than not, but I'd hate to present
>> them to the average user.
>
> "Hate"? That's a strong word.

Also a figure of speech.

> The questions that the NEWS entries
> answer were asked here and elsewhere several times, so presumably that
> information has some non-trivial value.

Of course.

>> Do we expect all (most?) distros to compile all the popular grammars?
>
> I honestly don't know. On the one hand, there aren't many Emacs modes
> which use tree-sitter, but OTOH they could start growing like
> mushrooms once Emacs 29 hits the streets. I do expect them to offer
> the ones they consider useful/needed, for some value of that. I
> really don't see any significant difference in this regard between
> grammar libraries and, say, librsvg. Both are used in Emacs, and the
> lack of either disables useful Emacs features. So it's a no-brainer
> for me. But then I'm not a distro maintainer, and never have been.

On the flip side, the third-party modes can provide their own
download-compile-install instructions, which will make it easier for end
users.

The barrier to creating such a mode, though, is now higher.

>> That would still leave out the users of the less popular languages whose
>> grammars were not included. Or grammars which saw updates since the
>> distro-distributed version (so it's useful to install the newer version).
>
> What's the solution? All the "solutions" I saw until now require a
> working and well-configured C/C++ compiler (sometimes both C and C++),
> linker, and C/C++ runtimes. A user who has them already installed can
> easily build a grammar library with two simple commands. A user who
> doesn't have a C/C++ development environment will not find those
> "solutions" useful at all.
> And asking us to distribute binaries for
> half a dozen popular systems is IMNSHO unreasonable.

I think it's common enough for a user to have build tools installed, but
not know well enough how to set up a C project. Think junior-to-mid-level
developers working in languages other than C. Or just first-year
students.

>> I wouldn't worry too much about the maintenance burden (keeping the list
>> of urls up-to-date?), especially since we could refer to such lists by
>> other projects.
>
> I cannot disagree more. Look at this from my POV: once the list
> becomes even semi-official, people will expect it to be of the same
> high quality as all the rest of Emacs, and they _will_ complain and
> report inaccuracies. It's a nuisance, especially for such a "hot"
> feature set.

They will report inaccuracies, which will help in fixing them. That is
certainly a workload, but still small compared to the current flow of
bug reports, I think. Or the many hours one would spend fixing a font-
or redisplay-related problem.

Anyway, my point was not to put this burden on you specifically. As you
might recall, I've always advocated for a "smaller core with many
plugins" model of Emacs development.

> And which "other projects"? who can track those and know which ones
> have the most accurate, up-to-date, and comprehensive list? I'm a bit
> interested in this (and have several dozens of grammar libraries built
> locally), and I discover another project with a useful list of
> grammars almost every day. These things are highly dynamic: I see
> some of the grammars get updates every couple of days. Some languages
> have more than one grammar library maintained by different people --
> who will figure out which one is better for us and keep that
> information up-to-date?

The Neovim repo will likely be a good resource for this in the near
future.
This file in particular:

https://github.com/nvim-treesitter/nvim-treesitter/blob/f2b1d727e6ad46238baa84c4d1f968a297e415ab/lua/nvim-treesitter/parsers.lua

But it brings me to another concern, showcased by this commit:

https://github.com/nvim-treesitter/nvim-treesitter/commit/0cb637ca9f4389172933e5aba36387ab8430b6fb

The AST produced by one version of a grammar might be incompatible
enough with a newer one to make the TS queries, font-lock and
indentation rules obsolete or at least slightly broken.

nvim-treesitter works around this by locking each grammar to the
repository revision that matches the current language support code.

How much of a problem will this be for us in practice? I'm not sure.
Perhaps most popular grammars have had enough time to mature by now.

>> I think ELPA is a better place for this feature, though. Because we
>> always want the user to get the latest version of the recipes.
>
> That solves only part of the problem. (And not an important part: our
> Git repository is public, so people can track it and download updates
> for files as easily as they track ELPA.)

One is not exactly like the other from an end user's POV.

> The hard part -- keeping the
> information accurate and up-to-date -- still needs a motivated
> volunteer. And we hardly have resources to work on our code and docs,
> let alone help people install external software.
>
> (Of course, if such a motivated volunteer steps forward, he or she
> will be most welcome.)

My guess is we have a few people here already who might be interested.