From: Hugo Thunnissen
Newsgroups: gmane.emacs.devel
Subject: Re: Update on tree-sitter structure navigation
Date: Sat, 02 Sep 2023 10:50:46 +0200
Message-ID: <87h6oddkm1.fsf@hugot.nl>
References: <5E7F2A94-4377-45C0-8541-7F59F3B54BA1@gmail.com> <87h6odhxs6.fsf@localhost>
In-Reply-To: <87h6odhxs6.fsf@localhost> (Ihor Radchenko's message of "Sat, 02 Sep 2023 06:52:41 +0000")
To: Ihor Radchenko
Cc: Yuan Fu, emacs-devel, Danny Freeman, Theodor Thornhill, Jostein Kjønigsen, Randy Taylor, Wilhelm Kirschbaum, Perry Smith, Dmitry Gutov

Ihor Radchenko writes:

> Yuan Fu writes:
>
>> In the months after wrapping up tree-sitter stuff in emacs-29, I was
>> thinking about how to implement structural navigation and extracting
>> information from the parser with tree-sitter. In emacs-29 we have
>> things like treesit-beginning/end-of-defun, and treesit-defun-name. I
>> was thinking maybe we can generalize this to support getting arbitrary
>> “thing” at point, move around them, and getting information like the
>> name of a defun, its arglist, parent of a class, type of an variable
>> declaration, etc, in a language-agnostic way.
>
> Note that Org mode also does all of these using
> https://orgmode.org/worg/dev/org-element-api.html
>
> It would be nice if we could converge to more consistent interface
> across all the modes. For example, by extending `thing-at-point' to handle
> parsed elements, not just simplistic regexp-based "thing" boundaries
> exposed by `thing-at-point' now.
>
> Org approaches getting name/begin/end/arguments using a common API:
>
> (org-element-property :begin NODE)
> (org-element-property :end NODE)
> (org-element-property :contents-begin NODE)
> (org-element-property :contents-end NODE)
> (org-element-property :name NODE)
> (org-element-property :args NODE)
>
> Language-agnostic "thing"s will certainly be welcome, especially given
> that tree-sitter grammars use inconsistent naming schemes, which have to
> be learned separately, and may even change with grammar versions.
>
> I think that both NODE types and attributes can be standardized.

It would be great to see standardization that can work with more than
just tree-sitter.
Depending on how extensive such a generic NODE type and accompanying API
are, I could see standardization of a lot of things that are currently
implemented in major modes, to name a few:

- indentation
- fontification
- thing-at-point
- imenu
- simple forms of completion (variables, function names in buffer)

I have some idea of the underpinnings, but I have never implemented a
full major mode so it is hard for me to judge the practicality of
this. How much would be practical to standardize, without needlessly
complicated/resource-heavy abstractions?
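
To make the question a bit more concrete, here is a rough sketch of what
a thin, backend-agnostic accessor could look like. It is only an
illustration under assumptions: the name `generic-node-property' is
hypothetical and not an existing Emacs API, and the only real functions
it relies on are `org-element-property', `org-element-type', and the
`treesit-node-*' accessors from emacs-29.

```elisp
;;; -*- lexical-binding: t; -*-
;; Hypothetical sketch: `generic-node-property' is NOT an existing
;; Emacs function; it only illustrates one possible dispatch layer.
(require 'org-element)

(defun generic-node-property (property node)
  "Return PROPERTY of NODE, a tree-sitter node or an Org element.
PROPERTY is a keyword such as :begin, :end or :type."
  (cond
   ;; Tree-sitter nodes.  Guard with `fboundp' because the treesit
   ;; primitives only exist when Emacs was built with tree-sitter.
   ((and (fboundp 'treesit-node-p) (treesit-node-p node))
    (pcase property
      (:begin (treesit-node-start node))
      (:end   (treesit-node-end node))
      (:type  (treesit-node-type node))   ; a string, e.g. a grammar node name
      (_ nil)))
   ;; Org elements: defer to the existing org-element accessors.
   ((org-element-type node)
    (pcase property
      (:type (org-element-type node))     ; a symbol, e.g. `headline'
      (_ (org-element-property property node))))
   (t (error "Unsupported node: %S" node))))
```

With something like this, (generic-node-property :begin NODE) could be
called on the result of either `treesit-node-at' or
`org-element-at-point'. Note that the two backends still disagree on
what :type returns (a string versus a symbol), which is exactly the kind
of detail a real standard would have to settle.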