From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Theodor Thornhill Newsgroups: gmane.emacs.devel Subject: Re: Tree-sitter integration on feature/tree-sitter Date: Tue, 17 May 2022 23:45:53 +0200 Message-ID: <87k0ajygry.fsf@thornhill.no> References: <5bada349-2f43-4325-b696-70918584cd3d@email.android.com> <83mtfsuluo.fsf@gnu.org> <87sfpjhm33.fsf@thornhill.no> <83a6brufe5.fsf@gnu.org> <87pmkmhp8i.fsf@thornhill.no> <83v8ueuc7i.fsf@gnu.org> <73DE25BA-5EEF-4497-8F98-8C5F20853A61@gmail.com> <87v8uewfuq.fsf@thornhill.no> <87mtfkbt9n.fsf@thornhill.no> <1179E1EC-90EF-4989-BE1D-115498F77F60@gmail.com> Mime-Version: 1.0 Content-Type: text/plain Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="33987"; mail-complaints-to="usenet@ciao.gmane.io" Cc: Eli Zaretskii , Stefan Monnier , Emacs Devel , Daniel Colascione To: Yuan Fu Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Tue May 17 23:46:59 2022 Return-path: Envelope-to: ged-emacs-devel@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1nr51u-0008eF-SO for ged-emacs-devel@m.gmane-mx.org; Tue, 17 May 2022 23:46:59 +0200 Original-Received: from localhost ([::1]:54022 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nr51t-0002mI-Fz for ged-emacs-devel@m.gmane-mx.org; Tue, 17 May 2022 17:46:57 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:38322) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nr512-00022S-0s for emacs-devel@gnu.org; Tue, 17 May 2022 17:46:05 -0400 Original-Received: from out0.migadu.com ([94.23.1.103]:37400) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nr50y-0001qs-4z; Tue, 17 May 2022 17:46:02 -0400 X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=thornhill.no; s=key1; t=1652823955; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=1uV3MvDp3++F2vAJc/gAb2DoMrsR3dYi7UR1R8jRbzc=; b=lNVsYmfKhLE+Zd3ufWVxcTzsZA9SgUP/H+xn/77M2rzUOa3+SVYAXZJ9LuOXHV1tCpj+Fd 2W2ztq9fglCNKp+j4Hglg1lC6RI7MBJ7QiAcsC1jG2FyQTfoftbBobDfa/bC1Q2OcsEdOP 1VphnE1rfjdn7dpbiSZgLDsmTIxDdVaatm4lxcjtXU910IyPn8O+OicGd/h45lcnufIkvT owA3p+PLpOK7k0SeD2Ab+Tg1BZV9Yj7Q28wsf6q+5zEovbng8N8mUvEbG/vC1z3DxepkC/ 2ZevdnWIKNAUifwW52bd/9MYPEFiAwDi7lVTS0S9aIpuxhT1TfA6A8x8kDp4Tw== In-Reply-To: <1179E1EC-90EF-4989-BE1D-115498F77F60@gmail.com> X-Migadu-Flow: FLOW_OUT X-Migadu-Auth-User: thornhill.no Received-SPF: pass client-ip=94.23.1.103; envelope-from=theo@thornhill.no; helo=out0.migadu.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Original-Sender: "Emacs-devel" Xref: news.gmane.io gmane.emacs.devel:289881 Archived-At: Yuan Fu writes: >> On May 13, 2022, at 10:03 PM, Theodor Thornhill wrote: >> >>> >>> Now there is treesit-beginning/end-of-defun. You just need to set treesit-defun-query and everything else come for free. I needed to invent some heavy machinery for that, resulting in some new handy functions: >>> >>> - treesit-traverse-depth-first >>> - treesit-traverse-breadth-first >>> - treesit-traverse-forward-depth-first (maybe this should be named simply treesit-traverse-forward?) >>> >>> - treesit-search-forward >>> - treesit-search-beginning >>> They are untested & undocumented (in manual), so please play with them and report problems :-) >>> I've been testing the provided functionality for beginning/end-of-defun, and I have some thoughts I'd like to share. For starters, let me just give some context. The implementation I've used so far before the provided version looks something like ``` (defun typescript-mode-move-to-node (fn) (when-let ((found-node (treesit-parent-until (treesit-node-at (point)) (lambda (parent) (treesit-query-capture parent typescript-mode--defun-query))))) (goto-char (funcall fn found-node)))) (defun typescript-mode-beginning-of-defun (&optional arg) (typescript-mode-move-to-node #'treesit-node-start)) (defun typescript-mode-end-of-defun (&optional arg) (typescript-mode-move-to-node #'treesit-node-end)) ``` If this is given a query such as ``` (defvar typescript-mode--defun-query "[(import_statement) (function_declaration) (type_alias_declaration) (interface_declaration) (lexical_declaration)] @defun") ``` we would traverse parentwise and locate a node on match. This implementation is very fast, but has an issue - it will only match in the parentwise path, so siblings will not be found. This makes my function useful, but not general enough. The version provided in-tree right now uses the depth first approach, which has two big problems - performance and inconsistency. Its docstring notes: ``` Traversing forward depth-first means, for a tree like the below where NODE is marked 1, traverse as numbered: 16 | 3--------4-----------8 | | | o--o-+--1 5--+--6 9---+-----12 | | | | | | o o 2 7 +-+-+ +--+--+ | | | | | 10 11 13 14 15 ``` This means that if we start at node 1, I'd expect us to navigate to the nodes 3 - 4 - 8 - 16, when repeatedly pressing the beginning-of-defun. However, because we go depth first, we can end up landing at say, node 14, which feels unnatural. This can happen for example in javascript if we add arrow_function to the nodes to match. If node 14 contains such a node, the traversing order would look like this: 3 - 4 - 8 - 14 - 16. This feels odd, or at least differs from how normal emacs operates. In addition, when the search gets long, it can take up to a second on my system to find the beginning of a defun, because of the amount of traversing required by the depth first algorithm. I have a suggestion for a solution that you may consider. Either add a new defcustom 'treesit-depth-first-go-deep', or add a new param to 'treesit-traverse-depth-first', like so: ``` (defun treesit-traverse-depth-first (node pred &optional step go-deep) (if (funcall pred node) node (and go-deep (cl-loop for child in (if (or (null step) (>= step 0)) (treesit-node-children node) (nreverse (treesit-node-children node))) if (treesit-traverse-depth-first child pred step) return child)))) ``` This way we can avoid traversing deep into the subtrees, which is a slow operation, _and_ makes for an inconsistent experience. Setting go-deep as nil makes the function really fast, and also keeps the benefit of finding siblings. Another option is to not provide a generic depth first algorithm, and only go for siblings and parents, but we may want the depth first for other things, such as a generic 'treesit-goto-thing' function. What do you think? Theodor