From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Yuan Fu Newsgroups: gmane.emacs.devel Subject: Re: Tree-sitter integration on feature/tree-sitter Date: Wed, 18 May 2022 13:52:58 -0700 Message-ID: <6EF70929-5759-4F1A-B878-0C1660FB6831@gmail.com> References: <5bada349-2f43-4325-b696-70918584cd3d@email.android.com> <83mtfsuluo.fsf@gnu.org> <87sfpjhm33.fsf@thornhill.no> <83a6brufe5.fsf@gnu.org> <87pmkmhp8i.fsf@thornhill.no> <83v8ueuc7i.fsf@gnu.org> <73DE25BA-5EEF-4497-8F98-8C5F20853A61@gmail.com> <87v8uewfuq.fsf@thornhill.no> <87mtfkbt9n.fsf@thornhill.no> <1179E1EC-90EF-4989-BE1D-115498F77F60@gmail.com> <87k0ajygry.fsf@thornhill.no> Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3696.80.82.1.1\)) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="11311"; mail-complaints-to="usenet@ciao.gmane.io" Cc: Eli Zaretskii , Stefan Monnier , Emacs Devel , Daniel Colascione To: Theodor Thornhill Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Wed May 18 22:53:48 2022 Return-path: Envelope-to: ged-emacs-devel@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1nrQfz-0002jD-Fo for ged-emacs-devel@m.gmane-mx.org; Wed, 18 May 2022 22:53:47 +0200 Original-Received: from localhost ([::1]:37560 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nrQfy-0002Sy-7b for ged-emacs-devel@m.gmane-mx.org; Wed, 18 May 2022 16:53:46 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:44076) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nrQfL-0001nU-Ro for emacs-devel@gnu.org; Wed, 18 May 2022 16:53:07 -0400 Original-Received: from mail-pl1-x636.google.com ([2607:f8b0:4864:20::636]:34757) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nrQfK-0003K3-3n; Wed, 18 May 2022 16:53:07 -0400 Original-Received: by mail-pl1-x636.google.com with SMTP id n8so2918480plh.1; Wed, 18 May 2022 13:53:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=Vq6Q8UUijN8ijV0ckZJ+/dIosJpZQMnPQZdJh4/Q3Bc=; b=kHjpGTmuCDvhCSWnm87+2i/9EIF6S3zNvyRlMrKdPusfAaOK9VvWyJggUIH/u8Mydb AIaCpsdp8G7lCxewAmRbEV0gHmohVDFjVkGgV1uXHjRJ9nXkBfQZJ8DiBhtGCu7AUMhR fjme6lLkQNg3RdqNRpjcocLWKCKR/SHmGDaQ3jJmC+OoNeu9HJzTecyH+x3dE6u5ewP4 GQDZEx78Wb0T/NsD8X/ij27A+MET8dP4gYRorQjTcjW5tjcY9BAlwiFbpAzepjLcv6jV Zhw5XMcy2KQv+6ZJy8A9hH1ccH/7hhGplacXlGv1Ez70Jpnx29KyAt283BGnGSyWjqrp LPfA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=Vq6Q8UUijN8ijV0ckZJ+/dIosJpZQMnPQZdJh4/Q3Bc=; b=EWOVYWwyTb7wLTL4o6O/kVuhkGrdz3tfWo5BXfA7fJv0+L9AwQh3SUwhbtuygupLXd 83ECz4nOVfgC0xkEaOFhl/lBsA4WpFBE1yqW57PmiRMctXQqFNMhXyG4B6e+cmnSiDkA JbxNKaH7DJyfwTdxE7wYxfri7wJDGqMCNqxEC/A4y5wHg7jf4kbLMpwQSZn6dQJD3yG6 MWIm/Nrag2sNKQIbaadEllDbbFCLP0mIyGMb+kho2LXd8oI5dAyqfVK2TFkLuyCuT+sb tsnVQjSH3yuyQCLmcQy3rO1qw7GR7YvnraESRJez3h+ghki6QttMK+t46b3qMHXQ9VPF mIGA== X-Gm-Message-State: AOAM532RzUoZGNw64WM2qKQhwOMqx1cTDdxbxCn+hm8M0pgJzrOR0HjP 4fZhDQaNmCWkxci6Azs5dYk= X-Google-Smtp-Source: ABdhPJzvQTxTJvpElpWst4MaGBbDeYh79SsSlY03W03vBY/C7PErtAqml+hTR70vvt75d4FjG/MF9A== X-Received: by 2002:a17:90a:a096:b0:1df:58d7:5b20 with SMTP id r22-20020a17090aa09600b001df58d75b20mr1324156pjp.212.1652907179407; Wed, 18 May 2022 13:52:59 -0700 (PDT) Original-Received: from smtpclient.apple ([2600:1700:2ec7:8c90:3dfc:642d:671:bff9]) by smtp.gmail.com with ESMTPSA id f186-20020a62dbc3000000b0050dc7628133sm2489343pfg.13.2022.05.18.13.52.58 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 18 May 2022 13:52:59 -0700 (PDT) In-Reply-To: <87k0ajygry.fsf@thornhill.no> X-Mailer: Apple Mail (2.3696.80.82.1.1) Received-SPF: pass client-ip=2607:f8b0:4864:20::636; envelope-from=casouri@gmail.com; helo=mail-pl1-x636.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Original-Sender: "Emacs-devel" Xref: news.gmane.io gmane.emacs.devel:289935 Archived-At: > On May 17, 2022, at 2:45 PM, Theodor Thornhill = wrote: >=20 > Yuan Fu writes: >=20 >>> On May 13, 2022, at 10:03 PM, Theodor Thornhill = wrote: >>>=20 >>>>=20 >>>> Now there is treesit-beginning/end-of-defun. You just need to set = treesit-defun-query and everything else come for free. I needed to = invent some heavy machinery for that, resulting in some new handy = functions: >>>>=20 >>>> - treesit-traverse-depth-first >>>> - treesit-traverse-breadth-first >>>> - treesit-traverse-forward-depth-first (maybe this should be named = simply treesit-traverse-forward?) >>>>=20 >>>> - treesit-search-forward >>>> - treesit-search-beginning >>>> They are untested & undocumented (in manual), so please play with = them and report problems :-) >>>>=20 >=20 > I've been testing the provided functionality for = beginning/end-of-defun, > and I have some thoughts I'd like to share. >=20 > For starters, let me just give some context. The implementation I've > used so far before the provided version looks something like > ``` > (defun typescript-mode-move-to-node (fn) > (when-let ((found-node > (treesit-parent-until > (treesit-node-at (point)) > (lambda (parent) > (treesit-query-capture > parent > typescript-mode--defun-query))))) > (goto-char (funcall fn found-node)))) >=20 > (defun typescript-mode-beginning-of-defun (&optional arg) > (typescript-mode-move-to-node #'treesit-node-start)) >=20 > (defun typescript-mode-end-of-defun (&optional arg) > (typescript-mode-move-to-node #'treesit-node-end)) > ``` >=20 > If this is given a query such as >=20 > ``` > (defvar typescript-mode--defun-query > "[(import_statement) > (function_declaration) > (type_alias_declaration) > (interface_declaration) > (lexical_declaration)] @defun") > ``` >=20 > we would traverse parentwise and locate a node on match. This > implementation is very fast, but has an issue - it will only match in > the parentwise path, so siblings will not be found. This makes my > function useful, but not general enough. The version provided in-tree > right now uses the depth first approach, which has two big problems - > performance and inconsistency. >=20 > Its docstring notes: > ``` > Traversing forward depth-first means, for a tree like the below > where NODE is marked 1, traverse as numbered: >=20 > 16 > | > 3--------4-----------8 > | | | > o--o-+--1 5--+--6 9---+-----12 > | | | | | | > o o 2 7 +-+-+ +--+--+ > | | | | | > 10 11 13 14 15 > ``` >=20 > This means that if we start at node 1, I'd expect us to navigate to = the > nodes 3 - 4 - 8 - 16, when repeatedly pressing the beginning-of-defun. > However, because we go depth first, we can end up landing at say, node > 14, which feels unnatural. This can happen for example in javascript = if > we add arrow_function to the nodes to match. If node 14 contains such = a > node, the traversing order would look like this: 3 - 4 - 8 - 14 - 16. > This feels odd, or at least differs from how normal emacs operates. = In > addition, when the search gets long, it can take up to a second on my > system to find the beginning of a defun, because of the amount of > traversing required by the depth first algorithm. >=20 > I have a suggestion for a solution that you may consider. >=20 > Either add a new defcustom 'treesit-depth-first-go-deep', or add a new > param to 'treesit-traverse-depth-first', like so: > ``` > (defun treesit-traverse-depth-first (node pred &optional step go-deep) > (if (funcall pred node) > node > (and go-deep > (cl-loop for child in (if (or (null step) (>=3D step 0)) > (treesit-node-children node) > (nreverse (treesit-node-children node))) > if (treesit-traverse-depth-first child pred step) > return child)))) > ``` >=20 > This way we can avoid traversing deep into the subtrees, which is a = slow > operation, _and_ makes for an inconsistent experience. Setting = go-deep > as nil makes the function really fast, and also keeps the benefit of > finding siblings. >=20 > Another option is to not provide a generic depth first algorithm, and > only go for siblings and parents, but we may want the depth first for > other things, such as a generic 'treesit-goto-thing' function. >=20 > What do you think? Thanks, I think it could be very useful. I can add an option for the = user of treesit-traverse-depth-first to control the depth it goes. And = same for treesit-traverse-forward-depth-first. A relative depth of 0 = could mean only traverse siblings and parent, nil means traverse all the = way, a positive number n means traverse n steps down. Yuan