From: Yuan Fu <casouri@gmail.com>
To: emacs-devel <emacs-devel@gnu.org>
Cc: "Danny Freeman" <danny@dfreeman.email>,
	"Theodor Thornhill" <theo@thornhill.no>,
	"Jostein Kjønigsen" <jostein@secure.kjonigsen.net>,
	"Randy Taylor" <dev@rjt.dev>,
	"Wilhelm Kirschbaum" <wkirschbaum@gmail.com>,
	"Perry Smith" <pedz@easesoftware.com>,
	"Dmitry Gutov" <dgutov@yandex.ru>
Subject: Update on tree-sitter structure navigation
Date: Fri, 1 Sep 2023 22:01:54 -0700
Message-ID: <5E7F2A94-4377-45C0-8541-7F59F3B54BA1@gmail.com>

Hey guys,

In the months after wrapping up the tree-sitter work in emacs-29, I was thinking about how to implement structural navigation and how to extract information from the parser with tree-sitter. In emacs-29 we have things like treesit-beginning/end-of-defun and treesit-defun-name. I was thinking maybe we can generalize this to support getting an arbitrary “thing” at point, moving around things, and getting information like the name of a defun, its arglist, the parent of a class, the type of a variable declaration, etc., in a language-agnostic way.

Also, at the time, we only supported defining things by a regexp matching a node’s type, which is often not enough.

And it would be nice to somehow take advantage of tree-sitter queries for the features I mentioned above. Tree-sitter queries are what every other editor is using for virtually all tree-sitter related features. But in Emacs, we mostly only use them for font-lock.

Here’s the progress as of now:

- Functions like treesit-search-forward, treesit-induce-sparse-tree, treesit-thing-at-point, treesit--navigate-thing, etc., support a richer set of predicates now. Besides a regexp matching the node type, the predicate can also be a predicate function, or (REGEXP . FUNC), or a compound predicate like (or PRED PRED) or (not PRED).

- There’s now a variable treesit-thing-settings, which holds definitions for things. Instead of passing the predicate to the functions I mentioned above, you can save the predicate in treesit-thing-settings under a symbol, say ‘sexp’, and pass the symbol instead, just like thing-at-point.el. (We’ll work on integrating with thing-at-point.el later.)

- I can’t think of a good way to integrate tree-sitter queries with the navigation functions we have right now. Most importantly, tree-sitter queries always search top-down, and you can’t limit the depth they search. OTOH, our navigation functions work by traversing the tree node-to-node.

- There’s no progress on getting information like name and type, etc., in a language-agnostic way. I haven’t come up with a good interface and/or implementation. I encourage interested folks to give it some thought. Bonus points for reusing the query files neovim folks have accumulated :-)
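
As a rough illustration of the first two items, here is a minimal sketch of what a set of thing definitions could look like. It assumes an Emacs build that already has treesit-thing-settings, and it assumes the value is an alist of (LANGUAGE (THING PRED)...) entries. The node type names are taken from the tree-sitter C grammar, and my-c-named-node-p and my-c-thing-settings are hypothetical names made up for this example.

```elisp
;; Sketch only: assumes an Emacs build where `treesit-thing-settings'
;; exists and node type names from the tree-sitter C grammar.

(defun my-c-named-node-p (node)
  "Hypothetical predicate: return non-nil if NODE is a named node."
  (treesit-node-check node 'named))

(defvar my-c-thing-settings
  `((c
     ;; Plain regexp matched against the node type.
     (defun "function_definition")
     ;; (REGEXP . FUNC): the type must match the regexp and FUNC
     ;; must return non-nil for the node.
     (statement (,(rx "_statement" eos) . my-c-named-node-p))
     ;; Compound predicates built with `not' and `or'.
     (sexp (not ,(rx bos (or "{" "}" "(" ")" "," ";") eos)))
     (text (or "comment" "string_literal"))))
  "Hypothetical thing definitions for C.")

;; In a buffer that already has a C parser, a major mode would
;; install the definitions buffer-locally:
;; (setq-local treesit-thing-settings my-c-thing-settings)
```

With definitions like these in place, the idea is that a symbol such as ‘defun’ or ‘sexp’ can be passed wherever the functions above accept a predicate. The exact argument lists of those functions may differ between Emacs versions, so check the docstrings before relying on this.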

Some other things on the TODO list that people can take a stab at:

- Query-based indentation (neovim’s implementation can be a source of inspiration)
- Improve c-ts-mode (indentation styles, other cc-mode features, etc) and other tree-sitter modes
- Solve the grammar versioning/breaking-change problem: tree-sitter grammars don’t have a version number, so every time the author changes the grammar, our queries break, and loading the mode only produces a giant error.
- Major mode fallback/inheritance: this has been discussed many times, but no good solution has emerged.
- Isolated ranges. For many embedded languages, each block should be independent of the others, but currently all the embedded blocks are connected together and parsed by a single parser. We probably need to spawn a parser for each block. I’ll probably work on this one next.

Finally, feel free to send me an email, or send to emacs-devel and CC me, if there are things treesit.c and treesit.el can do better, or if there are nice things in neovim and other editors that Emacs ought to have, too.

Yuan

