unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
* Update on tree-sitter structure navigation
@ 2023-09-02  5:01 Yuan Fu
  2023-09-02  6:52 ` Ihor Radchenko
  2023-09-03  0:56 ` Dmitry Gutov
  0 siblings, 2 replies; 30+ messages in thread
From: Yuan Fu @ 2023-09-02  5:01 UTC (permalink / raw)
  To: emacs-devel
  Cc: Danny Freeman, Theodor Thornhill, Jostein Kjønigsen,
	Randy Taylor, Wilhelm Kirschbaum, Perry Smith, Dmitry Gutov

Hey guys,

In the months after wrapping up tree-sitter stuff in emacs-29, I was thinking about how to implement structural navigation and extracting information from the parser with tree-sitter. In emacs-29 we have things like treesit-beginning/end-of-defun, and treesit-defun-name. I was thinking maybe we can generalize this to support getting arbitrary “thing” at point, move around them, and getting information like the name of a defun, its arglist,  parent of a class, type of an variable declaration, etc, in a language-agnostic way.

Also, at the time, we only support defining things by a regexp matching a node’s type, which is often not enough. 

And it would be nice to somehow take advantage of the tree-sitter queries for the features I mentioned above. Tree-sitter query is what every other editor are using for virtually all tree-sitter related features. But in Emacs, we mostly only use it for font-lock.

Here’s the progress as of now:

- Functions like treesit-search-forward, treesit-induce-sparse-tree, treesit-thing-at-point, treesit--navigate-thing, etc, support a richer set of predicates now. Besides regexp matching the type, the predicate can also be a predication function, or (REGEP . FUNC), or compound predicates like (or PRED PRED) or (not PRED).

- There’s now a variable treesit-thing-settings, which holds definition for things. Then, instead of passing the predicate to the functions I mentioned above, you can save the predicate in treesit-thing-settings under a symbol, say ‘sexp', and pass the symbol instead, just like thing-at-point.el. (We’ll work on integrating with thing-at-point.el later.)

- I can’t think of a good way to integrate tree-sitter queries with the navigation functions we have right now. Most importantly, tree-sitter query always search top-down, and you can’t limit the depth it searches. OTOH, our navigation functions work by traversing the tree node-to-node.

- There’s no progress on getting information like name and type, etc, in a language-agnostic way. I haven’t come up with a good interface and/or implementation. I encourage interested folks to give it some thought. Bonus points for reusing the query files neovim folks has accumulated :-)

Some other things on the TODO list that people can take a jab at:

- Query-based indentation (neovim’s implementation can be a source of inspiration)
- Improve c-ts-mode (indentation styles, other cc-mode features, etc) and other tree-sitter modes
- Solve the grammar versioning/breaking-change problem: tree-sitter grammar don’t have a version number, so every time the author changes the grammar, our queries break, and loading the mode only produces a giant error.
- Major mode fallback/inheritance, this has been discussed many times, no good solution emerged.
- Isolated ranges. For many embedded languages, each blocks should be independent from another, but currently all the embedded blocks are connected together and parsed by a single parser. We probably need to spawn a parser for each block. I’ll probably work on this one next.

Finally, feel free to send me an email or send to emacs-devel and CC me, if there are things treesit.c and treesit.el can do better, or when there are nice things in neovim and other editors and Emacs ought to have, too.

Yuan


^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2023-09-12 10:17 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-09-02  5:01 Update on tree-sitter structure navigation Yuan Fu
2023-09-02  6:52 ` Ihor Radchenko
2023-09-02  8:50   ` Hugo Thunnissen
2023-09-02 22:12     ` Yuan Fu
2023-09-06 11:37       ` Ihor Radchenko
2023-09-08  0:59         ` Yuan Fu
2023-09-02 22:09   ` Yuan Fu
2023-09-06 11:57     ` Ihor Radchenko
2023-09-06 12:58       ` Eli Zaretskii
2023-09-08 12:03         ` Ihor Radchenko
2023-09-08 13:08           ` Eli Zaretskii
2023-09-08  1:06       ` Yuan Fu
2023-09-08  9:09         ` Ihor Radchenko
2023-09-08 16:46           ` Yuan Fu
2023-09-03  0:56 ` Dmitry Gutov
2023-09-06  2:51   ` Danny Freeman
2023-09-06 12:47     ` Dmitry Gutov
2023-09-07  3:18       ` Danny Freeman
2023-09-07 12:52         ` Dmitry Gutov
2023-09-08  1:04   ` Yuan Fu
2023-09-08  6:40     ` Eli Zaretskii
2023-09-08 20:52       ` Dmitry Gutov
2023-09-09  6:32         ` Eli Zaretskii
2023-09-09 10:24           ` Dmitry Gutov
2023-09-09 11:38             ` Eli Zaretskii
2023-09-09 17:04               ` Dmitry Gutov
2023-09-09 17:28                 ` Eli Zaretskii
2023-09-12  0:36                   ` Yuan Fu
2023-09-12 10:17                     ` Dmitry Gutov
2023-09-08 21:05     ` Dmitry Gutov

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).