From: Yuan Fu <casouri@gmail.com>
To: emacs-devel <emacs-devel@gnu.org>
Subject: Status update of tree-sitter features
Date: Wed, 28 Dec 2022 01:44:32 -0800
Message-ID: <B1D1C414-3A23-454F-B0FD-3C84B5888251@gmail.com>
Hi,
With the complete feature freeze approaching, this is probably the last set of features added to Emacs 29. I stuffed them in just in time ;-)
1. There is a new predicate in the query language, #pred. Like #equal and #match, it filters captured nodes, but it does so with an arbitrary function. Right now some queries in the font-lock settings match a little more than what we actually want. For example, for the property feature, we only want the “bb” in “aa.bb”, but not the one in “aa.bb(cc)”, because the latter is a method, not a property. The query usually matches both. With this new predicate we can use a function to filter out the methods.
If we can ensure that every query only captures the intended nodes, the font-lock queries can be reused for context extraction: using the query for the variable feature, I can find all the variables in a given region, etc.
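To make the property example in item 1 concrete, here is a minimal sketch of a font-lock rule that uses the predicate in its s-expression form, :pred. The function names prefixed with my-js-- are hypothetical, the node types (member_expression, property_identifier, call_expression) are assumed to be those of the tree-sitter JavaScript grammar, and the capture name assumes a face called font-lock-property-use-face is available.

```elisp
(require 'treesit)

;; Hypothetical filter: NODE is the captured property_identifier.
;; In "aa.bb(cc)" its parent is a member_expression whose own parent
;; is a call_expression, so we reject it; in plain "aa.bb" we keep it.
(defun my-js--property-not-method-p (node)
  "Return non-nil if NODE, a property_identifier, is not being called."
  (not (equal (treesit-node-type
               (treesit-node-parent (treesit-node-parent node)))
              "call_expression")))

;; Hypothetical helper that builds the rule; a major mode would add the
;; result to `treesit-font-lock-settings' in its setup code.
(defun my-js--property-rules ()
  "Return font-lock rules for the `property' feature."
  (treesit-font-lock-rules
   :language 'javascript
   :feature 'property
   '(((member_expression
       property: (property_identifier) @font-lock-property-use-face)
      ;; Only keep the capture when the filter function returns non-nil.
      (:pred my-js--property-not-method-p
             @font-lock-property-use-face)))))
```

Because the filtering lives in the query itself, the same query could in principle be reused outside font-lock to collect only genuine properties in a region.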
2. We’ve had treesit-defun-type-regexp for a while, and I recently generalized the idea into “things”. Now you can use treesit--things-around, treesit--navigate-thing, and treesit--thing-at-point to find and navigate arbitrary “things”. A “thing” is defined by a regexp that matches the node types, plus (optionally) a filter function.
3. There is now imenu support. Major modes don’t need to define their own imenu functions anymore; they just need to set treesit-simple-imenu-settings. They also need to set treesit-defun-name-function, a function that returns the name of a defun node. It is used by both imenu and add-log-entry.
4. C-like modes now have adequate indentation and filling for block comments.
Lastly, I want to remind everyone to update the font-lock settings for your major mode to be more compliant with the standard list of features we decided on. This is not a hard requirement and major modes are free to extend it, but it’s nice to be consistent, especially among built-in modes.
Here is the list, for your reference. Among all the features, I think assignment is “nice to have”; it’s fine to leave it out if there isn’t enough time. The same goes for key: it may or may not apply to a language.
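As a rough illustration of the "regexp plus optional filter function" shape described in item 2, the sketch below sets treesit-defun-type-regexp to a cons of (REGEXP . FILTER). The my-ts-- names are hypothetical, and the node types are assumed to come from a Python-like grammar; the double-hyphen functions mentioned above are internal, so this example sticks to the public variable.

```elisp
(require 'treesit)

;; Hypothetical filter: only treat a matching node as a defun
;; if it actually has a "name" field.
(defun my-ts--named-defun-p (node)
  "Return non-nil if NODE has a child in its \"name\" field."
  (treesit-node-child-by-field-name node "name"))

;; Hypothetical setup function, meant to be called from a major mode body.
(defun my-ts--setup-defun-things ()
  "Describe what counts as a defun: node type regexp plus a filter."
  (setq-local treesit-defun-type-regexp
              (cons (rx bos (or "function_definition" "class_definition") eos)
                    #'my-ts--named-defun-p)))
```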
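A minimal sketch of the two settings from item 3 might look like the following. The my-ts-- names are hypothetical, the node types are assumed to be those of a Python-like grammar, and each imenu entry is assumed to have the form (CATEGORY REGEXP PRED NAME-FN), with nil for PRED and NAME-FN meaning "no extra filter" and "fall back to treesit-defun-name-function".

```elisp
(require 'treesit)

;; Hypothetical name function: read the text of the "name" field.
(defun my-ts--defun-name (node)
  "Return the name of defun NODE as a string, or nil if it has none."
  (let ((name (treesit-node-child-by-field-name node "name")))
    (when name
      (treesit-node-text name t))))

;; Hypothetical setup function, meant to be called from a major mode
;; body before `treesit-major-mode-setup'.
(defun my-ts--setup-imenu ()
  "Configure imenu and defun-name lookup for the current buffer."
  (setq-local treesit-defun-name-function #'my-ts--defun-name)
  (setq-local treesit-simple-imenu-settings
              `(("Function" ,(rx bos "function_definition" eos) nil nil)
                ("Class" ,(rx bos "class_definition" eos) nil nil))))
```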
Basic tokens:

delimiter            ,.; (delimit things)
operator             == != || (produces a value)
bracket              []{}()
misc-punctuation
constant             true, false, null
number
keyword
comment              (includes doc-comments)
string               (includes chars and docstrings)
string-interpolation f"text {variable}"
escape-sequence      "\n\t\\"
function             every function identifier
variable             every variable identifier
type                 every type identifier
property             a.b  <--- highlight b
key                  { a: b, c: d }  <--- highlight a, c
error                highlight parse error

Abstract features:

assignment: the LHS of an assignment (thing being assigned to), e.g.:

    a = b      <--- highlight a
    a.b = c    <--- highlight b
    a[1] = d   <--- highlight a

definition: the thing being defined, e.g.:

    int a(int b) {   <--- highlight a
      return 0;
    }

    int a;           <--- highlight a

    struct a {       <--- highlight a
      int b;         <--- highlight b
    };

As for decoration levels, this is my suggestion:

'((comment definition)
  (keyword string type)
  (assignment builtin constant decorator
   escape-sequence key number property string-interpolation)
  (bracket delimiter function misc-punctuation operator variable))

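For reference, a major mode could adopt the suggested levels roughly as follows. This is only a sketch: my-ts--setup-font-lock-levels is a hypothetical name, and it assumes the mode's font-lock rules actually define the listed features.

```elisp
(require 'treesit)

;; Hypothetical setup function, meant to be called from a major mode
;; body before `treesit-major-mode-setup'.
(defun my-ts--setup-font-lock-levels ()
  "Install the suggested decoration levels, one sublist per level."
  (setq-local treesit-font-lock-feature-list
              '((comment definition)
                (keyword string type)
                (assignment builtin constant decorator
                 escape-sequence key number property string-interpolation)
                (bracket delimiter function misc-punctuation
                 operator variable))))
```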
Yuan