From: "Mattias Engdegård" <mattias.engdegard@gmail.com>
To: Yuan Fu <casouri@gmail.com>
Cc: contovob@tcd.ie, 64017@debbugs.gnu.org
Subject: bug#64017: Wrong conversion from Emacs to Tree-sitter S-expression syntax
Date: Fri, 16 Jun 2023 19:02:58 +0200
Message-ID: <04C45D03-D49B-4DE4-AD26-2606C94AF260@gmail.com>
In-Reply-To: <0CBD145C-0A92-4258-A5F3-6FC616E89ED8@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 130 bytes --]
Here is a modification of the treesit manual to teach s-expressions first.
It's mostly a matter of straightforward substitution.
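A quick way to compare the two notations is to expand an s-expression
query into tree-sitter's native string syntax. The following is a
minimal sketch, assuming Emacs 29 or later built with tree-sitter
support; the helper name my-query-as-string is hypothetical and is not
part of the patch. Note that the manual writes @@ only because of
Texinfo escaping; in Lisp source a capture name is written with a
single @.

```elisp
;; Hypothetical helper, for illustration only.
;; `treesit-query-expand' converts an s-expression query to the
;; string syntax; it is only defined when Emacs is built with
;; tree-sitter, so guard the call with `fboundp'.
(defun my-query-as-string (sexp-query)
  "Return SEXP-QUERY in string syntax, or nil without tree-sitter."
  (when (fboundp 'treesit-query-expand)
    (treesit-query-expand sexp-query)))

;; The first query from the patched manual, in s-expression form.
(defvar my-sample-query
  '((binary_expression
     (number_literal) @number-in-exp) @biexp))

;; Show the string form in the echo area (or a note if unavailable).
(message "%s"
         (or (my-query-as-string my-sample-query)
             "tree-sitter support is not available in this Emacs"))
```

Evaluating the same helper on a query that uses :anchor, :* or :match
shows how those keywords map to the ".", "*" and "#match" forms
listed in the patch.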
[-- Attachment #2: treesit-doc-sexp-patterns.diff --]
[-- Type: application/octet-stream, Size: 9427 bytes --]
diff --git a/doc/lispref/parsing.texi b/doc/lispref/parsing.texi
index b0824faaaa2..bd81ee3c535 100644
--- a/doc/lispref/parsing.texi
+++ b/doc/lispref/parsing.texi
@@ -1132,9 +1132,9 @@ Pattern Matching
@defun treesit-query-capture node query &optional beg end node-only
This function matches patterns in @var{query} within @var{node}.
-The argument @var{query} can be either a string, a s-expression, or a
-compiled query object. For now, we focus on the string syntax;
-s-expression syntax and compiled query are described at the end of the
+The argument @var{query} can be either an s-expression, a string, or a
+compiled query object. For now, we focus on the s-expression syntax;
+string syntax and compiled query are described at the end of the
section.
The argument @var{node} can also be a parser or a language symbol. A
@@ -1165,8 +1165,8 @@ Pattern Matching
@example
@group
(setq query
- "(binary_expression
- (number_literal) @@number-in-exp) @@biexp")
+ '((binary_expression
+ (number_literal) @@number-in-exp) @@biexp)
@end group
@end example
@@ -1187,8 +1187,8 @@ Pattern Matching
@example
@group
(setq query
- "(binary_expression) @@biexp
- (number_literal) @@number @@biexp")
+ '((binary_expression) @@biexp
+ (number_literal) @@number @@biexp)
@end group
@end example
@@ -1246,23 +1246,23 @@ Pattern Matching
@subheading Quantify node
@cindex quantify node, tree-sitter
-Tree-sitter recognizes quantification operators @samp{*}, @samp{+} and
-@samp{?}. Their meanings are the same as in regular expressions:
-@samp{*} matches the preceding pattern zero or more times, @samp{+}
-matches one or more times, and @samp{?} matches zero or one time.
+Tree-sitter recognizes quantification operators @samp{:*}, @samp{:+} and
+@samp{:?}. Their meanings are the same as in regular expressions:
+@samp{:*} matches the preceding pattern zero or more times, @samp{:+}
+matches one or more times, and @samp{:?} matches zero or one time.
For example, the following pattern matches @code{type_declaration}
nodes that has @emph{zero or more} @code{long} keyword.
@example
-(type_declaration "long"*) @@long-type
+(type_declaration "long" :*) @@long-type
@end example
The following pattern matches a type declaration that has zero or one
@code{long} keyword:
@example
-(type_declaration "long"?) @@long-type
+(type_declaration "long" :?) @@long-type
@end example
@subheading Grouping
@@ -1272,15 +1272,15 @@ Pattern Matching
express a comma separated list of identifiers, one could write
@example
-(identifier) ("," (identifier))*
+(identifier) ("," (identifier)) :*
@end example
@subheading Alternation
-Again, similar to regular expressions, we can express ``match anyone
-from this group of patterns'' in a pattern. The syntax is a list of
-patterns enclosed in square brackets. For example, to capture some
-keywords in C, the pattern would be
+Again, similar to regular expressions, we can express ``match any one
+from this group of patterns'' in a pattern. The syntax is a vector of
+patterns. For example, to capture some keywords in C, the pattern
+would be
@example
@group
@@ -1295,7 +1295,7 @@ Pattern Matching
@subheading Anchor
-The anchor operator @samp{.} can be used to enforce juxtaposition,
+The anchor operator @code{:anchor} can be used to enforce juxtaposition,
i.e., to enforce two things to be directly next to each other. The
two ``things'' can be two nodes, or a child and the end of its parent.
For example, to capture the first child, the last child, or two
@@ -1304,19 +1304,19 @@ Pattern Matching
@example
@group
;; Anchor the child with the end of its parent.
-(compound_expression (_) @@last-child .)
+(compound_expression (_) @@last-child :anchor)
@end group
@group
;; Anchor the child with the beginning of its parent.
-(compound_expression . (_) @@first-child)
+(compound_expression :anchor (_) @@first-child)
@end group
@group
;; Anchor two adjacent children.
(compound_expression
(_) @@prev-child
- .
+ :anchor
(_) @@next-child)
@end group
@end example
@@ -1332,8 +1332,8 @@ Pattern Matching
@example
@group
(
- (array . (_) @@first (_) @@last .)
- (#equal @@first @@last)
+ (array :anchor (_) @@first (_) @@last :anchor)
+ (:equal @@first @@last)
)
@end group
@end example
@@ -1341,22 +1341,23 @@ Pattern Matching
@noindent
tree-sitter only matches arrays where the first element equals to the
last element. To attach a predicate to a pattern, we need to group
-them together. A predicate always starts with a @samp{#}. Currently
-there are three predicates, @code{#equal}, @code{#match}, and
-@code{#pred}.
+them together. Currently
+there are three predicates, @code{:equal}, @code{:match}, and
+@code{:pred}.
-@deffn Predicate equal arg1 arg2
+@deffn Predicate :equal arg1 arg2
Matches if @var{arg1} equals to @var{arg2}. Arguments can be either
strings or capture names. Capture names represent the text that the
captured node spans in the buffer.
@end deffn
-@deffn Predicate match regexp capture-name
+@deffn Predicate :match regexp capture-name
Matches if the text that @var{capture-name}'s node spans in the buffer
-matches regular expression @var{regexp}. Matching is case-sensitive.
+matches regular expression @var{regexp}, given as a string literal.
+Matching is case-sensitive.
@end deffn
-@deffn Predicate pred fn &rest nodes
+@deffn Predicate :pred fn &rest nodes
Matches if function @var{fn} returns non-@code{nil} when passed each
node in @var{nodes} as arguments. The function runs with the current
buffer set to the buffer of node being queried.
@@ -1366,23 +1367,23 @@ Pattern Matching
the same pattern. Indeed, it makes little sense to refer to capture
names in other patterns.
-@heading S-expression patterns
+@heading String patterns
-@cindex tree-sitter patterns as sexps
-@cindex patterns, tree-sitter, in sexp form
-Besides strings, Emacs provides a s-expression based syntax for
-tree-sitter patterns. It largely resembles the string-based syntax.
-For example, the following query
+@cindex tree-sitter patterns as strings
+@cindex patterns, tree-sitter, in string form
+Besides s-expressions, Emacs also accepts tree-sitter's native query
+syntax, written as a string. It largely resembles
+the s-expression syntax. For example, the following query
@example
@group
(treesit-query-capture
- node "(addition_expression
- left: (_) @@left
- \"+\" @@plus-sign
- right: (_) @@right) @@addition
+ node '((addition_expression
+ left: (_) @@left
+ "+" @@plus-sign
+ right: (_) @@right) @@addition
- [\"return\" \"break\"] @@keyword")
+ ["return" "break"] @@keyword))
@end group
@end example
@@ -1392,52 +1393,52 @@ Pattern Matching
@example
@group
(treesit-query-capture
- node '((addition_expression
- left: (_) @@left
- "+" @@plus-sign
- right: (_) @@right) @@addition
+ node "(addition_expression
+ left: (_) @@left
+ \"+\" @@plus-sign
+ right: (_) @@right) @@addition
- ["return" "break"] @@keyword))
+ [\"return\" \"break\"] @@keyword")
@end group
@end example
-Most patterns can be written directly as strange but nevertheless
-valid s-expressions. Only a few of them needs modification:
+Most patterns can be written directly as s-expressions inside a string.
+Only a few of them need modification:
@itemize
@item
-Anchor @samp{.} is written as @code{:anchor}.
+Anchor @code{:anchor} is written as @samp{.}.
@item
-@samp{?} is written as @samp{:?}.
+@samp{:?} is written as @samp{?}.
@item
-@samp{*} is written as @samp{:*}.
+@samp{:*} is written as @samp{*}.
@item
-@samp{+} is written as @samp{:+}.
+@samp{:+} is written as @samp{+}.
@item
-@code{#equal} is written as @code{:equal}. In general, predicates
-change their @samp{#} to @samp{:}.
+@code{:equal} is written as @code{#equal}. In general, predicates
+change their @samp{:} to @samp{#}.
@end itemize
For example,
@example
@group
-"(
- (compound_expression . (_) @@first (_)* @@rest)
- (#match \"love\" @@first)
- )"
+'((
+ (compound_expression :anchor (_) @@first (_) :* @@rest)
+ (:match "love" @@first)
+ ))
@end group
@end example
@noindent
-is written in s-expression as
+is written in string form as
@example
@group
-'((
- (compound_expression :anchor (_) @@first (_) :* @@rest)
- (:match "love" @@first)
- ))
+"(
+ (compound_expression . (_) @@first (_)* @@rest)
+ (#match \"love\" @@first)
+ )"
@end group
@end example
@@ -1461,7 +1462,7 @@ Pattern Matching
@end defun
@defun treesit-query-language query
-This function return the language of @var{query}.
+This function returns the language of @var{query}.
@end defun
@defun treesit-query-expand query
@@ -1653,7 +1654,7 @@ Multiple Languages
(setq css-range
(treesit-query-range
'html
- "(style_element (raw_text) @@capture)"))
+ '((style_element (raw_text) @@capture))))
(treesit-parser-set-included-ranges css css-range)
@end group
@@ -1662,7 +1663,7 @@ Multiple Languages
(setq js-range
(treesit-query-range
'html
- "(script_element (raw_text) @@capture)"))
+ '((script_element (raw_text) @@capture))))
(treesit-parser-set-included-ranges js js-range)
@end group
@end example