unofficial mirror of bug-gnu-emacs@gnu.org 
From: "Mattias Engdegård" <mattias.engdegard@gmail.com>
To: Yuan Fu <casouri@gmail.com>
Cc: contovob@tcd.ie, 64017@debbugs.gnu.org
Subject: bug#64017: Wrong conversion from Emacs to Tree-sitter S-expression syntax
Date: Fri, 16 Jun 2023 19:02:58 +0200
Message-ID: <04C45D03-D49B-4DE4-AD26-2606C94AF260@gmail.com>
In-Reply-To: <0CBD145C-0A92-4258-A5F3-6FC616E89ED8@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 130 bytes --]

Here is a modification of the treesit manual to teach s-expressions first.
It's mostly a matter of straightforward substitution.
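
To illustrate the substitution, here is a minimal sketch showing one query in both syntaxes. The variable names `my/sexp-query` and `my/string-query` are hypothetical and exist only for this example; the query itself is the one used in the manual text. The final form assumes an Emacs built with tree-sitter support and is skipped otherwise.

```elisp
;; Sketch only.  The variable names are made up for illustration.

;; S-expression form: anchors, quantifiers and predicates are keywords.
(defvar my/sexp-query
  '(((compound_expression :anchor (_) @first (_) :* @rest)
     (:match "love" @first)))
  "Example query in s-expression form.")

;; String form: tree-sitter's native syntax, with `.', `*' and `#match'.
(defvar my/string-query
  "((compound_expression . (_) @first (_)* @rest)
    (#match \"love\" @first))"
  "The same query written in tree-sitter's native string syntax.")

;; `treesit-query-expand' turns an s-expression query into a string.
;; It is only defined when Emacs was built with tree-sitter, so guard it.
(when (fboundp 'treesit-query-expand)
  (message "%s" (treesit-query-expand my/sexp-query)))
```

Comparing the printed string with `my/string-query` is a quick way to check that the keyword-to-punctuation mapping behaves as documented.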


[-- Attachment #2: treesit-doc-sexp-patterns.diff --]
[-- Type: application/octet-stream, Size: 9427 bytes --]

diff --git a/doc/lispref/parsing.texi b/doc/lispref/parsing.texi
index b0824faaaa2..bd81ee3c535 100644
--- a/doc/lispref/parsing.texi
+++ b/doc/lispref/parsing.texi
@@ -1132,9 +1132,9 @@ Pattern Matching
 
 @defun treesit-query-capture node query &optional beg end node-only
 This function matches patterns in @var{query} within @var{node}.
-The argument @var{query} can be either a string, a s-expression, or a
-compiled query object.  For now, we focus on the string syntax;
-s-expression syntax and compiled query are described at the end of the
+The argument @var{query} can be either an s-expression, a string, or a
+compiled query object.  For now, we focus on the s-expression syntax;
+string syntax and compiled query are described at the end of the
 section.
 
 The argument @var{node} can also be a parser or a language symbol.  A
@@ -1165,8 +1165,8 @@ Pattern Matching
 @example
 @group
 (setq query
-      "(binary_expression
-        (number_literal) @@number-in-exp) @@biexp")
+      '((binary_expression
+         (number_literal) @@number-in-exp) @@biexp)
 @end group
 @end example
 
@@ -1187,8 +1187,8 @@ Pattern Matching
 @example
 @group
 (setq query
-      "(binary_expression) @@biexp
-       (number_literal)  @@number @@biexp")
+      '((binary_expression) @@biexp
+        (number_literal) @@number @@biexp)
 @end group
 @end example
 
@@ -1246,23 +1246,23 @@ Pattern Matching
 @subheading Quantify node
 
 @cindex quantify node, tree-sitter
-Tree-sitter recognizes quantification operators @samp{*}, @samp{+} and
-@samp{?}.  Their meanings are the same as in regular expressions:
-@samp{*} matches the preceding pattern zero or more times, @samp{+}
-matches one or more times, and @samp{?} matches zero or one time.
+Tree-sitter recognizes quantification operators @samp{:*}, @samp{:+} and
+@samp{:?}.  Their meanings are the same as in regular expressions:
+@samp{:*} matches the preceding pattern zero or more times, @samp{:+}
+matches one or more times, and @samp{:?} matches zero or one time.
 
 For example, the following pattern matches @code{type_declaration}
 nodes that has @emph{zero or more} @code{long} keyword.
 
 @example
-(type_declaration "long"*) @@long-type
+(type_declaration "long" :*) @@long-type
 @end example
 
 The following pattern matches a type declaration that has zero or one
 @code{long} keyword:
 
 @example
-(type_declaration "long"?) @@long-type
+(type_declaration "long" :?) @@long-type
 @end example
 
 @subheading Grouping
@@ -1272,15 +1272,15 @@ Pattern Matching
 express a comma separated list of identifiers, one could write
 
 @example
-(identifier) ("," (identifier))*
+(identifier) ("," (identifier)) :*
 @end example
 
 @subheading Alternation
 
-Again, similar to regular expressions, we can express ``match anyone
-from this group of patterns'' in a pattern.  The syntax is a list of
-patterns enclosed in square brackets.  For example, to capture some
-keywords in C, the pattern would be
+Again, similar to regular expressions, we can express ``match any one
+from this group of patterns'' in a pattern.  The syntax is a vector of
+patterns.  For example, to capture some keywords in C, the pattern
+would be
 
 @example
 @group
@@ -1295,7 +1295,7 @@ Pattern Matching
 
 @subheading Anchor
 
-The anchor operator @samp{.} can be used to enforce juxtaposition,
+The anchor operator @code{:anchor} can be used to enforce juxtaposition,
 i.e., to enforce two things to be directly next to each other.  The
 two ``things'' can be two nodes, or a child and the end of its parent.
 For example, to capture the first child, the last child, or two
@@ -1304,19 +1304,19 @@ Pattern Matching
 @example
 @group
 ;; Anchor the child with the end of its parent.
-(compound_expression (_) @@last-child .)
+(compound_expression (_) @@last-child :anchor)
 @end group
 
 @group
 ;; Anchor the child with the beginning of its parent.
-(compound_expression . (_) @@first-child)
+(compound_expression :anchor (_) @@first-child)
 @end group
 
 @group
 ;; Anchor two adjacent children.
 (compound_expression
  (_) @@prev-child
- .
+ :anchor
  (_) @@next-child)
 @end group
 @end example
@@ -1332,8 +1332,8 @@ Pattern Matching
 @example
 @group
 (
- (array . (_) @@first (_) @@last .)
- (#equal @@first @@last)
+ (array :anchor (_) @@first (_) @@last :anchor)
+ (:equal @@first @@last)
 )
 @end group
 @end example
@@ -1341,22 +1341,23 @@ Pattern Matching
 @noindent
 tree-sitter only matches arrays where the first element equals to the
 last element.  To attach a predicate to a pattern, we need to group
-them together.  A predicate always starts with a @samp{#}.  Currently
-there are three predicates, @code{#equal}, @code{#match}, and
-@code{#pred}.
+them together.  Currently
+there are three predicates, @code{:equal}, @code{:match}, and
+@code{:pred}.
 
-@deffn Predicate equal arg1 arg2
+@deffn Predicate :equal arg1 arg2
 Matches if @var{arg1} equals to @var{arg2}.  Arguments can be either
 strings or capture names.  Capture names represent the text that the
 captured node spans in the buffer.
 @end deffn
 
-@deffn Predicate match regexp capture-name
+@deffn Predicate :match regexp capture-name
 Matches if the text that @var{capture-name}'s node spans in the buffer
-matches regular expression @var{regexp}.  Matching is case-sensitive.
+matches regular expression @var{regexp}, given as a string literal.
+Matching is case-sensitive.
 @end deffn
 
-@deffn Predicate pred fn &rest nodes
+@deffn Predicate :pred fn &rest nodes
 Matches if function @var{fn} returns non-@code{nil} when passed each
 node in @var{nodes} as arguments.  The function runs with the current
 buffer set to the buffer of node being queried.
@@ -1366,23 +1367,23 @@ Pattern Matching
 the same pattern.  Indeed, it makes little sense to refer to capture
 names in other patterns.
 
-@heading S-expression patterns
+@heading String patterns
 
-@cindex tree-sitter patterns as sexps
-@cindex patterns, tree-sitter, in sexp form
-Besides strings, Emacs provides a s-expression based syntax for
-tree-sitter patterns.  It largely resembles the string-based syntax.
-For example, the following query
+@cindex tree-sitter patterns as strings
+@cindex patterns, tree-sitter, in string form
+Besides s-expressions, Emacs also accepts tree-sitter's native query
+syntax, written as a string.  It largely resembles
+the s-expression syntax.  For example, the following query
 
 @example
 @group
 (treesit-query-capture
- node "(addition_expression
-        left: (_) @@left
-        \"+\" @@plus-sign
-        right: (_) @@right) @@addition
+ node '((addition_expression
+         left: (_) @@left
+         "+" @@plus-sign
+         right: (_) @@right) @@addition
 
-        [\"return\" \"break\"] @@keyword")
+         ["return" "break"] @@keyword))
 @end group
 @end example
 
@@ -1392,52 +1393,52 @@ Pattern Matching
 @example
 @group
 (treesit-query-capture
- node '((addition_expression
-         left: (_) @@left
-         "+" @@plus-sign
-         right: (_) @@right) @@addition
+ node "(addition_expression
+        left: (_) @@left
+        \"+\" @@plus-sign
+        right: (_) @@right) @@addition
 
-         ["return" "break"] @@keyword))
+        [\"return\" \"break\"] @@keyword")
 @end group
 @end example
 
-Most patterns can be written directly as strange but nevertheless
-valid s-expressions.  Only a few of them needs modification:
+Most patterns can be written directly as s-expressions inside a string.
+Only a few of them need modification:
 
 @itemize
 @item
-Anchor @samp{.} is written as @code{:anchor}.
+Anchor @code{:anchor} is written as @samp{.}.
 @item
-@samp{?} is written as @samp{:?}.
+@samp{:?} is written as @samp{?}.
 @item
-@samp{*} is written as @samp{:*}.
+@samp{:*} is written as @samp{*}.
 @item
-@samp{+} is written as @samp{:+}.
+@samp{:+} is written as @samp{+}.
 @item
-@code{#equal} is written as @code{:equal}.  In general, predicates
-change their @samp{#} to @samp{:}.
+@code{:equal} is written as @code{#equal}.  In general, predicates
+change their @samp{:} to @samp{#}.
 @end itemize
 
 For example,
 
 @example
 @group
-"(
-  (compound_expression . (_) @@first (_)* @@rest)
-  (#match \"love\" @@first)
-  )"
+'((
+   (compound_expression :anchor (_) @@first (_) :* @@rest)
+   (:match "love" @@first)
+   ))
 @end group
 @end example
 
 @noindent
-is written in s-expression as
+is written in string form as
 
 @example
 @group
-'((
-   (compound_expression :anchor (_) @@first (_) :* @@rest)
-   (:match "love" @@first)
-   ))
+"(
+  (compound_expression . (_) @@first (_)* @@rest)
+  (#match \"love\" @@first)
+  )"
 @end group
 @end example
 
@@ -1461,7 +1462,7 @@ Pattern Matching
 @end defun
 
 @defun treesit-query-language query
-This function return the language of @var{query}.
+This function returns the language of @var{query}.
 @end defun
 
 @defun treesit-query-expand query
@@ -1653,7 +1654,7 @@ Multiple Languages
 (setq css-range
       (treesit-query-range
        'html
-       "(style_element (raw_text) @@capture)"))
+       '((style_element (raw_text) @@capture))))
 (treesit-parser-set-included-ranges css css-range)
 @end group
 
@@ -1662,7 +1663,7 @@ Multiple Languages
 (setq js-range
       (treesit-query-range
        'html
-       "(script_element (raw_text) @@capture)"))
+       '((script_element (raw_text) @@capture))))
 (treesit-parser-set-included-ranges js js-range)
 @end group
 @end example
