From mboxrd@z Thu Jan 1 00:00:00 1970
From: Mattias Engdegård
Newsgroups: gmane.emacs.bugs
Subject: bug#64017: Wrong conversion from Emacs to Tree-sitter S-expression syntax
Date: Sat, 17 Jun 2023 15:30:04 +0200
Message-ID: <67D3CF20-8641-4BF6-A102-037591B6E821@gmail.com>
References: <43D49A55-2C3F-4EA4-8DF8-0CD9A516573E@gmail.com> <0CBD145C-0A92-4258-A5F3-6FC616E89ED8@gmail.com> <04C45D03-D49B-4DE4-AD26-2606C94AF260@gmail.com> <87r0qb1fdj.fsf@epfl.ch> <078740D5-8AA0-47B5-A34D-6E5C3E0C34B3@gmail.com> <83bkheqmal.fsf@gnu.org>
Mime-Version: 1.0 (Mac OS X Mail 14.0 \(3654.120.0.1.15\))
Content-Type: multipart/mixed; boundary="Apple-Mail=_2A1B46EA-512E-4800-91B8-8FDDF623A06A"
Cc: contovob@tcd.ie, casouri@gmail.com, 64017@debbugs.gnu.org
To: Eli Zaretskii
In-Reply-To: <83bkheqmal.fsf@gnu.org>
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"

--Apple-Mail=_2A1B46EA-512E-4800-91B8-8FDDF623A06A
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset=us-ascii

On 17 June 2023, at 14:57, Eli Zaretskii wrote:

>> Will do, thank you. Since this is only about documentation, perhaps it could be done in emacs-29?
>> Eli, would that be acceptable?
>
> If Yuan doesn't mind, yes. But I'd like to hear from Yuan that he is
> okay with these changes.

Attached are the changes rebased to emacs-29 (fixing mistakes found by Basil).

--Apple-Mail=_2A1B46EA-512E-4800-91B8-8FDDF623A06A
Content-Disposition: attachment; filename=treesit-doc-sexp-patterns-em29.diff
Content-Type: application/octet-stream; x-unix-mode=0644; name="treesit-doc-sexp-patterns-em29.diff"
Content-Transfer-Encoding: 7bit

diff --git a/doc/lispref/parsing.texi b/doc/lispref/parsing.texi index 3906ca0118a..9e1df07d25c 100644 --- a/doc/lispref/parsing.texi +++ b/doc/lispref/parsing.texi @@ -1084,9 +1084,9 @@ Pattern Matching @defun treesit-query-capture node query &optional beg end node-only This function matches patterns in @var{query} within @var{node}. The -argument @var{query} can be either a string, an s-expression, or a -compiled query object. 
For now, we focus on the string syntax; -s-expression syntax and compiled queries are described at the end of +argument @var{query} can be either an s-expression, a string, or a +compiled query object. For now, we focus on the s-expression syntax; +string syntax and compiled queries are described at the end of the section. The argument @var{node} can also be a parser or a language symbol. A @@ -1118,8 +1118,8 @@ Pattern Matching @example @group (setq query - "(binary_expression - (number_literal) @@number-in-exp) @@biexp") + '((binary_expression + (number_literal) @@number-in-exp) @@biexp) @end group @end example @@ -1140,8 +1140,8 @@ Pattern Matching @example @group (setq query - "(binary_expression) @@biexp - (number_literal) @@number @@biexp") + '((binary_expression) @@biexp + (number_literal) @@number @@biexp) @end group @end example @@ -1199,23 +1199,23 @@ Pattern Matching @subheading Quantify node @cindex quantify node, tree-sitter -Tree-sitter recognizes quantification operators @samp{*}, @samp{+}, -and @samp{?}. Their meanings are the same as in regular expressions: -@samp{*} matches the preceding pattern zero or more times, @samp{+} -matches one or more times, and @samp{?} matches zero or one times. +Tree-sitter recognizes quantification operators @samp{:*}, @samp{:+}, +and @samp{:?}. Their meanings are the same as in regular expressions: +@samp{:*} matches the preceding pattern zero or more times, @samp{:+} +matches one or more times, and @samp{:?} matches zero or one times. For example, the following pattern matches @code{type_declaration} nodes that have @emph{zero or more} @code{long} keywords. @example -(type_declaration "long"*) @@long-type +(type_declaration "long" :*) @@long-type @end example The following pattern matches a type declaration that may or may not have a @code{long} keyword: @example -(type_declaration "long"?) @@long-type +(type_declaration "long" :?) 
@@long-type @end example @subheading Grouping @@ -1225,15 +1225,14 @@ Pattern Matching express a comma-separated list of identifiers, one could write @example -(identifier) ("," (identifier))* +(identifier) ("," (identifier)) :* @end example @subheading Alternation Again, similar to regular expressions, we can express ``match any one -of these patterns'' in a pattern. The syntax is a list of patterns -enclosed in square brackets. For example, to capture some keywords in -C, the pattern would be +of these patterns'' in a pattern. The syntax is a vector of patterns. +For example, to capture some keywords in C, the pattern would be @example @group @@ -1248,7 +1247,7 @@ Pattern Matching @subheading Anchor -The anchor operator @samp{.} can be used to enforce juxtaposition, +The anchor operator @code{:anchor} can be used to enforce juxtaposition, i.e., to enforce two things to be directly next to each other. The two ``things'' can be two nodes, or a child and the end of its parent. For example, to capture the first child, the last child, or two @@ -1257,19 +1256,19 @@ Pattern Matching @example @group ;; Anchor the child with the end of its parent. -(compound_expression (_) @@last-child .) +(compound_expression (_) @@last-child :anchor) @end group @group ;; Anchor the child with the beginning of its parent. -(compound_expression . (_) @@first-child) +(compound_expression :anchor (_) @@first-child) @end group @group ;; Anchor two adjacent children. (compound_expression (_) @@prev-child - . + :anchor (_) @@next-child) @end group @end example @@ -1285,8 +1284,8 @@ Pattern Matching @example @group ( - (array . (_) @@first (_) @@last .) - (#equal @@first @@last) + (array :anchor (_) @@first (_) @@last :anchor) + (:equal @@first @@last) ) @end group @end example @@ -1294,22 +1293,22 @@ Pattern Matching @noindent tree-sitter only matches arrays where the first element is equal to the last element. To attach a predicate to a pattern, we need to -group them together. 
A predicate always starts with a @samp{#}. -Currently there are three predicates: @code{#equal}, @code{#match}, -and @code{#pred}. +group them together. Currently there are three predicates: +@code{:equal}, @code{:match}, and @code{:pred}. -@deffn Predicate equal arg1 arg2 +@deffn Predicate :equal arg1 arg2 Matches if @var{arg1} is equal to @var{arg2}. Arguments can be either strings or capture names. Capture names represent the text that the captured node spans in the buffer. @end deffn -@deffn Predicate match regexp capture-name +@deffn Predicate :match regexp capture-name Matches if the text that @var{capture-name}'s node spans in the buffer -matches regular expression @var{regexp}. Matching is case-sensitive. +matches regular expression @var{regexp}, given as a string literal. +Matching is case-sensitive. @end deffn -@deffn Predicate pred fn &rest nodes +@deffn Predicate :pred fn &rest nodes Matches if function @var{fn} returns non-@code{nil} when passed each node in @var{nodes} as arguments. @end deffn @@ -1318,23 +1317,23 @@ Pattern Matching the same pattern. Indeed, it makes little sense to refer to capture names in other patterns. -@heading S-expression patterns +@heading String patterns -@cindex tree-sitter patterns as sexps -@cindex patterns, tree-sitter, in sexp form -Besides strings, Emacs provides an s-expression based syntax for -tree-sitter patterns. It largely resembles the string-based syntax. -For example, the following query +@cindex tree-sitter patterns as strings +@cindex patterns, tree-sitter, in string form +Besides s-expressions, Emacs allows the tree-sitter's native query +syntax to be used by writing them as strings. It largely resembles +the s-expression syntax. 
For example, the following query @example @group (treesit-query-capture - node "(addition_expression - left: (_) @@left - \"+\" @@plus-sign - right: (_) @@right) @@addition + node '((addition_expression + left: (_) @@left + "+" @@plus-sign + right: (_) @@right) @@addition - [\"return\" \"break\"] @@keyword") + ["return" "break"] @@keyword)) @end group @end example @@ -1344,52 +1343,53 @@ Pattern Matching @example @group (treesit-query-capture - node '((addition_expression - left: (_) @@left - "+" @@plus-sign - right: (_) @@right) @@addition + node "(addition_expression + left: (_) @@left + \"+\" @@plus-sign + right: (_) @@right) @@addition - ["return" "break"] @@keyword)) + [\"return\" \"break\"] @@keyword") @end group @end example -Most patterns can be written directly as strange but nevertheless -valid s-expressions. Only a few of them need modification: +Most patterns can be written directly as s-expressions inside a string. +Only a few of them need modification: @itemize @item -Anchor @samp{.} is written as @code{:anchor}. +Anchor @code{:anchor} is written as @samp{.}. @item -@samp{?} is written as @samp{:?}. +@samp{:?} is written as @samp{?}. @item -@samp{*} is written as @samp{:*}. +@samp{:*} is written as @samp{*}. @item -@samp{+} is written as @samp{:+}. +@samp{:+} is written as @samp{+}. @item -@code{#equal} is written as @code{:equal}. In general, predicates -change their @samp{#} to @samp{:}. +@code{:equal}, @code{:match} and @code{:pred} are written as +@code{#equal}, @code{#match} and @code{#pred}, respectively. +In general, predicates change their @samp{:} to @samp{#}. @end itemize For example, @example @group -"( - (compound_expression . 
(_) @@first (_)* @@rest) - (#match \"love\" @@first) - )" +'(( + (compound_expression :anchor (_) @@first (_) :* @@rest) + (:match "love" @@first) + )) @end group @end example @noindent -is written in s-expression syntax as +is written in string form as @example @group -'(( - (compound_expression :anchor (_) @@first (_) :* @@rest) - (:match "love" @@first) - )) +"( + (compound_expression . (_) @@first (_)* @@rest) + (#match \"love\" @@first) + )" @end group @end example @@ -1413,7 +1413,7 @@ Pattern Matching @end defun @defun treesit-query-language query -This function return the language of @var{query}. +This function returns the language of @var{query}. @end defun @defun treesit-query-expand query @@ -1605,7 +1605,7 @@ Multiple Languages (setq css-range (treesit-query-range 'html - "(style_element (raw_text) @@capture)")) + '((style_element (raw_text) @@capture)))) (treesit-parser-set-included-ranges css css-range) @end group @@ -1614,7 +1614,7 @@ Multiple Languages (setq js-range (treesit-query-range 'html - "(script_element (raw_text) @@capture)")) + '((script_element (raw_text) @@capture)))) (treesit-parser-set-included-ranges js js-range) @end group @end example --Apple-Mail=_2A1B46EA-512E-4800-91B8-8FDDF623A06A--