From: Yuan Fu
Newsgroups: gmane.emacs.bugs
Subject: bug#64017: Wrong conversion from Emacs to Tree-sitter S-expression syntax
Date: Thu, 15 Jun 2023 15:08:26 -0700
References: <43D49A55-2C3F-4EA4-8DF8-0CD9A516573E@gmail.com>
In-Reply-To: <43D49A55-2C3F-4EA4-8DF8-0CD9A516573E@gmail.com>
To: Mattias Engdegård
Cc: contovob@tcd.ie, 64017@debbugs.gnu.org

Thanks for catching this.

> On Jun 12, 2023, at 7:14 AM, Mattias Engdegård wrote:
>
> `treesit-pattern-expand` converts a query pattern into tree-sitter S-expression syntax, as a string. The conversion mainly converts certain keywords but the main problem is that it prints strings in Emacs syntax which differs from that of tree-sitter.
>
> As a consequence, :match regexps cannot contain newlines:
>
>   (treesit-query-capture
>    'java
>    '(((identifier) @font-lock-constant-face
>       (:match "hello\n" @font-lock-constant-face))))
>
> signals a syntax error.
>
> As far as I can tell the tree-sitter string syntax allows for the escape sequences:
>
>   \n = LF
>   \r = CR
>   \t = TAB
>   \0 = NUL (only a single 0 -- no octal escapes!)
>   \X = the character X itself
>
> Unescaped newlines result in a syntax error as seen in the example above. NULs don't seem to go well either.
>
> At the very least, the conversion should avoid literal newlines and NULs in the result (and probably CR and TAB). This cannot be done with a straight prin1-to-string.
>
> (By the way, why is the conversion written in C? Was Lisp too slow?)

Because I wasn't sure if it’s OK for C functions to rely on Lisp functions, plus the function is simple enough. Right now, if one doesn’t load treesit.el, all the C functions work fine.

>
> Ideally we should not need to expose the tree-sitter s-exp query syntax at all. Surely Emacs s-exps should be preferable in every case?
>

It shouldn’t hurt to expose the tree-sitter sexp. Other editors mainly use the string syntax.

Yuan
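
For illustration, the escaping suggested in the quoted report could be sketched roughly as below. This is standalone C, not the code in Emacs's treesit.c; the function name `ts_query_quote` and its buffer-based interface are made up for the example, and the escape table is taken from the list quoted above (which was itself offered as "as far as I can tell"), so treat it as an assumption rather than a statement of tree-sitter's grammar.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of Emacs or tree-sitter.
   Write the LEN bytes at SRC to DST as a double-quoted string in
   tree-sitter query syntax, escaping the characters that must not
   appear literally.  Return the number of bytes written, excluding
   the terminating NUL, or (size_t) -1 if DST is too small.  */
size_t
ts_query_quote (const char *src, size_t len, char *dst, size_t dst_size)
{
  /* Worst case: every byte becomes two, plus two quotes and a NUL.  */
  if (len > (SIZE_MAX - 3) / 2 || dst_size < 2 * len + 3)
    return (size_t) -1;

  size_t out = 0;
  dst[out++] = '"';
  for (size_t i = 0; i < len; i++)
    {
      char c = src[i];
      char esc = 0;
      switch (c)
        {
        case '\n': esc = 'n'; break;   /* LF  -> \n */
        case '\r': esc = 'r'; break;   /* CR  -> \r */
        case '\t': esc = 't'; break;   /* TAB -> \t */
        case '\0': esc = '0'; break;   /* NUL -> \0 (single 0, not octal) */
        case '"':  esc = '"'; break;   /* \X = the character X itself */
        case '\\': esc = '\\'; break;
        }
      if (esc)
        {
          dst[out++] = '\\';
          dst[out++] = esc;
        }
      else
        dst[out++] = c;
    }
  dst[out++] = '"';
  dst[out] = '\0';
  return out;
}
```

The length is passed explicitly so that embedded NULs are handled rather than ending the input early. With this scheme the regexp from the example, "hello" followed by a literal newline, would come out as the seven characters `hello\n` between double quotes, with no raw newline left in the query string.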