From: Mattias Engdegård
Newsgroups: gmane.emacs.bugs
Subject: bug#64017: Wrong conversion from Emacs to Tree-sitter S-expression syntax
Date: Mon, 12 Jun 2023 16:14:01 +0200
Message-ID: <43D49A55-2C3F-4EA4-8DF8-0CD9A516573E@gmail.com>
To: 64017@debbugs.gnu.org
Cc: Basil Contovounesios, Yuan Fu

`treesit-pattern-expand` converts a query pattern into tree-sitter S-expression syntax, as a string. The conversion mostly translates certain keywords, but the main problem is that it prints strings in Emacs syntax, which differs from that of tree-sitter.

As a consequence, :match regexps cannot contain newlines:

  (treesit-query-capture
   'java
   '(((identifier) @font-lock-constant-face
      (:match "hello\n" @font-lock-constant-face))))

signals a syntax error.

As far as I can tell the tree-sitter string syntax allows for the escape sequences:

  \n = LF
  \r = CR
  \t = TAB
  \0 = NUL (only a single 0 -- no octal escapes!)
  \X = the character X itself

Unescaped newlines result in a syntax error as seen in the example above. NULs don't seem to go well either.

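To illustrate the mismatch, here is a small sketch. It assumes the conversion prints strings roughly the way `prin1-to-string` does with default settings; the C code itself is not reproduced here, so treat that as an assumption. With those defaults the Lisp printer keeps a newline as a raw character inside the quotes:

```elisp
;; Default printer settings: the newline stays a literal character.
(let ((print-escape-newlines nil))
  (string-match-p "\n" (prin1-to-string "hello\n")))   ; non-nil: raw LF present

;; With `print-escape-newlines' bound to t the LF is printed as backslash + n.
(let ((print-escape-newlines t))
  (string-match-p "\n" (prin1-to-string "hello\n")))   ; nil: no raw LF
```

Binding `print-escape-newlines` only affects newlines and formfeeds, so on its own it would not take care of NUL, CR or TAB.
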
At the very least, the conversion should avoid literal newlines and NULs in the result (and probably CR and TAB). This cannot be done with a straight prin1-to-string.

(By the way, why is the conversion written in C? Was Lisp too slow?)

Ideally we should not need to expose the tree-sitter s-exp query syntax at all. Surely Emacs s-exps should be preferable in every case?
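
A minimal sketch of the string quoting suggested above, written in Lisp. The function name `my-treesit-quote-string` is hypothetical and not part of Emacs; the escape set is only the one listed in this report (as far as I can tell), not a confirmed specification of the tree-sitter query syntax:

```elisp
;; Hypothetical helper, not an existing Emacs function.
(defun my-treesit-quote-string (string)
  "Return STRING as a double-quoted literal using tree-sitter-style escapes.
LF, CR, TAB and NUL become \\n, \\r, \\t and \\0; double quotes and
backslashes are prefixed with a backslash; everything else is copied."
  (concat "\""
          (mapconcat (lambda (c)
                       (pcase c
                         (?\n "\\n")
                         (?\r "\\r")
                         (?\t "\\t")
                         (?\0 "\\0")
                         ((or ?\" ?\\) (string ?\\ c))
                         (_ (string c))))
                     string "")
          "\""))

;; The LF in the argument comes out as the two characters backslash and n.
(my-treesit-quote-string "hello\n")
```

The result contains no raw newline, CR, TAB or NUL, which is the minimum asked for above. Whether tree-sitter accepts every such string would still need checking against the library.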