From: Yuan Fu
Newsgroups: gmane.emacs.devel
Subject: Status update of tree-sitter features
Date: Wed, 28 Dec 2022 01:44:32 -0800
To: emacs-devel

Hi,

As the complete feature freeze is approaching, this is probably the last set of features added to Emacs 29. I stuffed them in just in time ;-)

1. There is a new predicate in the query language, #pred. It’s like #equal and #match. Basically, it allows you to filter the captured node with an arbitrary function. Right now there are some queries in the font-lock settings that match a little more than what we actually want. For example, for the property feature, we only want the “bb” in “aa.bb”, but not in “aa.bb(cc)”, because the latter is a method, not a property. The query usually matches both. With this new predicate we can use a function to filter out the methods.
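To make the property example concrete, here is a minimal sketch of how such a filter might look. It assumes the tree-sitter-javascript grammar and the s-expression query syntax of Emacs 29, where the predicate is spelled :pred. The names my-property-not-method-p and my-js-property-rules are hypothetical and exist only for illustration.

```elisp
;; Sketch only: the `my-' names below are hypothetical, not part of Emacs.
(require 'treesit)

(defun my-property-not-method-p (node)
  "Return non-nil if NODE, a property_identifier, is not being called.
Assumes the tree-sitter-javascript grammar, where the `bb' in
`aa.bb(cc)' sits in a member_expression whose parent is a
call_expression."
  (not (equal (treesit-node-type
               (treesit-node-parent (treesit-node-parent node)))
              "call_expression")))

(defvar my-js-property-rules
  (treesit-font-lock-rules
   :language 'javascript
   :feature 'property
   ;; Capture every property_identifier, then keep only the captures
   ;; for which the predicate function returns non-nil.
   '(((property_identifier) @font-lock-property-use-face
      (:pred my-property-not-method-p
             @font-lock-property-use-face))))
  "Font-lock rule for the property feature that skips method calls.")
```

A major mode would include rules like these in treesit-font-lock-settings. The predicate receives the captured node, so any check that can be written in Lisp can serve as the filter.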
If we can ensure that every query only captures the intended nodes, the font-lock queries can be reused for context extraction: using the query for the variable feature, I can find all the variables in a given region, etc.

2. We’ve had treesit-defun-type-regexp for a while; I recently generalized the idea into “things”. Now you can use treesit--things-around, treesit--navigate-thing, and treesit--thing-at-point to find and navigate arbitrary “things”. A “thing” is defined by a regexp that matches the node types, plus (optionally) a filter function.

3. Now there is imenu support. Major modes don’t need to define their own imenu functions anymore; they just need to set treesit-simple-imenu-settings. They also need to set treesit-defun-name-function, which is a function that finds the name of a defun node. It is used by both imenu and add-log-entry.

4. C-like modes now have adequate indent and filling for block comments.

Lastly, I want to remind everyone to update the font-lock settings for your major mode to be more compliant with the standard list of features we decided on. This is not a hard requirement and major modes are free to extend upon it, but it’s nice to be consistent, especially among built-in modes.

Here is the list, for your reference. Among all the features, I think assignment is “nice to have”; it’s fine to leave it out if there isn’t enough time. Same goes for key: it may or may not apply to a language.
Basic tokens:

delimiter             ,.;              (delimit things)
operator              == != ||         (produces a value)
bracket               []{}()
misc-punctuation

constant              true, false, null
number
keyword
comment               (includes doc-comments)
string                (includes chars and docstrings)
string-interpolation  f"text {variable}"
escape-sequence       "\n\t\\"
function              every function identifier
variable              every variable identifier
type                  every type identifier
property              a.b              <--- highlight b
key                   { a: b, c: d }   <--- highlight a, c
error                 highlight parse error

Abstract features:

assignment: the LHS of an assignment (thing being assigned to), eg:

a = b    <--- highlight a
a.b = c  <--- highlight b
a[1] = d <--- highlight a

definition: the thing being defined, eg:

int a(int b) { <--- highlight a
  return 0
}

int a; <-- highlight a

struct a { <--- highlight a
  int b;   <--- highlight b
}

As for decoration levels, this is my suggestion:

'(( comment definition)
  ( keyword string type)
  ( assignment builtin constant decorator
    escape-sequence key number property string-interpolation)
  ( bracket delimiter function misc-punctuation operator variable))

Yuan