From: "Basil L. Contovounesios" via "Bug reports for GNU Emacs, the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org>
To: 64019@debbugs.gnu.org
Cc: Yuan Fu <casouri@gmail.com>,
Theodor Thornhill <theo@thornhill.no>, Randy Taylor <dev@rjt.dev>,
Daniel Colascione <dancol@dancol.org>
Subject: bug#64019: 29.0.91; Fix some tree-sitter :match regexps
Date: Mon, 12 Jun 2023 15:25:38 +0100
Message-ID: <87o7lkzrj1.fsf@epfl.ch>
[-- Attachment #1: Type: text/plain, Size: 549 bytes --]
Tags: patch
With help from modified versions of the xr and relint packages,
I noticed some suspicious regexps in the new tree-sitter modes:
- Using bol/eol anchors where matching is performed against the whole
node text
- Shy groups probably mistyped/copied as :? instead of ?:
- Identifiers defined as comprising [A-Z_\\d], where I assume \d was
meant to match digits, but instead matches '\' or 'd'
- Unnecessary numbered grouping
These all occur in new Emacs 29 features, so the patch is intended for
emacs-29.
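For illustration, the first three points can be reproduced outside of
tree-sitter with plain string matching.  This is only a minimal sketch:
demo-match-p is a hypothetical helper that is not part of the patch, the
input strings are made up, and case-fold-search is bound to nil as an
assumption so that [A-Z] stays upper-case only.

```elisp
;; Hypothetical helper, not part of the patch.
(defun demo-match-p (regexp string)
  "Return the index where REGEXP first matches STRING, or nil.
Matching is case-sensitive (an assumption made for this sketch)."
  (let ((case-fold-search nil))
    (string-match-p regexp string)))

;; 1. ^ and $ also match next to a newline inside the string,
;;    whereas \` and \' match only at the very start and end.
(demo-match-p "^B.b$" "x\nBob\ny")                    ; => 2
(demo-match-p "\\`B.b\\'" "x\nBob\ny")                ; => nil
(demo-match-p "\\`B.b\\'" "Bob")                      ; => 0

;; 2. \(:? opens a numbered group that starts with an optional colon;
;;    \(?: opens a shy group.
(demo-match-p "\\`\\(:?NaN\\|Infinity\\)\\'" ":NaN")  ; => 0
(demo-match-p "\\`\\(?:NaN\\|Infinity\\)\\'" ":NaN")  ; => nil

;; 3. Inside a bracket expression, \d is just the characters \ and d.
(demo-match-p "\\`[A-Z_][A-Z_\\d]*\\'" "FOO1")        ; => nil
(demo-match-p "\\`[A-Z_][A-Z_\\d]*\\'" "Ad")          ; => 0
(demo-match-p "\\`[A-Z_][0-9A-Z_]*\\'" "FOO1")        ; => 0
```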
WDYT?
Thanks,
--
Basil
[-- Attachment #2: 0001-Fix-some-tree-sitter-match-regexps.patch --]
[-- Type: text/x-diff, Size: 10831 bytes --]
From df7e575393c44976a4389641975f9e4aab7efb24 Mon Sep 17 00:00:00 2001
From: "Basil L. Contovounesios" <contovob@tcd.ie>
Date: Wed, 7 Jun 2023 12:26:25 +0100
Subject: [PATCH] Fix some tree-sitter :match regexps
Some of these issues were caught by modified versions of the
GNU ELPA packages xr and relint:
- https://github.com/mattiase/xr/pull/6
- https://github.com/mattiase/relint/pull/14
* lisp/progmodes/c-ts-mode.el (c-ts-mode--font-lock-settings)
(c-ts-mode--c-or-c++-regexp):
* lisp/progmodes/cmake-ts-mode.el
(cmake-ts-mode--font-lock-settings):
* lisp/progmodes/java-ts-mode.el (java-ts-mode--font-lock-settings):
* lisp/progmodes/js.el (js--plain-method-re):
(js--treesit-font-lock-settings):
* lisp/progmodes/python.el (python--treesit-settings):
* lisp/progmodes/rust-ts-mode.el (rust-ts-mode--font-lock-settings):
* lisp/progmodes/sh-script.el (sh-mode--treesit-settings):
* lisp/progmodes/typescript-ts-mode.el
(typescript-ts-mode--font-lock-settings):
* test/src/treesit-tests.el (treesit-query-api): Replace occurrences
of [\\d], which matches '\' or 'd', with the most likely intention
[0-9]. Anchor :match regexps at beginning/end of string, not line.
Fix shy groups mistyped as optional colon.
---
lisp/progmodes/c-ts-mode.el | 4 ++--
lisp/progmodes/cmake-ts-mode.el | 3 ++-
lisp/progmodes/java-ts-mode.el | 4 ++--
lisp/progmodes/js.el | 6 +++---
lisp/progmodes/python.el | 2 +-
lisp/progmodes/rust-ts-mode.el | 19 +++++++++++--------
lisp/progmodes/sh-script.el | 2 +-
lisp/progmodes/typescript-ts-mode.el | 4 ++--
test/src/treesit-tests.el | 4 ++--
9 files changed, 26 insertions(+), 22 deletions(-)
diff --git a/lisp/progmodes/c-ts-mode.el b/lisp/progmodes/c-ts-mode.el
index c6cb9520e58..4775cbd724d 100644
--- a/lisp/progmodes/c-ts-mode.el
+++ b/lisp/progmodes/c-ts-mode.el
@@ -701,7 +701,7 @@ c-ts-mode--font-lock-settings
`(((call_expression
(call_expression function: (identifier) @fn)
@c-ts-mode--fontify-DEFUN)
- (:match "^DEFUN$" @fn))
+ (:match "\\`DEFUN\\'" @fn))
((function_definition type: (_) @for-each-tail)
@c-ts-mode--fontify-for-each-tail
@@ -1319,7 +1319,7 @@ c-ts-mode--c-or-c++-regexp
"\\|" id "::"
"\\|" id ws-maybe "=\\)"
"\\|" "\\(?:inline" ws "\\)?namespace"
- "\\(:?" ws "\\(?:" id "::\\)*" id "\\)?" ws-maybe "{"
+ "\\(?:" ws "\\(?:" id "::\\)*" id "\\)?" ws-maybe "{"
"\\|" "class" ws id
"\\(?:" ws "final" "\\)?" ws-maybe "[:{;\n]"
"\\|" "struct" ws id "\\(?:" ws "final" ws-maybe "[:{\n]"
diff --git a/lisp/progmodes/cmake-ts-mode.el b/lisp/progmodes/cmake-ts-mode.el
index d83a956af21..9d35d8077bd 100644
--- a/lisp/progmodes/cmake-ts-mode.el
+++ b/lisp/progmodes/cmake-ts-mode.el
@@ -134,7 +134,8 @@ cmake-ts-mode--font-lock-settings
:language 'cmake
:feature 'number
'(((unquoted_argument) @font-lock-number-face
- (:match "^[[:digit:]]*\\.?[[:digit:]]*\\.?[[:digit:]]+$" @font-lock-number-face)))
+ (:match "\\`[[:digit:]]*\\.?[[:digit:]]*\\.?[[:digit:]]+\\'"
+ @font-lock-number-face)))
:language 'cmake
:feature 'string
diff --git a/lisp/progmodes/java-ts-mode.el b/lisp/progmodes/java-ts-mode.el
index 44dfd74cafd..7f2fc4188a3 100644
--- a/lisp/progmodes/java-ts-mode.el
+++ b/lisp/progmodes/java-ts-mode.el
@@ -168,7 +168,7 @@ java-ts-mode--font-lock-settings
:override t
:feature 'constant
`(((identifier) @font-lock-constant-face
- (:match "^[A-Z_][A-Z_\\d]*$" @font-lock-constant-face))
+ (:match "\\`[A-Z_][0-9A-Z_]*\\'" @font-lock-constant-face))
[(true) (false)] @font-lock-constant-face)
:language 'java
:override t
@@ -237,7 +237,7 @@ java-ts-mode--font-lock-settings
(scoped_identifier (identifier) @font-lock-constant-face)
((scoped_identifier name: (identifier) @font-lock-type-face)
- (:match "^[A-Z]" @font-lock-type-face))
+ (:match "\\`[A-Z]" @font-lock-type-face))
(type_identifier) @font-lock-type-face
diff --git a/lisp/progmodes/js.el b/lisp/progmodes/js.el
index 52ed19cc682..48fecf69537 100644
--- a/lisp/progmodes/js.el
+++ b/lisp/progmodes/js.el
@@ -106,7 +106,7 @@ js--opt-cpp-start
(defconst js--plain-method-re
(concat "^\\s-*?\\(" js--dotted-name-re "\\)\\.prototype"
- "\\.\\(" js--name-re "\\)\\s-*?=\\s-*?\\(\\(:?async[ \t\n]+\\)function\\)\\_>")
+ "\\.\\(" js--name-re "\\)\\s-*?=\\s-*?\\(\\(?:async[ \t\n]+\\)function\\)\\_>")
"Regexp matching an explicit JavaScript prototype \"method\" declaration.
Group 1 is a (possibly-dotted) class name, group 2 is a method name,
and group 3 is the `function' keyword.")
@@ -3493,7 +3493,7 @@ js--treesit-font-lock-settings
:language 'javascript
:feature 'constant
'(((identifier) @font-lock-constant-face
- (:match "^[A-Z_][A-Z_\\d]*$" @font-lock-constant-face))
+ (:match "\\`[A-Z_][0-9A-Z_]*\\'" @font-lock-constant-face))
[(true) (false) (null)] @font-lock-constant-face)
@@ -3612,7 +3612,7 @@ js--treesit-font-lock-settings
:feature 'number
'((number) @font-lock-number-face
((identifier) @font-lock-number-face
- (:match "^\\(:?NaN\\|Infinity\\)$" @font-lock-number-face)))
+ (:match "\\`\\(?:NaN\\|Infinity\\)\\'" @font-lock-number-face)))
:language 'javascript
:feature 'operator
diff --git a/lisp/progmodes/python.el b/lisp/progmodes/python.el
index fd196df7550..d9ca37145e1 100644
--- a/lisp/progmodes/python.el
+++ b/lisp/progmodes/python.el
@@ -1106,7 +1106,7 @@ python--treesit-settings
:language 'python
`([,@python--treesit-keywords] @font-lock-keyword-face
((identifier) @font-lock-keyword-face
- (:match "^self$" @font-lock-keyword-face)))
+ (:match "\\`self\\'" @font-lock-keyword-face)))
:feature 'definition
:language 'python
diff --git a/lisp/progmodes/rust-ts-mode.el b/lisp/progmodes/rust-ts-mode.el
index c3cf8d0cf44..999c1d7ae96 100644
--- a/lisp/progmodes/rust-ts-mode.el
+++ b/lisp/progmodes/rust-ts-mode.el
@@ -143,7 +143,7 @@ rust-ts-mode--font-lock-settings
eol))
@font-lock-builtin-face)))
((identifier) @font-lock-type-face
- (:match "^\\(:?Err\\|Ok\\|None\\|Some\\)$" @font-lock-type-face)))
+ (:match "\\`\\(?:Err\\|Ok\\|None\\|Some\\)\\'" @font-lock-type-face)))
:language 'rust
:feature 'comment
@@ -212,11 +212,11 @@ rust-ts-mode--font-lock-settings
(scoped_use_list path: (scoped_identifier
name: (identifier) @font-lock-constant-face))
((use_as_clause alias: (identifier) @font-lock-type-face)
- (:match "^[A-Z]" @font-lock-type-face))
+ (:match "\\`[A-Z]" @font-lock-type-face))
((use_as_clause path: (identifier) @font-lock-type-face)
- (:match "^[A-Z]" @font-lock-type-face))
+ (:match "\\`[A-Z]" @font-lock-type-face))
((use_list (identifier) @font-lock-type-face)
- (:match "^[A-Z]" @font-lock-type-face))
+ (:match "\\`[A-Z]" @font-lock-type-face))
(use_wildcard [(identifier) @rust-ts-mode--fontify-scope
(scoped_identifier
name: (identifier) @rust-ts-mode--fontify-scope)])
@@ -232,9 +232,12 @@ rust-ts-mode--font-lock-settings
(type_identifier) @font-lock-type-face
((scoped_identifier name: (identifier) @rust-ts-mode--fontify-tail))
((scoped_identifier path: (identifier) @font-lock-type-face)
- (:match
- "^\\(u8\\|u16\\|u32\\|u64\\|u128\\|usize\\|i8\\|i16\\|i32\\|i64\\|i128\\|isize\\|char\\|str\\)$"
- @font-lock-type-face))
+ (:match ,(rx bos
+ (or "u8" "u16" "u32" "u64" "u128" "usize"
+ "i8" "i16" "i32" "i64" "i128" "isize"
+ "char" "str")
+ eos)
+ @font-lock-type-face))
((scoped_identifier path: (identifier) @rust-ts-mode--fontify-scope))
((scoped_type_identifier path: (identifier) @rust-ts-mode--fontify-scope))
(type_identifier) @font-lock-type-face)
@@ -249,7 +252,7 @@ rust-ts-mode--font-lock-settings
:feature 'constant
`((boolean_literal) @font-lock-constant-face
((identifier) @font-lock-constant-face
- (:match "^[A-Z][A-Z\\d_]*$" @font-lock-constant-face)))
+ (:match "\\`[A-Z][0-9A-Z_]*\\'" @font-lock-constant-face)))
:language 'rust
:feature 'variable
diff --git a/lisp/progmodes/sh-script.el b/lisp/progmodes/sh-script.el
index 54da1e0468e..9bc1f4dcfdc 100644
--- a/lisp/progmodes/sh-script.el
+++ b/lisp/progmodes/sh-script.el
@@ -3363,7 +3363,7 @@ sh-mode--treesit-settings
:feature 'number
:language 'bash
`(((word) @font-lock-number-face
- (:match "^[0-9]+$" @font-lock-number-face)))
+ (:match "\\`[0-9]+\\'" @font-lock-number-face)))
:feature 'bracket
:language 'bash
diff --git a/lisp/progmodes/typescript-ts-mode.el b/lisp/progmodes/typescript-ts-mode.el
index 1c19a031878..68aefd90f92 100644
--- a/lisp/progmodes/typescript-ts-mode.el
+++ b/lisp/progmodes/typescript-ts-mode.el
@@ -153,7 +153,7 @@ typescript-ts-mode--font-lock-settings
:language language
:feature 'constant
`(((identifier) @font-lock-constant-face
- (:match "^[A-Z_][A-Z_\\d]*$" @font-lock-constant-face))
+ (:match "\\`[A-Z_][0-9A-Z_]*\\'" @font-lock-constant-face))
[(true) (false) (null)] @font-lock-constant-face)
:language language
@@ -311,7 +311,7 @@ typescript-ts-mode--font-lock-settings
:feature 'number
`((number) @font-lock-number-face
((identifier) @font-lock-number-face
- (:match "^\\(:?NaN\\|Infinity\\)$" @font-lock-number-face)))
+ (:match "\\`\\(?:NaN\\|Infinity\\)\\'" @font-lock-number-face)))
:language language
:feature 'operator
diff --git a/test/src/treesit-tests.el b/test/src/treesit-tests.el
index fef603840f9..69db37fc0b4 100644
--- a/test/src/treesit-tests.el
+++ b/test/src/treesit-tests.el
@@ -368,14 +368,14 @@ treesit-query-api
;; String query.
'("(string) @string
(pair key: (_) @keyword)
-((_) @bob (#match \"^B.b$\" @bob))
+((_) @bob (#match \"\\\\`B.b\\\\'\" @bob))
(number) @number
((number) @n3 (#equal \"3\" @n3))
((number) @n3p (#pred treesit--ert-pred-last-sibling @n3p))"
;; Sexp query.
((string) @string
(pair key: (_) @keyword)
- ((_) @bob (:match "^B.b$" @bob))
+ ((_) @bob (:match "\\`B.b\\'" @bob))
(number) @number
((number) @n3 (:equal "3" @n3))
((number) @n3p (:pred treesit--ert-pred-last-sibling
--
2.34.1
[-- Attachment #3: Type: text/plain, Size: 3258 bytes --]
In GNU Emacs 29.0.91 (build 1, x86_64-pc-linux-gnu, X toolkit, cairo
version 1.16.0, Xaw3d scroll bars) of 2023-06-12 built on blc
Repository revision: bdb0bc2b4e44a7d40369e10e3de825d58fe46825
Repository branch: wt/emacs-29
Windowing system distributor 'The X.Org Foundation', version 11.0.12101004
System Description: Ubuntu 22.04.2 LTS
Configured using:
'configure CC=gcc-12 'CFLAGS=-Og -ggdb3' --prefix=/home/bic/.local
--with-program-suffix=-29 --with-file-notification=yes --with-x
--with-x-toolkit=lucid'
Configured features:
ACL CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GPM GSETTINGS HARFBUZZ JPEG
JSON LCMS2 LIBOTF LIBSELINUX LIBSYSTEMD LIBXML2 M17N_FLT MODULES NOTIFY
INOTIFY PDUMPER PNG RSVG SECCOMP SOUND SQLITE3 THREADS TIFF
TOOLKIT_SCROLL_BARS TREE_SITTER WEBP X11 XAW3D XDBE XIM XINPUT2 XPM
LUCID ZLIB
Important settings:
value of $LC_MONETARY: en_IE.UTF-8
value of $LC_NUMERIC: en_IE.UTF-8
value of $LC_TIME: en_IE.UTF-8
value of $LANG: en_GB.UTF-8
value of $XMODIFIERS: @im=ibus
locale-coding-system: utf-8-unix
Major mode: Lisp Interaction
Minor modes in effect:
tooltip-mode: t
global-eldoc-mode: t
eldoc-mode: t
show-paren-mode: t
electric-indent-mode: t
mouse-wheel-mode: t
tool-bar-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
line-number-mode: t
indent-tabs-mode: t
transient-mark-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
Load-path shadows:
None found.
Features:
(shadow sort mail-extr emacsbug message mailcap yank-media puny dired
dired-loaddefs rfc822 mml mml-sec password-cache epa derived epg rfc6068
epg-config gnus-util text-property-search time-date subr-x mm-decode
mm-bodies mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader
cl-loaddefs cl-lib sendmail rfc2047 rfc2045 ietf-drums mm-util
mail-prsvr mail-utils rmc iso-transl tooltip cconv eldoc paren electric
uniquify ediff-hook vc-hooks lisp-float-type elisp-mode mwheel
term/x-win x-win term/common-win x-dnd tool-bar dnd fontset image
regexp-opt fringe tabulated-list replace newcomment text-mode lisp-mode
prog-mode register page tab-bar menu-bar rfn-eshadow isearch easymenu
timer select scroll-bar mouse jit-lock font-lock syntax font-core
term/tty-colors frame minibuffer nadvice seq simple cl-generic
indonesian philippine cham georgian utf-8-lang misc-lang vietnamese
tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek
romanian slovak czech european ethiopic indian cyrillic chinese
composite emoji-zwj charscript charprop case-table epa-hook
jka-cmpr-hook help abbrev obarray oclosure cl-preloaded button loaddefs
theme-loaddefs faces cus-face macroexp files window text-properties
overlay sha1 md5 base64 format env code-pages mule custom widget keymap
hashtable-print-readable backquote threads dbusbind inotify lcms2
dynamic-setting system-font-setting font-render-setting cairo x-toolkit
xinput2 x multi-tty make-network-process emacs)
Memory information:
((conses 16 36709 7363)
(symbols 48 5149 0)
(strings 32 13887 1551)
(string-bytes 1 379631)
(vectors 16 9301)
(vector-slots 8 148632 9511)
(floats 8 23 25)
(intervals 56 248 0)
(buffers 984 10))