unofficial mirror of emacs-devel@gnu.org 
* Tree sitter support for C-like languages
@ 2022-11-10 17:45 Theodor Thornhill via Emacs development discussions.
From: Theodor Thornhill via Emacs development discussions. @ 2022-11-10 17:45 UTC
  To: emacs-devel; +Cc: casouri


Hi all,

See the attached patch for support for several C-like languages.

They all support:
- Font locking
- Indentation (with styles for C/C++)
- Movement
- Imenu
- Which-func
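
For anyone who wants to try the indentation styles, here is a minimal
sketch of an init-file fragment.  It assumes the patch is applied on
feature/tree-sitter and that the tree-sitter C grammar is installed.
The option names come from the patch; `my-c-ts-indent-rules' is a
hypothetical name used only for illustration.

```elisp
;; Sketch only: assumes this patch is applied and the C grammar is installed.
(setq c-ts-mode-indent-offset 4)      ; defcustom added by the patch
(setq c-ts-mode-indent-style 'linux)  ; one of gnu, k&r, linux, bsd

;; Alternatively, supply a function.  `my-c-ts-indent-rules' is a
;; hypothetical name.  It returns only the rules; the patch's
;; `c-ts-mode--set-indent-style' adds the language key itself.
(defun my-c-ts-indent-rules ()
  "Return a small rule list for use in `treesit-simple-indent-rules'."
  '(((parent-is "translation_unit") parent-bol 0)
    ((node-is "}") parent-bol 0)
    ((parent-is "compound_statement") parent-bol c-ts-mode-indent-offset)
    (no-node parent-bol c-ts-mode-indent-offset)))

;; To use the function instead of a named style:
;; (setq c-ts-mode-indent-style #'my-c-ts-indent-rules)
```

The rules are computed when the mode is enabled, so these settings
need to be in place before visiting a C file.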

These modes are meant as a supplement to tree-sitter.

I'd welcome constructive criticism and some testing.  This patch needs
to be applied to the feature/tree-sitter branch, and I hope it can land
there before we merge the branch to master, well before Emacs 29 is
cut.

I hope you like it,

Theo



[-- Attachment #2: 0001-Add-Tree-sitter-modes-for-C-like-languages.patch --]
[-- Type: text/x-diff, Size: 54077 bytes --]

From 5f00f6c0f28f9113aded33a968c3694e9e698f98 Mon Sep 17 00:00:00 2001
From: Theodor Thornhill <theo@thornhill.no>
Date: Thu, 10 Nov 2022 17:15:49 +0100
Subject: [PATCH] Add Tree sitter modes for C-like languages

* etc/NEWS: Mention the new modes.
* lisp/progmodes/c++-ts-mode.el: New major mode with Tree Sitter support.
* lisp/progmodes/c-ts-mode.el: New major mode with Tree Sitter support.
* lisp/progmodes/java-ts-mode.el: New major mode with Tree Sitter support.
* lisp/progmodes/json-ts-mode.el: New major mode with Tree Sitter support.
* lisp/progmodes/css-ts-mode.el: New major mode with Tree Sitter support.
---
 etc/NEWS                       |  28 ++-
 lisp/progmodes/c++-ts-mode.el  | 407 +++++++++++++++++++++++++++++++++
 lisp/progmodes/c-ts-mode.el    | 396 ++++++++++++++++++++++++++++++++
 lisp/progmodes/css-ts-mode.el  | 122 ++++++++++
 lisp/progmodes/java-ts-mode.el | 282 +++++++++++++++++++++++
 lisp/progmodes/json-ts-mode.el | 141 ++++++++++++
 6 files changed, 1374 insertions(+), 2 deletions(-)
 create mode 100644 lisp/progmodes/c++-ts-mode.el
 create mode 100644 lisp/progmodes/c-ts-mode.el
 create mode 100644 lisp/progmodes/css-ts-mode.el
 create mode 100644 lisp/progmodes/java-ts-mode.el
 create mode 100644 lisp/progmodes/json-ts-mode.el

diff --git a/etc/NEWS b/etc/NEWS
index 9ed78bc6b3..3ce9810ece 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -2786,8 +2786,32 @@ when visiting JSON files.
 ** New mode 'ts-mode'.
 Support is added for TypeScript, based on the new integration with
 Tree-Sitter. There's support for font-locking, indentation and
-navigation.  Tree-Sitter is required for this mode to function, but if
-it is not available, we will default to use 'js-mode'.
+navigation.  Tree-Sitter is required for this mode to function.
+
+** New mode 'c-ts-mode'.
+Support is added for C, based on the new integration with
+Tree-Sitter. There's support for font-locking, indentation and
+navigation.  Tree-Sitter is required for this mode to function.
+
+** New mode 'c++-ts-mode'.
+Support is added for C++, based on the new integration with
+Tree-Sitter. There's support for font-locking, indentation and
+navigation.  Tree-Sitter is required for this mode to function.
+
+** New mode 'java-ts-mode'.
+Support is added for Java, based on the new integration with
+Tree-Sitter. There's support for font-locking, indentation and
+navigation.  Tree-Sitter is required for this mode to function.
+
+** New mode 'css-ts-mode'.
+Support is added for CSS, based on the new integration with
+Tree-Sitter. There's support for font-locking, indentation and
+navigation.  Tree-Sitter is required for this mode to function.
+
+** New mode 'json-ts-mode'.
+Support is added for JSON, based on the new integration with
+Tree-Sitter. There's support for font-locking, indentation and
+navigation.  Tree-Sitter is required for this mode to function.
 
 \f
 * Incompatible Lisp Changes in Emacs 29.1
diff --git a/lisp/progmodes/c++-ts-mode.el b/lisp/progmodes/c++-ts-mode.el
new file mode 100644
index 0000000000..09457ec368
--- /dev/null
+++ b/lisp/progmodes/c++-ts-mode.el
@@ -0,0 +1,407 @@
+;;; c++-ts-mode.el --- tree sitter support for C++  -*- lexical-binding: t; -*-
+
+;; Copyright (C) 2022 Free Software Foundation, Inc.
+
+;; Author     : Theodor Thornhill <theo@thornhill.no>
+;; Maintainer : Theodor Thornhill <theo@thornhill.no>
+;; Created    : November 2022
+;; Keywords   : c++ languages tree-sitter
+
+;; This file is part of GNU Emacs.
+
+;; This program is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; This program is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+;;; Code:
+
+(require 'treesit)
+(require 'rx)
+
+(defcustom c++-ts-mode-indent-offset 2
+  "Number of spaces for each indentation step in `c++-ts-mode'."
+  :type 'integer
+  :safe 'integerp
+  :group 'cpp)
+
+
+(defcustom c++-ts-mode-indent-style 'gnu
+  "Style used for indentation.
+
+The selected style could be one of GNU, K&R, LINUX or BSD.  If
+one of the supplied styles doesn't suffice a function could be
+set instead.  This function is expected to return a list that
+follows the form of `treesit-simple-indent-rules'."
+  :type '(choice (const :tag "GNU" gnu)
+                 (const :tag "K&R" k&r)
+                 (const :tag "Linux" linux)
+                 (const :tag "BSD" bsd)
+                 (function :tag "A function for user customized style" ignore))
+  :group 'cpp)
+
+(defvar c++-ts-mode--syntax-table
+  (let ((table (make-syntax-table)))
+    ;; Taken from the cc-langs version
+    (modify-syntax-entry ?_  "_"     table)
+    (modify-syntax-entry ?\\ "\\"    table)
+    (modify-syntax-entry ?+  "."     table)
+    (modify-syntax-entry ?-  "."     table)
+    (modify-syntax-entry ?=  "."     table)
+    (modify-syntax-entry ?%  "."     table)
+    (modify-syntax-entry ?<  "."     table)
+    (modify-syntax-entry ?>  "."     table)
+    (modify-syntax-entry ?&  "."     table)
+    (modify-syntax-entry ?|  "."     table)
+    (modify-syntax-entry ?\' "\""    table)
+    (modify-syntax-entry ?\240 "."   table)
+    (modify-syntax-entry ?/  ". 124b" table)
+    (modify-syntax-entry ?*  ". 23"   table)
+    table)
+  "Syntax table for `c++-ts-mode'.")
+
+(defvar c++-ts-mode--indent-styles
+  (let ((common
+         `(((parent-is "translation_unit") parent-bol 0)
+           ((node-is ")") parent 1)
+           ((node-is "]") parent-bol 0)
+           ((node-is "else") parent-bol 0)
+           ((node-is "case") parent-bol 0)
+           ((node-is "comment") no-indent)
+           ((parent-is "comment") no-indent)
+           ((node-is "labeled_statement") parent-bol 0)
+           ((parent-is "labeled_statement") parent-bol c++-ts-mode-indent-offset)
+           ((match "preproc_ifdef" "compound_statement") point-min 0)
+           ((match "#endif" "preproc_ifdef") point-min 0)
+           ((match "preproc_if" "compound_statement") point-min 0)
+           ((match "#endif" "preproc_if") point-min 0)
+           ((match "preproc_function_def" "compound_statement") point-min 0)
+           ((match "preproc_call" "compound_statement") point-min 0)
+           ((parent-is "field_declaration_list") parent-bol c++-ts-mode-indent-offset)
+           ((node-is "field_initializer_list") parent-bol ,(* c++-ts-mode-indent-offset 2))
+           ((parent-is "function_definition") parent-bol 0)
+           ((parent-is "conditional_expression") first-sibling 0)
+           ((parent-is "assignment_expression") parent-bol c++-ts-mode-indent-offset)
+           ((parent-is "comma_expression") first-sibling 0)
+           ((parent-is "init_declarator") parent-bol c++-ts-mode-indent-offset)
+           ((parent-is "parenthesized_expression") first-sibling 1)
+           ((parent-is "argument_list") first-sibling 1)
+           ((parent-is "parameter_list") first-sibling 1)
+           ((parent-is "binary_expression") parent 0)
+           ((query "(for_statement initializer: (_) @indent)") parent-bol 5)
+           ((query "(for_statement condition: (_) @indent)") parent-bol 5)
+           ((query "(for_statement update: (_) @indent)") parent-bol 5)
+           ((query "(call_expression arguments: (_) @indent)") parent c++-ts-mode-indent-offset)
+           ((parent-is "call_expression") parent 0)
+           ((parent-is "enumerator_list") parent-bol c++-ts-mode-indent-offset)
+           ((parent-is "initializer_list") parent-bol c++-ts-mode-indent-offset))))
+    `((gnu
+       ,@common
+       ((node-is "}") parent-bol 0)
+       ((parent-is "compound_statement") parent c++-ts-mode-indent-offset)
+       ((parent-is "if_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "for_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "while_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "switch_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "case_statement") parent-bol c++-ts-mode-indent-offset)
+       ((match "while" "do_statement") parent 0)
+       ((parent-is "do_statement") parent-bol c++-ts-mode-indent-offset)
+       (no-node parent-bol c++-ts-mode-indent-offset))
+      (k&r
+       ,@common
+       ((node-is "}") grand-parent 0)
+       ((parent-is "compound_statement") grand-parent c++-ts-mode-indent-offset)
+       ((parent-is "if_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "for_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "while_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "switch_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "case_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "do_statement") parent-bol c++-ts-mode-indent-offset)
+       (no-node parent-bol c++-ts-mode-indent-offset))
+      (linux
+       ,@common
+       ((node-is "}") grand-parent 0)
+       ((parent-is "compound_statement") grand-parent c++-ts-mode-indent-offset)
+       ((parent-is "if_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "for_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "while_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "switch_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "case_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "do_statement") parent-bol c++-ts-mode-indent-offset)
+       (no-node parent-bol c++-ts-mode-indent-offset))
+      (bsd
+       ,@common
+       ((node-is "}") parent-bol 0)
+       ((parent-is "compound_statement") parent 0)
+       ((parent-is "if_statement") parent-bol 0)
+       ((parent-is "for_statement") parent-bol 0)
+       ((parent-is "while_statement") parent-bol 0)
+       ((parent-is "switch_statement") parent-bol 0)
+       ((parent-is "case_statement") parent-bol 0)
+       ((parent-is "do_statement") parent-bol 0)
+       (no-node parent-bol c++-ts-mode-indent-offset))))
+  "Indent rules supported by `c++-ts-mode'.")
+
+(defun c++-ts-mode--set-indent-style ()
+  "Helper function to set indentation style."
+  (let ((style
+         (if (functionp c++-ts-mode-indent-style)
+             (funcall c++-ts-mode-indent-style)
+           (pcase c++-ts-mode-indent-style
+             ('gnu   (alist-get 'gnu c++-ts-mode--indent-styles))
+             ('k&r   (alist-get 'k&r c++-ts-mode--indent-styles))
+             ('bsd   (alist-get 'bsd c++-ts-mode--indent-styles))
+             ('linux (alist-get 'linux c++-ts-mode--indent-styles))))))
+    `((cpp ,@style))))
+
+(defvar c++-ts-mode--keywords
+  '("and" "and_eq" "bitand" "bitor" "catch" "class" "co_await"
+    "co_return" "co_yield" "compl" "concept" "consteval" "constexpr"
+    "constinit" "decltype" "delete" "else" "explicit" "final" "for"
+    "friend" "if" "mutable" "namespace" "new" "noexcept"
+    "not" "not_eq" "operator" "or" "or_eq" "override" "private"
+    "protected" "public" "requires" "return" "static" "struct"
+    "template" "throw" "try" "typename" "using" "virtual" "xor" "xor_eq"
+    "switch" "case")
+  "C++ keywords for tree-sitter font-locking.")
+
+(defvar c++-ts-mode--preproc-keywords
+  '("#define" "#if" "#ifdef" "#ifndef" "#else" "#elif" "#endif" "#include")
+  "C++ preprocessor keywords for tree-sitter font-locking.")
+
+(defvar c++-ts-mode--operators
+  '("=" "-" "*" "/" "+" "%" "~" "|" "&" "^" "<<" ">>" "->"
+    "." "<" "<=" ">=" ">" "==" "!=" "!" "&&" "||" "-="
+    "+=" "*=" "/=" "%=" "|=" "&=" "^=" ">>=" "<<=" "--" "++")
+  "C++ operators for tree-sitter font-locking.")
+
+(defvar c++-ts-mode--font-lock-settings
+  (treesit-font-lock-rules
+   :language 'cpp
+   :override t
+   :feature 'comment
+   `((comment) @font-lock-comment-face)
+   :language 'cpp
+   :override t
+   :feature 'preprocessor
+   `((preproc_directive) @font-lock-preprocessor-face
+
+     (preproc_def
+      name: (identifier) @font-lock-variable-name-face)
+
+     (preproc_ifdef
+      name: (identifier) @font-lock-variable-name-face)
+
+     (preproc_function_def
+      name: (identifier) @font-lock-function-name-face)
+
+     (preproc_params
+      (identifier) @font-lock-variable-name-face)
+
+     (preproc_defined) @font-lock-preprocessor-face
+     (preproc_defined (identifier) @font-lock-variable-name-face)
+     [,@c++-ts-mode--preproc-keywords] @font-lock-preprocessor-face
+     )
+   :language 'cpp
+   :override t
+   :feature 'constant
+   `((true) @font-lock-constant-face
+     (false) @font-lock-constant-face
+     (null) @font-lock-constant-face
+     (this) @font-lock-constant-face)
+   :language 'cpp
+   :override t
+   :feature 'keyword
+   `([,@c++-ts-mode--keywords] @font-lock-keyword-face
+     (auto) @font-lock-keyword-face)
+   :language 'cpp
+   :override t
+   :feature 'operator
+   `([,@c++-ts-mode--operators] @font-lock-builtin-face)
+   :language 'cpp
+   :override t
+   :feature 'string
+   `((string_literal) @font-lock-string-face
+     (system_lib_string) @font-lock-string-face
+     (escape_sequence) @font-lock-string-face)
+   :language 'cpp
+   :override t
+   :feature 'literal
+   `((number_literal) @font-lock-constant-face
+     (char_literal) @font-lock-constant-face)
+   :language 'cpp
+   :override t
+   :feature 'type
+   '((primitive_type) @font-lock-type-face
+     (type_qualifier) @font-lock-type-face
+
+     (qualified_identifier
+      scope: (namespace_identifier) @font-lock-type-face)
+
+     (operator_cast type: (type_identifier) @font-lock-type-face))
+   :language 'cpp
+   :override t
+   :feature 'definition
+   `((declaration
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (declaration
+      type: (type_identifier) @font-lock-type-face)
+
+     (field_declaration
+      declarator: (field_identifier) @font-lock-variable-name-face)
+
+     (parameter_declaration
+      type: (type_identifier) @font-lock-type-face)
+
+     (function_definition
+      type: (type_identifier) @font-lock-type-face)
+
+     (function_declarator
+      declarator: (identifier) @font-lock-function-name-face)
+
+     (array_declarator
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (init_declarator
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (struct_specifier
+      name: (type_identifier) @font-lock-type-face)
+
+     (sized_type_specifier) @font-lock-type-face
+
+     (enum_specifier
+      name: (type_identifier) @font-lock-type-face)
+
+     (enumerator
+      name: (identifier) @font-lock-variable-name-face)
+
+     (parameter_declaration
+      type: (_) @font-lock-type-face
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (pointer_declarator
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (pointer_declarator
+      declarator: (field_identifier) @font-lock-variable-name-face))
+   :language 'cpp
+   :override t
+   :feature 'expression
+   '((assignment_expression
+      left: (identifier) @font-lock-variable-name-face)
+
+     (call_expression
+      function: (identifier) @font-lock-function-name-face)
+
+     (field_expression
+      field: (field_identifier) @font-lock-variable-name-face)
+
+     (field_expression
+      argument: (identifier) @font-lock-variable-name-face
+      field: (field_identifier) @font-lock-variable-name-face)
+
+     (pointer_expression
+      argument: (identifier) @font-lock-variable-name-face))
+   :language 'cpp
+   :override t
+   :feature 'statement
+   '((expression_statement (identifier) @font-lock-variable-name-face)
+     (labeled_statement
+      label: (statement_identifier) @font-lock-type-face))
+   :language 'cpp
+   :override t
+   :feature 'error
+   '((ERROR) @font-lock-warning-face))
+  "Tree-sitter font-lock settings.")
+
+(defun c++-ts-mode--imenu-1 (node)
+  (let* ((ts-node (car node))
+         (subtrees (mapcan #'c++-ts-mode--imenu-1 (cdr node)))
+         (name (when ts-node
+                 (or (treesit-node-text
+                      (or (treesit-node-child-by-field-name
+                           ts-node "declarator")
+                          (treesit-node-child-by-field-name
+                           ts-node "name"))
+                      t)
+                     "Unnamed node")))
+         (marker (when ts-node
+                   (set-marker (make-marker)
+                               (treesit-node-start ts-node)))))
+    (cond
+     ((null ts-node) subtrees)
+     (subtrees
+      `((,name ,(cons name marker) ,@subtrees)))
+     (t
+      `((,name . ,marker))))))
+
+(defun c++-ts-mode--imenu ()
+  "Return Imenu alist for the current buffer."
+  (let* ((node (treesit-buffer-root-node))
+         (tree (treesit-induce-sparse-tree
+                node (rx (or "function_definition"
+                             "struct_specifier")))))
+    (c++-ts-mode--imenu-1 tree)))
+
+;;;###autoload
+(add-to-list 'auto-mode-alist '("\\.hpp\\'" . c++-ts-mode))
+
+;;;###autoload
+(add-to-list 'auto-mode-alist '("\\.cpp\\'" . c++-ts-mode))
+
+;;;###autoload
+(define-derived-mode c++-ts-mode prog-mode "C++"
+  "Major mode for editing C++, powered by Tree Sitter."
+  :group 'cpp
+  :syntax-table c++-ts-mode--syntax-table
+
+  (unless (treesit-ready-p nil 'cpp)
+    (error "Tree Sitter for C++ isn't available"))
+
+  (treesit-parser-create 'cpp)
+
+  ;; Comments.
+  (setq-local comment-start "// ")
+  (setq-local comment-start-skip "\\(?://+\\|/\\*+\\)\\s *")
+  (setq-local comment-end "")
+
+  ;; Indent.
+  (when (eq c++-ts-mode-indent-style 'linux)
+    (setq-local indent-tabs-mode t))
+  (setq-local treesit-simple-indent-rules
+              (c++-ts-mode--set-indent-style))
+
+  ;; Navigation.
+  (setq-local treesit-defun-type-regexp
+              (rx (or "specifier"
+                      "definition")))
+
+  ;; Electric
+  (setq-local electric-indent-chars
+	      (append "{}():;," electric-indent-chars))
+
+  ;; Font-lock.
+  (setq-local treesit-font-lock-settings c++-ts-mode--font-lock-settings)
+  (setq-local treesit-font-lock-feature-list
+              '((comment preprocessor operator constant string literal keyword)
+                (type definition expression statement)
+                (error)))
+
+  ;; Imenu.
+  (setq-local imenu-create-index-function #'c++-ts-mode--imenu)
+  (setq-local which-func-functions nil) ;; Piggyback on imenu
+  (treesit-major-mode-setup))
+
+(provide 'c++-ts-mode)
+
+;;; c++-ts-mode.el ends here
diff --git a/lisp/progmodes/c-ts-mode.el b/lisp/progmodes/c-ts-mode.el
new file mode 100644
index 0000000000..9da20485af
--- /dev/null
+++ b/lisp/progmodes/c-ts-mode.el
@@ -0,0 +1,396 @@
+;;; c-ts-mode.el --- tree sitter support for C  -*- lexical-binding: t; -*-
+
+;; Copyright (C) 2022 Free Software Foundation, Inc.
+
+;; Author     : Theodor Thornhill <theo@thornhill.no>
+;; Maintainer : Theodor Thornhill <theo@thornhill.no>
+;; Created    : November 2022
+;; Keywords   : c languages tree-sitter
+
+;; This file is part of GNU Emacs.
+
+;; This program is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; This program is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+;;; Code:
+
+(require 'treesit)
+(require 'rx)
+(require 'pcase)
+
+(defcustom c-ts-mode-indent-offset 2
+  "Number of spaces for each indentation step in `c-ts-mode'."
+  :type 'integer
+  :safe 'integerp
+  :group 'c)
+
+(defcustom c-ts-mode-indent-style 'gnu
+  "Style used for indentation.
+
+The selected style could be one of GNU, K&R, LINUX or BSD.  If
+one of the supplied styles doesn't suffice a function could be
+set instead.  This function is expected to return a list that
+follows the form of `treesit-simple-indent-rules'."
+  :type '(choice (const :tag "GNU" gnu)
+                 (const :tag "K&R" k&r)
+                 (const :tag "Linux" linux)
+                 (const :tag "BSD" bsd)
+                 (function :tag "A function for user customized style" ignore))
+  :group 'c)
+
+(defvar c-ts-mode--syntax-table
+  (let ((table (make-syntax-table)))
+    ;; Taken from the cc-langs version
+    (modify-syntax-entry ?_  "_"     table)
+    (modify-syntax-entry ?\\ "\\"    table)
+    (modify-syntax-entry ?+  "."     table)
+    (modify-syntax-entry ?-  "."     table)
+    (modify-syntax-entry ?=  "."     table)
+    (modify-syntax-entry ?%  "."     table)
+    (modify-syntax-entry ?<  "."     table)
+    (modify-syntax-entry ?>  "."     table)
+    (modify-syntax-entry ?&  "."     table)
+    (modify-syntax-entry ?|  "."     table)
+    (modify-syntax-entry ?\' "\""    table)
+    (modify-syntax-entry ?\240 "."   table)
+    (modify-syntax-entry ?/  ". 124b" table)
+    (modify-syntax-entry ?*  ". 23"   table)
+    table)
+  "Syntax table for `c-ts-mode'.")
+
+(defvar c-ts-mode--indent-styles
+  (let ((common
+         `(((parent-is "translation_unit") parent-bol 0)
+           ((node-is ")") parent 1)
+           ((node-is "]") parent-bol 0)
+           ((node-is "else") parent-bol 0)
+           ((node-is "case") parent-bol 0)
+           ((node-is "comment") no-indent)
+           ((parent-is "comment") no-indent)
+           ((node-is "labeled_statement") parent-bol 0)
+           ((parent-is "labeled_statement") parent-bol c-ts-mode-indent-offset)
+           ((match "preproc_ifdef" "compound_statement") point-min 0)
+           ((match "#endif" "preproc_ifdef") point-min 0)
+           ((match "preproc_if" "compound_statement") point-min 0)
+           ((match "#endif" "preproc_if") point-min 0)
+           ((match "preproc_function_def" "compound_statement") point-min 0)
+           ((match "preproc_call" "compound_statement") point-min 0)
+           ((parent-is "function_definition") parent-bol 0)
+           ((parent-is "conditional_expression") first-sibling 0)
+           ((parent-is "assignment_expression") parent-bol c-ts-mode-indent-offset)
+           ((parent-is "comma_expression") first-sibling 0)
+           ((parent-is "init_declarator") parent-bol c-ts-mode-indent-offset)
+           ((parent-is "parenthesized_expression") first-sibling 1)
+           ((parent-is "argument_list") first-sibling 1)
+           ((parent-is "parameter_list") first-sibling 1)
+           ((parent-is "binary_expression") parent 0)
+           ((query "(for_statement initializer: (_) @indent)") parent-bol 5)
+           ((query "(for_statement condition: (_) @indent)") parent-bol 5)
+           ((query "(for_statement update: (_) @indent)") parent-bol 5)
+           ((query "(call_expression arguments: (_) @indent)") parent c-ts-mode-indent-offset)
+           ((parent-is "call_expression") parent 0)
+           ((parent-is "enumerator_list") parent-bol c-ts-mode-indent-offset)
+           ((parent-is "field_declaration_list") parent-bol c-ts-mode-indent-offset)
+           ((parent-is "initializer_list") parent-bol c-ts-mode-indent-offset))))
+    `((gnu
+       ,@common
+       ((node-is "}") parent-bol 0)
+       ((parent-is "compound_statement") parent c-ts-mode-indent-offset)
+       ((parent-is "if_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "for_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "while_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "switch_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "case_statement") parent-bol c-ts-mode-indent-offset)
+       ((match "while" "do_statement") parent 0)
+       ((parent-is "do_statement") parent-bol c-ts-mode-indent-offset)
+       (no-node parent-bol c-ts-mode-indent-offset))
+      (k&r
+       ,@common
+       ((node-is "}") grand-parent 0)
+       ((parent-is "compound_statement") grand-parent c-ts-mode-indent-offset)
+       ((parent-is "if_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "for_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "while_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "switch_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "case_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "do_statement") parent-bol c-ts-mode-indent-offset)
+       (no-node parent-bol c-ts-mode-indent-offset))
+      (linux
+       ,@common
+       ((node-is "}") grand-parent 0)
+       ((parent-is "compound_statement") grand-parent c-ts-mode-indent-offset)
+       ((parent-is "if_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "for_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "while_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "switch_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "case_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "do_statement") parent-bol c-ts-mode-indent-offset)
+       (no-node parent-bol c-ts-mode-indent-offset))
+      (bsd
+       ,@common
+       ((node-is "}") parent-bol 0)
+       ((parent-is "compound_statement") parent 0)
+       ((parent-is "if_statement") parent-bol 0)
+       ((parent-is "for_statement") parent-bol 0)
+       ((parent-is "while_statement") parent-bol 0)
+       ((parent-is "switch_statement") parent-bol 0)
+       ((parent-is "case_statement") parent-bol 0)
+       ((parent-is "do_statement") parent-bol 0)
+       (no-node parent-bol c-ts-mode-indent-offset))))
+  "Indent rules supported by `c-ts-mode'.")
+
+(defun c-ts-mode--set-indent-style ()
+  "Helper function to set indentation style."
+  (let ((style
+         (if (functionp c-ts-mode-indent-style)
+             (funcall c-ts-mode-indent-style)
+           (pcase c-ts-mode-indent-style
+             ('gnu   (alist-get 'gnu c-ts-mode--indent-styles))
+             ('k&r   (alist-get 'k&r c-ts-mode--indent-styles))
+             ('bsd   (alist-get 'bsd c-ts-mode--indent-styles))
+             ('linux (alist-get 'linux c-ts-mode--indent-styles))))))
+    `((c ,@style))))
+
+(defvar c-ts-mode--keywords
+  '("const" "default" "enum" "extern" "inline" "static"
+    "struct" "typedef" "union" "volatile" "goto" "register"
+    "sizeof" "return"
+    "while" "for" "do" "continue" "break"
+    "if" "else" "case" "switch")
+  "C keywords for tree-sitter font-locking.")
+
+(defvar c-ts-mode--preproc-keywords
+  '("#define" "#if" "#ifdef" "#ifndef" "#else" "#elif" "#endif"
+    "#include")
+  "C preprocessor keywords for tree-sitter font-locking.")
+
+(defvar c-ts-mode--operators
+  '("=" "-" "*" "/" "+" "%" "~" "|" "&" "^" "<<" ">>" "->"
+    "." "<" "<=" ">=" ">" "==" "!=" "!" "&&" "||" "-="
+    "+=" "*=" "/=" "%=" "|=" "&=" "^=" ">>=" "<<=" "--" "++")
+  "C operators for tree-sitter font-locking.")
+
+(defvar c-ts-mode--font-lock-settings
+  (treesit-font-lock-rules
+   :language 'c
+   :override t
+   :feature 'comment
+   `((comment) @font-lock-comment-face)
+   :language 'c
+   :override t
+   :feature 'preprocessor
+   `((preproc_directive) @font-lock-preprocessor-face
+
+     (preproc_def
+      name: (identifier) @font-lock-variable-name-face)
+
+     (preproc_ifdef
+      name: (identifier) @font-lock-variable-name-face)
+
+     (preproc_function_def
+      name: (identifier) @font-lock-function-name-face)
+
+     (preproc_params
+      (identifier) @font-lock-variable-name-face)
+
+     (preproc_defined) @font-lock-preprocessor-face
+     (preproc_defined (identifier) @font-lock-variable-name-face)
+     [,@c-ts-mode--preproc-keywords] @font-lock-preprocessor-face)
+   :language 'c
+   :override t
+   :feature 'constant
+   `((true) @font-lock-constant-face
+     (false) @font-lock-constant-face
+     (null) @font-lock-constant-face)
+   :language 'c
+   :override t
+   :feature 'keyword
+   `([,@c-ts-mode--keywords] @font-lock-keyword-face)
+   :language 'c
+   :override t
+   :feature 'operator
+   `([,@c-ts-mode--operators] @font-lock-builtin-face)
+   :language 'c
+   :override t
+   :feature 'string
+   `((string_literal) @font-lock-string-face
+     (system_lib_string) @font-lock-string-face
+     (escape_sequence) @font-lock-string-face)
+   :language 'c
+   :override t
+   :feature 'literal
+   `((number_literal) @font-lock-constant-face
+     (char_literal) @font-lock-constant-face)
+   :language 'c
+   :override t
+   :feature 'type
+   '((primitive_type) @font-lock-type-face)
+   :language 'c
+   :override t
+   :feature 'definition
+   `((declaration
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (declaration
+      type: (type_identifier) @font-lock-type-face)
+
+     (field_declaration
+      declarator: (field_identifier) @font-lock-variable-name-face)
+
+     (field_declaration
+      type: (type_identifier) @font-lock-type-face)
+
+     (parameter_declaration
+      type: (type_identifier) @font-lock-type-face)
+
+     (function_definition
+      type: (type_identifier) @font-lock-type-face)
+
+     (function_declarator
+      declarator: (identifier) @font-lock-function-name-face)
+
+     (array_declarator
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (init_declarator
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (struct_specifier
+      name: (type_identifier) @font-lock-type-face)
+
+     (sized_type_specifier) @font-lock-type-face
+
+     (enum_specifier
+      name: (type_identifier) @font-lock-type-face)
+
+     (enumerator
+      name: (identifier) @font-lock-variable-name-face)
+
+     (parameter_declaration
+      type: (_) @font-lock-type-face
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (pointer_declarator
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (pointer_declarator
+      declarator: (field_identifier) @font-lock-variable-name-face))
+   :language 'c
+   :override t
+   :feature 'expression
+   '((assignment_expression
+      left: (identifier) @font-lock-variable-name-face)
+
+     (call_expression
+      function: (identifier) @font-lock-function-name-face)
+
+     (field_expression
+      field: (field_identifier) @font-lock-variable-name-face)
+
+     (field_expression
+      argument: (identifier) @font-lock-variable-name-face
+      field: (field_identifier) @font-lock-variable-name-face)
+
+     (pointer_expression
+      argument: (identifier) @font-lock-variable-name-face))
+   :language 'c
+   :override t
+   :feature 'statement
+   '((expression_statement (identifier) @font-lock-variable-name-face))
+   :language 'c
+   :override t
+   :feature 'error
+   '((ERROR) @font-lock-warning-face))
+  "Tree-sitter font-lock settings.")
+
+(defun c-ts-mode--imenu-1 (node)
+  (let* ((ts-node (car node))
+         (subtrees (mapcan #'c-ts-mode--imenu-1 (cdr node)))
+         (name (when ts-node
+                 (or (treesit-node-text
+                      (or (treesit-node-child-by-field-name
+                           ts-node "declarator")
+                          (treesit-node-child-by-field-name
+                           ts-node "name"))
+                      t)
+                     "Unnamed node")))
+         (marker (when ts-node
+                   (set-marker (make-marker)
+                               (treesit-node-start ts-node)))))
+    (cond
+     ((null ts-node) subtrees)
+     (subtrees
+      `((,name ,(cons name marker) ,@subtrees)))
+     (t
+      `((,name . ,marker))))))
+
+(defun c-ts-mode--imenu ()
+  "Return Imenu alist for the current buffer."
+  (let* ((node (treesit-buffer-root-node))
+         (tree (treesit-induce-sparse-tree
+                node (rx (or "function_definition"
+                             "struct_specifier")))))
+    (c-ts-mode--imenu-1 tree)))
+
+;;;###autoload
+(add-to-list 'auto-mode-alist '("\\.h\\'" . c-ts-mode))
+
+;;;###autoload
+(add-to-list 'auto-mode-alist '("\\.c\\'" . c-ts-mode))
+
+;;;###autoload
+(define-derived-mode c-ts-mode prog-mode "C"
+  "Major mode for editing C, powered by Tree Sitter."
+  :group 'c
+  :syntax-table c-ts-mode--syntax-table
+
+  (unless (treesit-ready-p nil 'c)
+    (error "Tree Sitter for C isn't available"))
+
+  (treesit-parser-create 'c)
+
+  ;; Comments.
+  (setq-local comment-start "// ")
+  (setq-local comment-start-skip "\\(?://+\\|/\\*+\\)\\s *")
+  (setq-local comment-end "")
+
+  ;; Indent.
+  (when (eq c-ts-mode-indent-style 'linux)
+    (setq-local indent-tabs-mode t))
+  (setq-local treesit-simple-indent-rules
+              (c-ts-mode--set-indent-style))
+
+  ;; Navigation.
+  (setq-local treesit-defun-type-regexp
+              (rx (or "specifier"
+                      "definition")))
+
+  ;; Electric
+  (setq-local electric-indent-chars
+	      (append "{}():;," electric-indent-chars))
+
+  ;; Font-lock.
+  (setq-local treesit-font-lock-settings c-ts-mode--font-lock-settings)
+  (setq-local treesit-font-lock-feature-list
+              '((comment preprocessor operator constant string literal keyword)
+                (type definition expression statement)
+                (error)))
+
+  ;; Imenu.
+  (setq-local imenu-create-index-function #'c-ts-mode--imenu)
+  (setq-local which-func-functions nil) ;; Piggyback on imenu
+  (treesit-major-mode-setup))
+
+(provide 'c-ts-mode)
+
+;;; c-ts-mode.el ends here
diff --git a/lisp/progmodes/css-ts-mode.el b/lisp/progmodes/css-ts-mode.el
new file mode 100644
index 0000000000..0cf2fbb689
--- /dev/null
+++ b/lisp/progmodes/css-ts-mode.el
@@ -0,0 +1,122 @@
+;;; css-ts-mode.el --- tree sitter support for CSS  -*- lexical-binding: t; -*-
+
+;; Copyright (C) Theodor Thornhill
+
+;; Author     : Theodor Thornhill <theo@thornhill.no>
+;; Maintainer : Theodor Thornhill <theo@thornhill.no>
+;; Created    : November 2022
+
+;; This file is part of GNU Emacs.
+
+;; This program is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; This program is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+(require 'treesit)
+(require 'css-mode)
+
+(defcustom css-ts-mode-indent-offset 2
+  "Number of spaces for each indentation step in `css-ts-mode'."
+  :type 'integer
+  :safe 'integerp
+  :group 'css)
+
+(defvar css-ts-mode--indent-rules
+  `((css
+     ((node-is "}") parent-bol 0)
+     ((node-is ")") parent-bol 0)
+     ((node-is "]") parent-bol 0)
+
+     ((parent-is "block") parent-bol css-ts-mode-indent-offset)
+     ((parent-is "arguments") parent-bol css-ts-mode-indent-offset)
+     ((parent-is "declaration") parent-bol css-ts-mode-indent-offset))))
+
+(defvar css-ts-mode--settings
+  (treesit-font-lock-rules
+   :language 'css
+   :feature 'basic
+   :override t
+   `(
+     (unit) @font-lock-constant-face
+     (integer_value) @font-lock-builtin-face
+     (float_value) @font-lock-builtin-face
+     (plain_value) @font-lock-variable-name-face
+     (comment) @font-lock-comment-face
+     (class_selector) @css-selector
+     (child_selector) @css-selector
+     (id_selector) @css-selector
+     (tag_name) @css-selector
+     (property_name) @css-property
+     (class_name) @css-selector
+     (function_name) @font-lock-function-name-face)))
+
+(defun css-ts-mode--imenu-1 (node)
+  (let* ((ts-node (car node))
+         (subtrees (mapcan #'css-ts-mode--imenu-1 (cdr node)))
+         (name (when ts-node
+                 (if (equal (treesit-node-type ts-node) "tag_name")
+                     (treesit-node-text ts-node)
+                   (treesit-node-text (treesit-node-child ts-node 1) t))))
+         (marker (when ts-node
+                   (set-marker (make-marker)
+                               (treesit-node-start ts-node)))))
+    (cond
+     ((null ts-node) subtrees)
+     (subtrees
+      `((,name ,(cons name marker) ,@subtrees)))
+     (t
+      `((,name . ,marker))))))
+
+(defun css-ts-mode--imenu ()
+  "Return Imenu alist for the current buffer."
+  (let* ((node (treesit-buffer-root-node))
+         (tree (treesit-induce-sparse-tree
+                node (rx (or "class_selector"
+                             "id_selector"
+                             "tag_name")))))
+    (css-ts-mode--imenu-1 tree)))
+
+(define-derived-mode css-ts-mode prog-mode "CSS"
+  "Major mode for editing CSS, powered by Tree Sitter."
+  :group 'css
+  :syntax-table css-mode-syntax-table
+
+  (unless (treesit-ready-p nil 'css)
+    (error "Tree Sitter for CSS isn't available"))
+
+  (treesit-parser-create 'css)
+
+  ;; Comments
+  (setq-local comment-start "/*")
+  (setq-local comment-start-skip "/\\*+[ \t]*")
+  (setq-local comment-end "*/")
+  (setq-local comment-end-skip "[ \t]*\\*+/")
+
+  ;; Indent.
+  (setq-local treesit-simple-indent-rules css-ts-mode--indent-rules)
+
+  ;; Navigation.
+  (setq-local treesit-defun-type-regexp
+              (rx (or "rule_set")))
+  ;; Font-lock.
+  (setq-local treesit-font-lock-settings css-ts-mode--settings)
+  (setq-local treesit-font-lock-feature-list '((basic) () ()))
+
+  ;; Imenu.
+  (setq-local imenu-create-index-function #'css-ts-mode--imenu)
+  (setq-local which-func-functions nil) ;; Piggyback on imenu
+
+  (treesit-major-mode-setup))
+
+(provide 'css-ts-mode)
+
+;;; css-ts-mode.el ends here
diff --git a/lisp/progmodes/java-ts-mode.el b/lisp/progmodes/java-ts-mode.el
new file mode 100644
index 0000000000..0a930f1a65
--- /dev/null
+++ b/lisp/progmodes/java-ts-mode.el
@@ -0,0 +1,282 @@
+;;; java-ts-mode.el --- tree sitter support for Java  -*- lexical-binding: t; -*-
+
+;; Copyright (C) 2022 Free Software Foundation, Inc.
+
+;; Author     : Theodor Thornhill <theo@thornhill.no>
+;; Maintainer : Theodor Thornhill <theo@thornhill.no>
+;; Created    : November 2022
+;; Keywords   : java languages tree-sitter
+
+;; This file is part of GNU Emacs.
+
+;; This program is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; This program is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+;;; Code:
+
+(require 'treesit)
+(require 'rx)
+
+(defcustom java-ts-mode-indent-offset 4
+  "Number of spaces for each indentation step in `java-ts-mode'."
+  :type 'integer
+  :safe 'integerp
+  :group 'java)
+
+(defvar java-ts-mode--syntax-table
+  (let ((table (make-syntax-table)))
+    ;; Taken from the cc-langs version
+    (modify-syntax-entry ?_  "_"     table)
+    (modify-syntax-entry ?\\ "\\"    table)
+    (modify-syntax-entry ?+  "."     table)
+    (modify-syntax-entry ?-  "."     table)
+    (modify-syntax-entry ?=  "."     table)
+    (modify-syntax-entry ?%  "."     table)
+    (modify-syntax-entry ?<  "."     table)
+    (modify-syntax-entry ?>  "."     table)
+    (modify-syntax-entry ?&  "."     table)
+    (modify-syntax-entry ?|  "."     table)
+    (modify-syntax-entry ?\' "\""    table)
+    (modify-syntax-entry ?\240 "."   table)
+    table)
+  "Syntax table for `java-ts-mode'.")
+
+(defvar java-ts-mode--indent-rules
+  `((java
+     ((parent-is "program") parent-bol 0)
+     ((node-is "}") (and parent parent-bol) 0)
+     ((node-is ")") parent-bol 0)
+     ((node-is "]") parent-bol 0)
+     ((parent-is "class_body") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "interface_body") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "constructor_body") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "enum_body") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "switch_block") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "record_declaration_body") parent-bol java-ts-mode-indent-offset)
+     ((query "(method_declaration (block _ @indent))") parent-bol java-ts-mode-indent-offset)
+     ((query "(method_declaration (block (_) @indent))") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "variable_declarator") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "method_invocation") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "switch_rule") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "ternary_expression") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "element_value_array_initializer") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "function_definition") parent-bol 0)
+     ((parent-is "conditional_expression") first-sibling 0)
+     ((parent-is "assignment_expression") parent-bol 2)
+     ((parent-is "binary_expression") parent 0)
+     ((parent-is "parenthesized_expression") first-sibling 1)
+     ((parent-is "argument_list") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "annotation_argument_list") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "modifiers") parent-bol 0)
+     ((parent-is "formal_parameters") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "formal_parameter") parent-bol 0)
+     ((parent-is "init_declarator") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "if_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "for_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "while_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "switch_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "case_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "labeled_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "do_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "block") (and parent parent-bol) java-ts-mode-indent-offset)))
+  "Tree-sitter indent rules.")
+
+(defvar java-ts-mode--keywords
+  '("abstract" "assert" "break" "case" "catch"
+    "class" "continue" "default" "do" "else"
+    "enum" "exports" "extends" "final" "finally"
+    "for" "if" "implements" "import" "instanceof"
+    "interface" "module" "native" "new" "non-sealed"
+    "open" "opens" "package" "private" "protected"
+    "provides" "public" "requires" "return" "sealed"
+    "static" "strictfp" "switch" "synchronized"
+    "throw" "throws" "to" "transient" "transitive"
+    "try" "uses" "volatile" "while" "with" "record")
+  "Java keywords for tree-sitter font-locking.")
+
+(defvar java-ts-mode--operators
+  '("@" "+" ":" "++" "-" "--" "&" "&&" "|" "||"
+    "!=" "==" "*" "/" "%" "<" "<=" ">" ">=" "="
+    "-=" "+=" "*=" "/=" "%=" "->" "^" "^=" "&="
+    "|=" "~" ">>" ">>>" "<<" "::" "?")
+  "Java operators for tree-sitter font-locking.")
+
+(defvar java-ts-mode--font-lock-settings
+  (treesit-font-lock-rules
+   :language 'java
+   :override t
+   :feature 'basic
+   '((identifier) @font-lock-variable-name-face)
+   :language 'java
+   :override t
+   :feature 'comment
+   `((line_comment) @font-lock-comment-face
+     (block_comment) @font-lock-comment-face)
+   :language 'java
+   :override t
+   :feature 'constant
+   `(((identifier) @font-lock-constant-face
+      (:match "^[A-Z_][A-Z_0-9]*$" @font-lock-constant-face))
+     (true) @font-lock-constant-face
+     (false) @font-lock-constant-face)
+   :language 'java
+   :override t
+   :feature 'keyword
+   `([,@java-ts-mode--keywords] @font-lock-keyword-face
+     (labeled_statement
+      (identifier) @font-lock-keyword-face))
+   :language 'java
+   :override t
+   :feature 'operator
+   `([,@java-ts-mode--operators] @font-lock-builtin-face)
+   :language 'java
+   :override t
+   :feature 'annotation
+   `((annotation
+      name: (identifier) @font-lock-constant-face)
+
+     (marker_annotation
+      name: (identifier) @font-lock-constant-face))
+   :language 'java
+   :override t
+   :feature 'string
+   `((string_literal) @font-lock-string-face)
+   :language 'java
+   :override t
+   :feature 'literal
+   `((null_literal) @font-lock-constant-face
+     (decimal_floating_point_literal)  @font-lock-constant-face
+     (hex_floating_point_literal) @font-lock-constant-face)
+   :language 'java
+   :override t
+   :feature 'type
+   '((interface_declaration
+      name: (identifier) @font-lock-type-face)
+
+     (class_declaration
+      name: (identifier) @font-lock-type-face)
+
+     (record_declaration
+      name: (identifier) @font-lock-type-face)
+
+     (enum_declaration
+      name: (identifier) @font-lock-type-face)
+
+     (constructor_declaration
+      name: (identifier) @font-lock-type-face)
+
+     (field_access
+      object: (identifier) @font-lock-type-face)
+
+     (method_reference (identifier) @font-lock-type-face)
+
+     ((scoped_identifier name: (identifier) @font-lock-type-face)
+      (:match "^[A-Z]" @font-lock-type-face))
+
+     (type_identifier) @font-lock-type-face
+
+     [(boolean_type)
+      (integral_type)
+      (floating_point_type)
+      (void_type)] @font-lock-type-face)
+   :language 'java
+   :override t
+   :feature 'definition
+   `((method_declaration
+      name: (identifier) @font-lock-function-name-face)
+
+     (formal_parameter
+      name: (identifier) @font-lock-variable-name-face)
+
+     (catch_formal_parameter
+      name: (identifier) @font-lock-variable-name-face))
+   :language 'java
+   :override t
+   :feature 'expression
+   '((method_invocation
+      object: (identifier) @font-lock-variable-name-face)
+
+     (method_invocation
+      name: (identifier) @font-lock-function-name-face)
+
+     (argument_list (identifier) @font-lock-variable-name-face)))
+  "Tree-sitter font-lock settings.")
+
+(defun java-ts-mode--imenu-1 (node)
+  (let* ((ts-node (car node))
+         (subtrees (mapcan #'java-ts-mode--imenu-1 (cdr node)))
+         (name (when ts-node
+                 (or (treesit-node-text
+                      (or (treesit-node-child-by-field-name
+                           ts-node "name"))
+                      t)
+                     "Unnamed node")))
+         (marker (when ts-node
+                   (set-marker (make-marker)
+                               (treesit-node-start ts-node)))))
+    (cond
+     ((null ts-node) subtrees)
+     (subtrees
+      `((,name ,(cons name marker) ,@subtrees)))
+     (t
+      `((,name . ,marker))))))
+
+(defun java-ts-mode--imenu ()
+  "Return Imenu alist for the current buffer."
+  (let* ((node (treesit-buffer-root-node))
+         (tree (treesit-induce-sparse-tree
+                node (rx (or "class_declaration"
+                             "interface_declaration"
+                             "enum_declaration"
+                             "record_declaration"
+                             "method_declaration")))))
+    (java-ts-mode--imenu-1 tree)))
+
+;;;###autoload
+(define-derived-mode java-ts-mode prog-mode "Java"
+  "Major mode for editing Java, powered by Tree Sitter."
+  :group 'java
+  :syntax-table java-ts-mode--syntax-table
+
+  (unless (treesit-ready-p nil 'java)
+    (error "Tree Sitter for Java isn't available"))
+
+  (treesit-parser-create 'java)
+
+  ;; Comments.
+  (setq-local comment-start "// ")
+  (setq-local comment-start-skip "\\(?://+\\|/\\*+\\)\\s *")
+  (setq-local comment-end "")
+
+  ;; Indent.
+  (setq-local treesit-simple-indent-rules java-ts-mode--indent-rules)
+
+  ;; Navigation.
+  (setq-local treesit-defun-type-regexp
+              (rx (or "declaration")))
+
+  ;; Font-lock.
+  (setq-local treesit-font-lock-settings java-ts-mode--font-lock-settings)
+  (setq-local treesit-font-lock-feature-list
+              '((basic comment keyword constant string operator)
+                (type definition expression literal annotation)
+                ()))
+
+  ;; Imenu.
+  (setq-local imenu-create-index-function #'java-ts-mode--imenu)
+  (setq-local which-func-functions nil) ;; Piggyback on imenu
+  (treesit-major-mode-setup))
+
+(provide 'java-ts-mode)
+
+;;; java-ts-mode.el ends here
diff --git a/lisp/progmodes/json-ts-mode.el b/lisp/progmodes/json-ts-mode.el
new file mode 100644
index 0000000000..ab693f513a
--- /dev/null
+++ b/lisp/progmodes/json-ts-mode.el
@@ -0,0 +1,141 @@
+;;; json-ts-mode.el --- tree sitter support for JSON  -*- lexical-binding: t; -*-
+
+;; Copyright (C) 2022 Free Software Foundation, Inc.
+
+;; Author     : Theodor Thornhill <theo@thornhill.no>
+;; Maintainer : Theodor Thornhill <theo@thornhill.no>
+;; Created    : November 2022
+;; Keywords   : json languages tree-sitter
+
+;; This file is part of GNU Emacs.
+
+;; This program is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; This program is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+;;; Code:
+
+(require 'treesit)
+(require 'rx)
+
+(defcustom json-ts-mode-indent-offset 2
+  "Number of spaces for each indentation step in `json-ts-mode'."
+  :type 'integer
+  :safe 'integerp
+  :group 'json)
+
+(defvar json-ts-mode--syntax-table
+  (let ((table (make-syntax-table)))
+    ;; Taken from the cc-langs version
+    (modify-syntax-entry ?_  "_"     table)
+    (modify-syntax-entry ?$ "_"      table)
+    (modify-syntax-entry ?\\ "\\"    table)
+    (modify-syntax-entry ?+  "."     table)
+    (modify-syntax-entry ?-  "."     table)
+    (modify-syntax-entry ?=  "."     table)
+    (modify-syntax-entry ?%  "."     table)
+    (modify-syntax-entry ?<  "."     table)
+    (modify-syntax-entry ?>  "."     table)
+    (modify-syntax-entry ?&  "."     table)
+    (modify-syntax-entry ?|  "."     table)
+    (modify-syntax-entry ?` "\""     table)
+    (modify-syntax-entry ?\240 "."   table)
+    table)
+  "Syntax table for `json-ts-mode'.")
+
+
+(defvar json-ts--indent-rules
+  `((json
+     ((node-is "}") parent-bol 0)
+     ((node-is ")") parent-bol 0)
+     ((node-is "]") parent-bol 0)
+     ((parent-is "object") parent-bol json-ts-mode-indent-offset))))
+
+(defvar json-ts-mode--font-lock-settings
+  (treesit-font-lock-rules
+   :language 'json
+   :feature 'minimal
+   :override t
+   `(
+     (pair
+      key: (_) @font-lock-string-face)
+
+     (string) @font-lock-string-face
+
+     (number) @font-lock-constant-face
+
+     [(null) (true) (false)] @font-lock-constant-face
+
+     (escape_sequence) @font-lock-constant-face
+
+     (comment) @font-lock-comment-face))
+  "Font-lock settings for JSON.")
+
+(defun json-ts-mode--imenu-1 (node)
+  (let* ((ts-node (car node))
+         (subtrees (mapcan #'json-ts-mode--imenu-1 (cdr node)))
+         (name (when ts-node
+                 (treesit-node-text
+                  (or (treesit-node-child-by-field-name
+                       ts-node "key"))
+                  t)))
+         (marker (when ts-node
+                   (set-marker (make-marker)
+                               (treesit-node-start ts-node)))))
+    (cond
+     ((null ts-node) subtrees)
+     (subtrees
+      `((,name ,(cons name marker) ,@subtrees)))
+     (t
+      `((,name . ,marker))))))
+
+(defun json-ts-mode--imenu ()
+  "Return Imenu alist for the current buffer."
+  (let* ((node (treesit-buffer-root-node))
+         (tree (treesit-induce-sparse-tree
+                node (rx (or "pair")))))
+    (json-ts-mode--imenu-1 tree)))
+
+;;;###autoload
+(define-derived-mode json-ts-mode prog-mode "JSON"
+  "Major mode for editing JSON, powered by Tree Sitter."
+  :group 'json
+  :syntax-table json-ts-mode--syntax-table
+
+  (unless (treesit-ready-p nil 'json)
+    (error "Tree Sitter for JSON isn't available"))
+
+  (treesit-parser-create 'json)
+  ;; Comments.
+  (setq-local comment-start "// ")
+  (setq-local comment-start-skip "\\(?://+\\|/\\*+\\)\\s *")
+  (setq-local comment-end "")
+  ;; Indent.
+  (setq-local treesit-simple-indent-rules json-ts--indent-rules)
+  ;; Navigation.
+  (setq-local treesit-defun-type-regexp
+              (rx (or "pair"
+                      "object")))
+  ;; Font-lock.
+  (setq-local treesit-font-lock-settings json-ts-mode--font-lock-settings)
+  (setq-local treesit-font-lock-feature-list
+              '((minimal) () ()))
+
+  ;; Imenu.
+  (setq-local imenu-create-index-function #'json-ts-mode--imenu)
+  (setq-local which-func-functions nil) ;; Piggyback on imenu
+
+  (treesit-major-mode-setup))
+
+(provide 'json-ts-mode)
+
+;;; json-ts-mode.el ends here
-- 
2.34.1



* Re: Tree sitter support for C-like languages
  2022-11-10 17:45 Tree sitter support for C-like languages Theodor Thornhill via Emacs development discussions.
@ 2022-11-10 18:03 ` Stefan Monnier
  2022-11-10 18:18   ` Eli Zaretskii
  2022-11-10 18:19   ` Theodor Thornhill
  2022-11-10 22:58 ` Yuan Fu
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 83+ messages in thread
From: Stefan Monnier @ 2022-11-10 18:03 UTC (permalink / raw)
  To: Theodor Thornhill via Emacs development discussions.
  Cc: Theodor Thornhill, casouri

> See the attached patch for support for several C-like languages.
>
> They all support:
> - Font locking
> - Indentation (with styles for c/c++)
> - Movement
> - Imenu
> - Which-func
>
> These modes are meant as a supplement to tree-sitter.

Thanks.

Have you tried to replace `c-mode` with a simple dispatch function that
either delegates to `c-ts-mode` or to `cc-c-mode`?
(and same for `c++-mode`, of course)
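
A minimal sketch of what such a dispatch function might look like. All `my-` names are hypothetical, and `cc-c-mode` does not exist yet, so the sketch falls back to the current `c-mode` in its place. It assumes `c-ts-mode` from the patch is loadable and that `treesit-available-p` reports whether tree-sitter support was compiled in.

```elisp
;; Sketch only: `my-c-prefer-tree-sitter', `my-c-mode-choose' and
;; `my-c-mode-dispatch' are hypothetical names, not part of the patch.

(defvar my-c-prefer-tree-sitter nil
  "Hypothetical user option: non-nil means prefer `c-ts-mode'.")

(defun my-c-mode-choose (prefer-ts ts-usable)
  "Return the symbol of the mode to delegate to.
PREFER-TS is the user's preference; TS-USABLE says whether the
tree-sitter mode can actually run in this session."
  (if (and prefer-ts ts-usable) 'c-ts-mode 'c-mode))

(defun my-c-mode-dispatch ()
  "Turn on either `c-ts-mode' or the CC Mode `c-mode'."
  (interactive)
  (funcall (my-c-mode-choose
            my-c-prefer-tree-sitter
            ;; Guard every tree-sitter call so the function also
            ;; works in a build without tree-sitter support.
            (and (fboundp 'c-ts-mode)
                 (fboundp 'treesit-available-p)
                 (treesit-available-p)))))
```

Keeping the decision in a pure helper makes the policy easy to test without a grammar installed; the same shape would apply to `c++-mode`.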


        Stefan





* Re: Tree sitter support for C-like languages
  2022-11-10 18:03 ` Stefan Monnier
@ 2022-11-10 18:18   ` Eli Zaretskii
  2022-11-10 18:19   ` Theodor Thornhill
  1 sibling, 0 replies; 83+ messages in thread
From: Eli Zaretskii @ 2022-11-10 18:18 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel, theo, casouri

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Theodor Thornhill <theo@thornhill.no>,  casouri@gmail.com
> Date: Thu, 10 Nov 2022 13:03:52 -0500
> 
> Have you tried to replace `c-mode` with a simple dispatch function that
> either delegates to `c-ts-mode` or to `cc-c-mode`?
> (and same for `c++-mode`, of course)

I think we shouldn't rush with such changes.  I think Emacs 29 should
have the tree-sitter based modes as optional features that users
should actively opt in.  We should defer automation like the one you
propose for later, once we have more user experience and feedback.
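
As an illustration of what active opt-in could look like on the user's side, here is a minimal init-file sketch. It assumes the patch is applied, so that `java-ts-mode` and `json-ts-mode` exist; the choice of file extensions is only an example.

```elisp
;; Sketch of a per-user opt-in, e.g. in an init file.  Nothing changes
;; for users who do not add these lines.
(add-to-list 'auto-mode-alist '("\\.java\\'" . java-ts-mode))
(add-to-list 'auto-mode-alist '("\\.json\\'" . json-ts-mode))
```

Because `add-to-list` puts new entries at the front, these associations take precedence over any existing entries for the same patterns.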




* Re: Tree sitter support for C-like languages
  2022-11-10 18:03 ` Stefan Monnier
  2022-11-10 18:18   ` Eli Zaretskii
@ 2022-11-10 18:19   ` Theodor Thornhill
  1 sibling, 0 replies; 83+ messages in thread
From: Theodor Thornhill @ 2022-11-10 18:19 UTC (permalink / raw)
  To: Stefan Monnier, Theodor Thornhill via Emacs development discussions.
  Cc: casouri



On 10 November 2022 19:03:52 CET, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
>> See the attached patch for support for several C-like languages.
>>
>> They all support:
>> - Font locking
>> - Indentation (with styles for c/c++)
>> - Movement
>> - Imenu
>> - Which-func
>>
>> These modes are meant as a supplement to tree-sitter.
>
>Thanks.
>
>Have you tried to replace `c-mode` with a simple dispatch function that
>either delegates to `c-ts-mode` or to `cc-c-mode`?
>(and same for `c++-mode`, of course)
>
>
>        Stefan
>

No, not yet. I believe Eli thinks that's a little premature. If we want that, I can make such an attempt. I'll let the grown-ups decide :)




* Re: Tree sitter support for C-like languages
  2022-11-10 17:45 Tree sitter support for C-like languages Theodor Thornhill via Emacs development discussions.
  2022-11-10 18:03 ` Stefan Monnier
@ 2022-11-10 22:58 ` Yuan Fu
  2022-11-11  5:48   ` Theodor Thornhill
  2022-11-11  6:01   ` Theodor Thornhill via Emacs development discussions.
  2022-11-11  0:43 ` Randy Taylor
  2022-11-16 17:51 ` Yuan Fu
  3 siblings, 2 replies; 83+ messages in thread
From: Yuan Fu @ 2022-11-10 22:58 UTC (permalink / raw)
  To: Theodor Thornhill; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1006 bytes --]



> On Nov 10, 2022, at 9:45 AM, Theodor Thornhill <theo@thornhill.no> wrote:
> 
> 
> 
> Hi all,
> 
> See the attached patch for support for several C-like languages.
> 
> They all support:
> - Font locking
> - Indentation (with styles for c/c++)
> - Movement
> - Imenu
> - Which-func
> 
> These modes are meant as a supplement to tree-sitter.
> 
> I'm hopeful for some constructive criticism, and some testing.  This
> patch needs to be applied to the feature/tree-sitter branch, and should
> hopefully be applied there before we merge the branch to master, well
> before Emacs 29 is cut.
> 
> I hope you like it,
> 
> Theo

This is fantastic! I’m trying them out right now :-)

Some things I noticed:

The indentation for the closing bracket of a struct is off:

struct regexp_cache
{
  struct regexp_cache *next;
  };

Imenu has some duplicate entries, the patch below should fix that.

I also added the new contextual thingy to font-lock settings.

Yuan


[-- Attachment #2: cc-modes.diff --]
[-- Type: application/octet-stream, Size: 1329 bytes --]

diff --git a/lisp/progmodes/c-ts-mode.el b/lisp/progmodes/c-ts-mode.el
index 9da20485af4..b48ab667311 100644
--- a/lisp/progmodes/c-ts-mode.el
+++ b/lisp/progmodes/c-ts-mode.el
@@ -185,7 +185,8 @@ c-ts-mode--font-lock-settings
    :language 'c
    :override t
    :feature 'comment
-   `((comment) @font-lock-comment-face)
+   `((comment) @font-lock-comment-face
+     (comment) @contextual)
    :language 'c
    :override t
    :feature 'preprocessor
@@ -224,6 +225,7 @@ c-ts-mode--font-lock-settings
    :override t
    :feature 'string
    `((string_literal) @font-lock-string-face
+     ((string_literal)) @contextual
      (system_lib_string) @font-lock-string-face
      (escape_sequence) @font-lock-string-face)
    :language 'c
@@ -327,7 +329,14 @@ c-ts-mode--imenu-1
          (marker (when ts-node
                    (set-marker (make-marker)
                                (treesit-node-start ts-node)))))
+    ;; A struct_specifier could be inside a parameter list or another
+    ;; struct definition.  In those cases we don't include it.
     (cond
+     ((string-match-p
+       (rx (or "parameter" "field") "_declaration")
+       (or (treesit-node-type (treesit-node-parent ts-node))
+           ""))
+      nil)
      ((null ts-node) subtrees)
      (subtrees
       `((,name ,(cons name marker) ,@subtrees)))


* Re: Tree sitter support for C-like languages
  2022-11-10 17:45 Tree sitter support for C-like languages Theodor Thornhill via Emacs development discussions.
  2022-11-10 18:03 ` Stefan Monnier
  2022-11-10 22:58 ` Yuan Fu
@ 2022-11-11  0:43 ` Randy Taylor
  2022-11-11  5:50   ` Theodor Thornhill
  2022-11-16 17:51 ` Yuan Fu
  3 siblings, 1 reply; 83+ messages in thread
From: Randy Taylor @ 2022-11-11  0:43 UTC (permalink / raw)
  To: Theodor Thornhill; +Cc: emacs-devel, casouri

On Thursday, November 10th, 2022 at 12:45, Theodor Thornhill via "Emacs development discussions." <emacs-devel@gnu.org> wrote:

> 
> Hi all,
> 
> See the attached patch for support for several C-like languages.
> 
> They all support:
> - Font locking
> - Indentation (with styles for c/c++)
> - Movement
> - Imenu
> - Which-func
> 
> These modes are meant as a supplement to tree-sitter.
> 
> I'm hopeful for some constructive criticism, and some testing. This
> patch needs to be applied to the feature/tree-sitter branch, and should
> hopefully be applied there before we merge the branch to master, well
> before Emacs 29 is cut.
> 
> I hope you like it,
> 
> Theo

Thanks Theo! I was actually just about to email you about this stuff.

I will test drive the C and C++ stuff.

A few comments:
- (This one is for everyone) For lists of things (like keywords, etc.) and the font-lock rules, can we consider alphabetizing them (for the font-lock rules it would be by feature)? It would make them a lot easier to scan for what's there/not there and give them some order (and we could then be consistent everywhere). I am happy to send a patch for this.
- I recently added some new font-lock faces for tree-sitter that you may wish to make use of: http://git.savannah.gnu.org/cgit/emacs.git/commit/?h=feature/tree-sitter&id=e06953b02a0e7b26b33c511a22896d0db4e5d63d
  - I am happy to send a patch adding support for them.
- For my C++ configuration, my treesit font-lock rules have two queries: one for C and one for C++. The C one has all the usual C stuff, and the C++ one has ONLY C++-related things. Could we do something similar here? That would really reduce the duplication in the C++ file.
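
A rough sketch of the last point, sharing the C rules and adding only C++-specific ones on top. All `my-` names are hypothetical, the query fragments are shortened copies of ones in the patch, and `namespace_identifier` is assumed to be a node type in the C++ grammar. The `treesit-font-lock-rules` call uses the keyword form that appears in the patch.

```elisp
;; Sketch only: every `my-' name here is hypothetical.

(defvar my-c-common-queries
  '((comment ((comment) @font-lock-comment-face))
    (string  ((string_literal) @font-lock-string-face))
    (type    ((primitive_type) @font-lock-type-face)))
  "List of (FEATURE QUERY) entries shared by C and C++.")

(defvar my-c++-only-queries
  ;; Assumption: `namespace_identifier' exists in the C++ grammar.
  '((type ((namespace_identifier) @font-lock-constant-face)))
  "List of (FEATURE QUERY) entries that only make sense for C++.")

(defun my-c-queries-for (language)
  "Return the (FEATURE QUERY) entries to use for LANGUAGE.
C gets the common entries; C++ gets the common entries followed
by its own additions."
  (pcase language
    ('c my-c-common-queries)
    ('cpp (append my-c-common-queries my-c++-only-queries))
    (_ (error "Unsupported language: %S" language))))

(defun my-c-font-lock-settings (language)
  "Build font-lock settings for LANGUAGE from the shared entries.
Needs a tree-sitter enabled Emacs when called."
  (apply #'treesit-font-lock-rules
         (mapcan (lambda (entry)
                   (list :language language
                         :override t
                         :feature (car entry)
                         (cadr entry)))
                 (my-c-queries-for language))))
```

With this shape the C++ file would only need to define its own additions, and both lists could be kept alphabetized by feature.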




* Re: Tree sitter support for C-like languages
  2022-11-10 22:58 ` Yuan Fu
@ 2022-11-11  5:48   ` Theodor Thornhill
  2022-11-11  6:01   ` Theodor Thornhill via Emacs development discussions.
  1 sibling, 0 replies; 83+ messages in thread
From: Theodor Thornhill @ 2022-11-11  5:48 UTC (permalink / raw)
  To: Yuan Fu; +Cc: emacs-devel



On 10 November 2022 23:58:12 CET, Yuan Fu <casouri@gmail.com> wrote:
>
>
>> On Nov 10, 2022, at 9:45 AM, Theodor Thornhill <theo@thornhill.no> wrote:
>> 
>> 
>> 
>> Hi all,
>> 
>> See the attached patch for support for several C-like languages.
>> 
>> They all support:
>> - Font locking
>> - Indentation (with styles for c/c++)
>> - Movement
>> - Imenu
>> - Which-func
>> 
>> These modes are meant as a supplement to tree-sitter.
>> 
>> I'm hopeful for some constructive criticism, and some testing.  This
>> patch needs to be applied to the feature/tree-sitter branch, and should
>> hopefully be applied there before we merge the branch to master, well
>> before Emacs 29 is cut.
>> 
>> I hope you like it,
>> 
>> Theo
>
>This is fantastic! I’m trying them out right now :-)
>
>Some things I noticed:
>
>The indentation for the closing bracket of a struct is off:
>
>struct regexp_cache
>{
>  struct regexp_cache *next;
>  };
>
>Imenu has some duplicate entries, the patch below should fix that.
>
>I also added the new contextual thingy to font-lock settings.
>
>Yuan
>

Yeah, I had a regression. Sent a fixed patch. Sorry for that...




* Re: Tree sitter support for C-like languages
  2022-11-11  0:43 ` Randy Taylor
@ 2022-11-11  5:50   ` Theodor Thornhill
  2022-11-11 13:37     ` Stefan Monnier
  2022-11-11 15:54     ` Randy Taylor
  0 siblings, 2 replies; 83+ messages in thread
From: Theodor Thornhill @ 2022-11-11  5:50 UTC (permalink / raw)
  To: Randy Taylor; +Cc: emacs-devel, casouri



On 11 November 2022 01:43:27 CET, Randy Taylor <dev@rjt.dev> wrote:
>On Thursday, November 10th, 2022 at 12:45, Theodor Thornhill via "Emacs development discussions." <emacs-devel@gnu.org> wrote:
>
>> 
>> Hi all,
>> 
>> See the attached patch for support for several C-like languages.
>> 
>> They all support:
>> - Font locking
>> - Indentation (with styles for c/c++)
>> - Movement
>> - Imenu
>> - Which-func
>> 
>> These modes are meant as a supplement to tree-sitter.
>> 
>> I'm hopeful for some constructive criticism, and some testing. This
>> patch needs to be applied to the feature/tree-sitter branch, and should
>> hopefully be applied there before we merge the branch to master, well
>> before Emacs 29 is cut.
>> 
>> I hope you like it,
>> 
>> Theo
>
>Thanks Theo! I was actually just about to email you about this stuff.
>
>I will test drive the C and C++ stuff.
>
>A few comments:
>- (This one is for everyone) For lists of things (like keywords, etc.) and the font-lock rules, can we consider alphabetizing them (for the font-lock rules it would be by feature)? It would make them a lot easier to scan for what's there/not there and give them some order (and we could then be consistent everywhere). I am happy to send a patch for this.
>- I recently added some new font-lock faces for tree-sitter that you may wish to make use of: http://git.savannah.gnu.org/cgit/emacs.git/commit/?h=feature/tree-sitter&id=e06953b02a0e7b26b33c511a22896d0db4e5d63d
>  - I am happy to send a patch adding support for them.
>- For my C++ configuration, my treesit font-lock rules have two queries: one for C and one for C++. The C one has all the usual C stuff, and the C++ one has ONLY C++-related things. Could we do something similar here? That would really reduce the duplication in the C++ file.

Yeah, I'm interested in reducing duplication, though not only for that reason. I'm thinking of ways to make one mode inherit from the other.

How about we first get this merged, then new faces on top of that?
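
For the inheritance idea, a minimal sketch of what I mean, assuming a shared parent mode. All of the `my-...` names are hypothetical and nothing like this is in the patch yet:

```elisp
;; Hypothetical sketch: a parent mode holds the setup common to C and
;; C++, and each language mode derives from it.
(defvar-local my-c++-extras-enabled nil
  "Placeholder flag standing in for C++-only setup.")

(define-derived-mode my-c-base-mode prog-mode "C-base"
  "Parent mode carrying setup shared by C and C++ (illustration only)."
  (setq-local comment-start "// ")
  (setq-local comment-end ""))

(define-derived-mode my-c-mode my-c-base-mode "C"
  "C mode that only inherits the shared setup.")

(define-derived-mode my-c++-mode my-c-base-mode "C++"
  "C++ mode that inherits the shared setup and adds its own."
  ;; The parent body has already run at this point, so only the
  ;; C++-specific bits need to be set here.
  (setq-local my-c++-extras-enabled t))
```

The parent body runs before the child body, so the comment settings would only be written once.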




* Re: Tree sitter support for C-like languages
  2022-11-10 22:58 ` Yuan Fu
  2022-11-11  5:48   ` Theodor Thornhill
@ 2022-11-11  6:01   ` Theodor Thornhill via Emacs development discussions.
  2022-11-12  5:43     ` Yuan Fu
  2022-11-12 12:21     ` Eli Zaretskii
  1 sibling, 2 replies; 83+ messages in thread
From: Theodor Thornhill via Emacs development discussions. @ 2022-11-11  6:01 UTC (permalink / raw)
  To: Yuan Fu; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1231 bytes --]

Yuan Fu <casouri@gmail.com> writes:

>> On Nov 10, 2022, at 9:45 AM, Theodor Thornhill <theo@thornhill.no> wrote:
>> 
>> 
>> 
>> Hi all,
>> 
>> See the attached patch for support for several C-like languages.
>> 
>> They all support:
>> - Font locking
>> - Indentation (with styles for c/c++)
>> - Movement
>> - Imenu
>> - Which-func
>> 
>> These modes are meant as a supplement to tree-sitter.
>> 
>> I'm hopeful for some constructive criticism, and some testing.  This
>> patch needs to be applied to the feature/tree-sitter branch, and should
>> hopefully be applied there before we merge the branch to master, well
>> before Emacs 29 is cut.
>> 
>> I hope you like it,
>> 
>> Theo
>
> This is fantastic! I’m trying them out right now :-)
>
> Some things I noticed:
>
> The indentation for the closing bracket of a struct is off:
>
> struct regexp_cache
> {
>   struct regexp_cache *next;
>   };
>

Yeah.

> Imenu has some duplicate entries, the patch below should fix that.
>
> I also added the new contextual thingy to font-lock settings.

Thanks, added!

See the patch below.  The struct issue was an ordering problem introduced
when I extracted the styles.  It should be good now.
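
To illustrate why ordering matters, here is a toy model of a rule list where the first matching rule wins, which is how I understand the simple indent rules to be applied.  This is not the treesit implementation; the `my-...` names, the rule shape and the anchors are invented, and the particular clash shown is only an assumed example.

```elisp
;; Toy model of a first-match-wins rule list; not the treesit code.
(require 'seq)

(defun my-first-matching-rule (rules node-type parent-type)
  "Return the anchor of the first rule in RULES that matches.
Each rule is (KIND TYPE ANCHOR), where KIND is `node-is' or `parent-is'."
  (nth 2 (seq-find
          (lambda (rule)
            (pcase rule
              (`(node-is ,type ,_) (equal type node-type))
              (`(parent-is ,type ,_) (equal type parent-type))))
          rules)))

(defvar my-rules-bad-order
  '((parent-is "field_declaration_list" indented)
    (node-is "}" column-zero))
  "The generic parent rule shadows the closing-brace rule.")

(defvar my-rules-good-order
  '((node-is "}" column-zero)
    (parent-is "field_declaration_list" indented))
  "The closing-brace rule is tried first.")

;; A closing brace whose parent is a field_declaration_list:
(my-first-matching-rule my-rules-bad-order "}" "field_declaration_list")
;; => indented
(my-first-matching-rule my-rules-good-order "}" "field_declaration_list")
;; => column-zero
```

In the bad ordering the brace picks up the indentation meant for the struct members, which looks like the misplaced `};` reported above.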

Theo


[-- Attachment #2: 0001-Add-Tree-Sitter-modes-for-C-like-languages.patch --]
[-- Type: text/x-diff, Size: 56647 bytes --]

From 92cf837414155b869d6193c4e24eb15291d1b64c Mon Sep 17 00:00:00 2001
From: Theodor Thornhill <theo@thornhill.no>
Date: Thu, 10 Nov 2022 17:15:49 +0100
Subject: [PATCH] Add Tree Sitter modes for C-like languages

* etc/NEWS: Mention the new modes
* lisp/progmodes/c++-ts-mode.el: New major mode with Tree Sitter support.
* lisp/progmodes/c-ts-mode.el: New major mode with Tree Sitter support.
* lisp/progmodes/java-ts-mode.el: New major mode with Tree Sitter support.
* lisp/progmodes/json-ts-mode.el: New major mode with Tree Sitter support.
* lisp/progmodes/css-ts-mode.el: New major mode with Tree Sitter support.
---
 etc/NEWS                       |  28 ++-
 lisp/progmodes/c++-ts-mode.el  | 424 +++++++++++++++++++++++++++++++++
 lisp/progmodes/c-ts-mode.el    | 414 ++++++++++++++++++++++++++++++++
 lisp/progmodes/css-ts-mode.el  | 131 ++++++++++
 lisp/progmodes/java-ts-mode.el | 289 ++++++++++++++++++++++
 lisp/progmodes/json-ts-mode.el | 150 ++++++++++++
 6 files changed, 1434 insertions(+), 2 deletions(-)
 create mode 100644 lisp/progmodes/c++-ts-mode.el
 create mode 100644 lisp/progmodes/c-ts-mode.el
 create mode 100644 lisp/progmodes/css-ts-mode.el
 create mode 100644 lisp/progmodes/java-ts-mode.el
 create mode 100644 lisp/progmodes/json-ts-mode.el

diff --git a/etc/NEWS b/etc/NEWS
index 9ed78bc6b3..3ce9810ece 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -2786,8 +2786,32 @@ when visiting JSON files.
 ** New mode 'ts-mode'.
 Support is added for TypeScript, based on the new integration with
 Tree-Sitter. There's support for font-locking, indentation and
-navigation.  Tree-Sitter is required for this mode to function, but if
-it is not available, we will default to use 'js-mode'.
+navigation.  Tree-Sitter is required for this mode to function.
+
+** New mode 'c-ts-mode'.
+Support is added for C, based on the new integration with
+Tree-Sitter. There's support for font-locking, indentation and
+navigation.  Tree-Sitter is required for this mode to function.
+
+** New mode 'c++-ts-mode'.
+Support is added for C++, based on the new integration with
+Tree-Sitter. There's support for font-locking, indentation and
+navigation.  Tree-Sitter is required for this mode to function.
+
+** New mode 'java-ts-mode'.
+Support is added for Java, based on the new integration with
+Tree-Sitter. There's support for font-locking, indentation and
+navigation.  Tree-Sitter is required for this mode to function.
+
+** New mode 'css-ts-mode'.
+Support is added for CSS, based on the new integration with
+Tree-Sitter. There's support for font-locking, indentation and
+navigation.  Tree-Sitter is required for this mode to function.
+
+** New mode 'json-ts-mode'.
+Support is added for JSON, based on the new integration with
+Tree-Sitter. There's support for font-locking, indentation and
+navigation.  Tree-Sitter is required for this mode to function.
 
 \f
 * Incompatible Lisp Changes in Emacs 29.1
diff --git a/lisp/progmodes/c++-ts-mode.el b/lisp/progmodes/c++-ts-mode.el
new file mode 100644
index 0000000000..45668f414e
--- /dev/null
+++ b/lisp/progmodes/c++-ts-mode.el
@@ -0,0 +1,424 @@
+;;; c++-ts-mode.el --- tree sitter support for C++  -*- lexical-binding: t; -*-
+
+;; Copyright (C) 2022 Free Software Foundation, Inc.
+
+;; Author     : Theodor Thornhill <theo@thornhill.no>
+;; Maintainer : Theodor Thornhill <theo@thornhill.no>
+;; Created    : November 2022
+;; Keywords   : c++ languages tree-sitter
+
+;; This file is part of GNU Emacs.
+
+;; This program is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; This program is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+
+;;; Commentary:
+;;
+
+;;; Code:
+
+(require 'treesit)
+(require 'rx)
+
+(defcustom c++-ts-mode-indent-offset 2
+  "Number of spaces for each indentation step in `c++-ts-mode'."
+  :type 'integer
+  :safe 'integerp
+  :group 'cpp)
+
+(defcustom c++-ts-mode-indent-style 'gnu
+  "Style used for indentation.
+
+The selected style could be one of GNU, K&R, LINUX or BSD.  If
+one of the supplied styles doesn't suffice a function could be
+set instead.  This function is expected to return a list that
+follows the form of `treesit-simple-indent-rules'."
+  :type '(choice (symbol :tag "Gnu" 'gnu)
+                 (symbol :tag "K&R" 'k&r)
+                 (symbol :tag "Linux" 'linux)
+                 (symbol :tag "BSD" 'bsd)
+                 (function :tag "A function for user customized style" ignore))
+  :group 'cpp)
+
+(defvar c++-ts-mode--syntax-table
+  (let ((table (make-syntax-table)))
+    ;; Taken from the cc-langs version
+    (modify-syntax-entry ?_  "_"     table)
+    (modify-syntax-entry ?\\ "\\"    table)
+    (modify-syntax-entry ?+  "."     table)
+    (modify-syntax-entry ?-  "."     table)
+    (modify-syntax-entry ?=  "."     table)
+    (modify-syntax-entry ?%  "."     table)
+    (modify-syntax-entry ?<  "."     table)
+    (modify-syntax-entry ?>  "."     table)
+    (modify-syntax-entry ?&  "."     table)
+    (modify-syntax-entry ?|  "."     table)
+    (modify-syntax-entry ?\' "\""    table)
+    (modify-syntax-entry ?\240 "."   table)
+    (modify-syntax-entry ?/  ". 124b" table)
+    (modify-syntax-entry ?*  ". 23"   table)
+    table)
+  "Syntax table for `c++-ts-mode'.")
+
+(defvar c++-ts-mode--indent-styles
+  (let ((common
+         `(((parent-is "translation_unit") parent-bol 0)
+           ((node-is ")") parent 1)
+           ((node-is "]") parent-bol 0)
+           ((node-is "else") parent-bol 0)
+           ((node-is "case") parent-bol 0)
+           ((node-is "comment") no-indent)
+           ((parent-is "comment") no-indent)
+           ((node-is "labeled_statement") parent-bol 0)
+           ((parent-is "labeled_statement") parent-bol c++-ts-mode-indent-offset)
+           ((match "preproc_ifdef" "compound_statement") point-min 0)
+           ((match "#endif" "preproc_ifdef") point-min 0)
+           ((match "preproc_if" "compound_statement") point-min 0)
+           ((match "#endif" "preproc_if") point-min 0)
+           ((match "preproc_function_def" "compound_statement") point-min 0)
+           ((match "preproc_call" "compound_statement") point-min 0)
+           ((parent-is "field_declaration_list") parent-bol c++-ts-mode-indent-offset)
+           ((node-is "field_initializer_list") parent-bol ,(* c++-ts-mode-indent-offset 2))
+           ((parent-is "function_definition") parent-bol 0)
+           ((parent-is "conditional_expression") first-sibling 0)
+           ((parent-is "assignment_expression") parent-bol c++-ts-mode-indent-offset)
+           ((parent-is "comma_expression") first-sibling 0)
+           ((parent-is "init_declarator") parent-bol c++-ts-mode-indent-offset)
+           ((parent-is "parenthesized_expression") first-sibling 1)
+           ((parent-is "argument_list") first-sibling 1)
+           ((parent-is "parameter_list") first-sibling 1)
+           ((parent-is "binary_expression") parent 0)
+           ((query "(for_statement initializer: (_) @indent)") parent-bol 5)
+           ((query "(for_statement condition: (_) @indent)") parent-bol 5)
+           ((query "(for_statement update: (_) @indent)") parent-bol 5)
+           ((query "(call_expression arguments: (_) @indent)") parent c++-ts-mode-indent-offset))))
+    `((gnu
+       ,@common
+       ((node-is "}") parent-bol 0)
+       ((parent-is "enumerator_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "field_declaration_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "initializer_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "compound_statement") parent c++-ts-mode-indent-offset)
+       ((parent-is "if_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "for_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "while_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "switch_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "case_statement") parent-bol c++-ts-mode-indent-offset)
+       ((match "while" "do_statement") parent 0)
+       ((parent-is "do_statement") parent-bol c++-ts-mode-indent-offset)
+       (no-node parent-bol c++-ts-mode-indent-offset))
+      (k&r
+       ,@common
+       ((node-is "}") grand-parent 0)
+       ((parent-is "enumerator_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "field_declaration_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "initializer_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "compound_statement") grand-parent c++-ts-mode-indent-offset)
+       ((parent-is "if_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "for_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "while_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "switch_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "case_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "do_statement") parent-bol c++-ts-mode-indent-offset)
+       (no-node parent-bol c++-ts-mode-indent-offset))
+      (linux
+       ,@common
+       ((node-is "}") grand-parent 0)
+       ((parent-is "enumerator_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "field_declaration_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "initializer_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "compound_statement") grand-parent c++-ts-mode-indent-offset)
+       ((parent-is "if_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "for_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "while_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "switch_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "case_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "do_statement") parent-bol c++-ts-mode-indent-offset)
+       (no-node parent-bol c++-ts-mode-indent-offset))
+      (bsd
+       ,@common
+       ((node-is "}") parent-bol 0)
+       ((parent-is "enumerator_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "field_declaration_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "initializer_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "compound_statement") parent 0)
+       ((parent-is "if_statement") parent-bol 0)
+       ((parent-is "for_statement") parent-bol 0)
+       ((parent-is "while_statement") parent-bol 0)
+       ((parent-is "switch_statement") parent-bol 0)
+       ((parent-is "case_statement") parent-bol 0)
+       ((parent-is "do_statement") parent-bol 0)
+       (no-node parent-bol c++-ts-mode-indent-offset))))
+  "Indent rules supported by `c++-ts-mode'.")
+
+(defun c++-ts-mode--set-indent-style ()
+  "Helper function to set indentation style."
+  (let ((style
+         (if (functionp c++-ts-mode-indent-style)
+             (funcall c++-ts-mode-indent-style)
+           (pcase c++-ts-mode-indent-style
+             ('gnu   (alist-get 'gnu c++-ts-mode--indent-styles))
+             ('k&r   (alist-get 'k&r c++-ts-mode--indent-styles))
+             ('bsd   (alist-get 'bsd c++-ts-mode--indent-styles))
+             ('linux (alist-get 'linux c++-ts-mode--indent-styles))))))
+    `((cpp ,@style))))
+
+(defvar c++-ts-mode--keywords
+  '("and" "and_eq" "bitand" "bitor" "catch" "class" "co_await"
+    "co_return" "co_yield" "compl" "concept" "consteval" "constexpr"
+    "constinit" "decltype" "delete" "else" "explicit" "final" "for"
+    "friend" "if" "mutable" "namespace" "new" "noexcept"
+    "not" "not_eq" "operator" "or" "or_eq" "override" "private"
+    "protected" "public" "requires" "return" "static" "struct"
+    "template" "throw" "try" "typename" "using" "virtual" "xor" "xor_eq"
+    "switch" "case")
+  "C++ keywords for tree-sitter font-locking.")
+
+(defvar c++-ts-mode--preproc-keywords
+  '("#define" "#if" "#ifdef" "#ifndef" "#else" "#elif" "#endif" "#include")
+  "C++ preprocessor keywords for tree-sitter font-locking.")
+
+(defvar c++-ts-mode--operators
+  '("=" "-" "*" "/" "+" "%" "~" "|" "&" "^" "<<" ">>" "->"
+    "." "<" "<=" ">=" ">" "==" "!=" "!" "&&" "||" "-="
+    "+=" "*=" "/=" "%=" "|=" "&=" "^=" ">>=" "<<=" "--" "++")
+  "C++ operators for tree-sitter font-locking.")
+
+(defvar c++-ts-mode--font-lock-settings
+  (treesit-font-lock-rules
+   :language 'cpp
+   :override t
+   :feature 'comment
+   `((comment) @font-lock-comment-face
+     (comment) @contextual)
+   :language 'cpp
+   :override t
+   :feature 'preprocessor
+   `((preproc_directive) @font-lock-preprocessor-face
+
+     (preproc_def
+      name: (identifier) @font-lock-variable-name-face)
+
+     (preproc_ifdef
+      name: (identifier) @font-lock-variable-name-face)
+
+     (preproc_function_def
+      name: (identifier) @font-lock-function-name-face)
+
+     (preproc_params
+      (identifier) @font-lock-variable-name-face)
+
+     (preproc_defined) @font-lock-preprocessor-face
+     (preproc_defined (identifier) @font-lock-variable-name-face)
+     [,@c++-ts-mode--preproc-keywords] @font-lock-preprocessor-face)
+   :language 'cpp
+   :override t
+   :feature 'constant
+   `((true) @font-lock-constant-face
+     (false) @font-lock-constant-face
+     (null) @font-lock-constant-face
+     (this) @font-lock-constant-face)
+   :language 'cpp
+   :override t
+   :feature 'keyword
+   `([,@c++-ts-mode--keywords] @font-lock-keyword-face
+     (auto) @font-lock-keyword-face)
+   :language 'cpp
+   :override t
+   :feature 'operator
+   `([,@c++-ts-mode--operators] @font-lock-builtin-face)
+   :language 'cpp
+   :override t
+   :feature 'string
+   `((string_literal) @font-lock-string-face
+     ((string_literal)) @contextual
+     (system_lib_string) @font-lock-string-face
+     (escape_sequence) @font-lock-string-face)
+   :language 'cpp
+   :override t
+   :feature 'literal
+   `((number_literal) @font-lock-constant-face
+     (char_literal) @font-lock-constant-face)
+   :language 'cpp
+   :override t
+   :feature 'type
+   '((primitive_type) @font-lock-type-face
+     (type_qualifier) @font-lock-type-face
+
+     (qualified_identifier
+      scope: (namespace_identifier) @font-lock-type-face)
+
+     (operator_cast)  type: (type_identifier) @font-lock-type-face)
+   :language 'cpp
+   :override t
+   :feature 'definition
+   `((declaration
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (declaration
+      type: (type_identifier) @font-lock-type-face)
+
+     (field_declaration
+      declarator: (field_identifier) @font-lock-variable-name-face)
+
+     (parameter_declaration
+      type: (type_identifier) @font-lock-type-face)
+
+     (function_definition
+      type: (type_identifier) @font-lock-function-name-face)
+
+     (function_declarator
+      declarator: (identifier) @font-lock-function-name-face)
+
+     (array_declarator
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (init_declarator
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (struct_specifier
+      name: (type_identifier) @font-lock-type-face)
+
+     (sized_type_specifier) @font-lock-type-face
+
+     (enum_specifier
+      name: (type_identifier) @font-lock-type-face)
+
+     (enumerator
+      name: (identifier) @font-lock-variable-name-face)
+
+     (parameter_declaration
+      type: (_) @font-lock-type-face
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (pointer_declarator
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (pointer_declarator
+      declarator: (field_identifier) @font-lock-variable-name-face))
+   :language 'cpp
+   :override t
+   :feature 'expression
+   '((assignment_expression
+      left: (identifier) @font-lock-variable-name-face)
+
+     (call_expression
+      function: (identifier) @font-lock-function-name-face)
+
+     (field_expression
+      field: (field_identifier) @font-lock-variable-name-face)
+
+     (field_expression
+      argument: (identifier) @font-lock-variable-name-face
+      field: (field_identifier) @font-lock-variable-name-face)
+
+     (pointer_expression
+      argument: (identifier) @font-lock-variable-name-face))
+   :language 'cpp
+   :override t
+   :feature 'statement
+   '((expression_statement (identifier) @font-lock-variable-name-face)
+     (labeled_statement
+      label: (statement_identifier) @font-lock-type-face))
+   :language 'cpp
+   :override t
+   :feature 'error
+   '((ERROR) @font-lock-warning-face))
+  "Tree-sitter font-lock settings.")
+
+(defun c++-ts-mode--imenu-1 (node)
+  "Helper for `c++-ts-mode--imenu'.
+Find string representation for NODE and set marker, then recurse
+the subtrees."
+  (let* ((ts-node (car node))
+         (subtrees (mapcan #'c++-ts-mode--imenu-1 (cdr node)))
+         (name (when ts-node
+                 (or (treesit-node-text
+                      (or (treesit-node-child-by-field-name
+                           ts-node "declarator")
+                          (treesit-node-child-by-field-name
+                           ts-node "name"))
+                      t)
+                     "Unnamed node")))
+         (marker (when ts-node
+                   (set-marker (make-marker)
+                               (treesit-node-start ts-node)))))
+    ;; A struct_specifier could be inside a parameter list or another
+    ;; struct definition.  In those cases we don't include it.
+    (cond
+     ((string-match-p
+       (rx (or "parameter" "field") "_declaration")
+       (or (treesit-node-type (treesit-node-parent ts-node))
+           ""))
+      nil)
+     ((null ts-node) subtrees)
+     (subtrees
+      `((,name ,(cons name marker) ,@subtrees)))
+     (t
+      `((,name . ,marker))))))
+
+(defun c++-ts-mode--imenu ()
+  "Return Imenu alist for the current buffer."
+  (let* ((node (treesit-buffer-root-node))
+         (tree (treesit-induce-sparse-tree
+                node (rx (or "function_definition"
+                             "struct_specifier")))))
+    (c++-ts-mode--imenu-1 tree)))
+
+;;;###autoload
+(define-derived-mode c++-ts-mode prog-mode "C++"
+  "Major mode for editing C++, powered by Tree Sitter."
+  :group 'cpp
+  :syntax-table c++-ts-mode--syntax-table
+
+  (unless (treesit-ready-p nil 'cpp)
+    (error "Tree Sitter for C++ isn't available"))
+
+  (treesit-parser-create 'cpp)
+
+  ;; Comments.
+  (setq-local comment-start "// ")
+  (setq-local comment-start-skip "\\(?://+\\|/\\*+\\)\\s *")
+  (setq-local comment-end "")
+
+  ;; Indent.
+  (when (eq c++-ts-mode-indent-style 'linux)
+    (setq-local indent-tabs-mode t))
+  (setq-local treesit-simple-indent-rules
+              (c++-ts-mode--set-indent-style))
+
+  ;; Navigation.
+  (setq-local treesit-defun-type-regexp
+              (rx (or "specifier"
+                      "definition")))
+
+  ;; Electric
+  (setq-local electric-indent-chars
+	      (append "{}():;," electric-indent-chars))
+
+  ;; Font-lock.
+  (setq-local treesit-font-lock-settings c++-ts-mode--font-lock-settings)
+  (setq-local treesit-font-lock-feature-list
+              '((comment preprocessor operator constant string literal keyword)
+                (type definition expression statement)
+                (error)))
+
+  ;; Imenu.
+  (setq-local imenu-create-index-function #'c++-ts-mode--imenu)
+  (setq-local which-func-functions nil) ;; Piggyback on imenu
+  (treesit-major-mode-setup))
+
+(provide 'c++-ts-mode)
+
+;;; c++-ts-mode.el ends here
diff --git a/lisp/progmodes/c-ts-mode.el b/lisp/progmodes/c-ts-mode.el
new file mode 100644
index 0000000000..dc5c6b2520
--- /dev/null
+++ b/lisp/progmodes/c-ts-mode.el
@@ -0,0 +1,414 @@
+;;; c-ts-mode.el --- tree sitter support for C  -*- lexical-binding: t; -*-
+
+;; Copyright (C) 2022 Free Software Foundation, Inc.
+
+;; Author     : Theodor Thornhill <theo@thornhill.no>
+;; Maintainer : Theodor Thornhill <theo@thornhill.no>
+;; Created    : November 2022
+;; Keywords   : c languages tree-sitter
+
+;; This file is part of GNU Emacs.
+
+;; This program is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; This program is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+
+;;; Commentary:
+;;
+
+;;; Code:
+
+(require 'treesit)
+(require 'rx)
+
+(defcustom c-ts-mode-indent-offset 2
+  "Number of spaces for each indentation step in `c-ts-mode'."
+  :type 'integer
+  :safe 'integerp
+  :group 'c)
+
+(defcustom c-ts-mode-indent-style 'gnu
+  "Style used for indentation.
+
+The selected style could be one of GNU, K&R, LINUX or BSD.  If
+one of the supplied styles doesn't suffice a function could be
+set instead.  This function is expected to return a list that
+follows the form of `treesit-simple-indent-rules'."
+  :type '(choice (symbol :tag "Gnu" 'gnu)
+                 (symbol :tag "K&R" 'k&r)
+                 (symbol :tag "Linux" 'linux)
+                 (symbol :tag "BSD" 'bsd)
+                 (function :tag "A function for user customized style" ignore))
+  :group 'c)
+
+(defvar c-ts-mode--syntax-table
+  (let ((table (make-syntax-table)))
+    ;; Taken from the cc-langs version
+    (modify-syntax-entry ?_  "_"     table)
+    (modify-syntax-entry ?\\ "\\"    table)
+    (modify-syntax-entry ?+  "."     table)
+    (modify-syntax-entry ?-  "."     table)
+    (modify-syntax-entry ?=  "."     table)
+    (modify-syntax-entry ?%  "."     table)
+    (modify-syntax-entry ?<  "."     table)
+    (modify-syntax-entry ?>  "."     table)
+    (modify-syntax-entry ?&  "."     table)
+    (modify-syntax-entry ?|  "."     table)
+    (modify-syntax-entry ?\' "\""    table)
+    (modify-syntax-entry ?\240 "."   table)
+    (modify-syntax-entry ?/  ". 124b" table)
+    (modify-syntax-entry ?*  ". 23"   table)
+    table)
+  "Syntax table for `c-ts-mode'.")
+
+(defvar c-ts-mode--indent-styles
+  (let ((common
+         `(((parent-is "translation_unit") parent-bol 0)
+           ((node-is ")") parent 1)
+           ((node-is "]") parent-bol 0)
+           ((node-is "else") parent-bol 0)
+           ((node-is "case") parent-bol 0)
+           ((node-is "comment") no-indent)
+           ((parent-is "comment") no-indent)
+           ((node-is "labeled_statement") parent-bol 0)
+           ((parent-is "labeled_statement") parent-bol c-ts-mode-indent-offset)
+           ((match "preproc_ifdef" "compound_statement") point-min 0)
+           ((match "#endif" "preproc_ifdef") point-min 0)
+           ((match "preproc_if" "compound_statement") point-min 0)
+           ((match "#endif" "preproc_if") point-min 0)
+           ((match "preproc_function_def" "compound_statement") point-min 0)
+           ((match "preproc_call" "compound_statement") point-min 0)
+           ((parent-is "function_definition") parent-bol 0)
+           ((parent-is "conditional_expression") first-sibling 0)
+           ((parent-is "assignment_expression") parent-bol c-ts-mode-indent-offset)
+           ((parent-is "comma_expression") first-sibling 0)
+           ((parent-is "init_declarator") parent-bol c-ts-mode-indent-offset)
+           ((parent-is "parenthesized_expression") first-sibling 1)
+           ((parent-is "argument_list") first-sibling 1)
+           ((parent-is "parameter_list") first-sibling 1)
+           ((parent-is "binary_expression") parent 0)
+           ((query "(for_statement initializer: (_) @indent)") parent-bol 5)
+           ((query "(for_statement condition: (_) @indent)") parent-bol 5)
+           ((query "(for_statement update: (_) @indent)") parent-bol 5)
+           ((query "(call_expression arguments: (_) @indent)") parent c-ts-mode-indent-offset)
+           ((parent-is "call_expression") parent 0))))
+    `((gnu
+       ,@common
+       ((node-is "}") parent-bol 0)
+       ((parent-is "enumerator_list") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "field_declaration_list") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "initializer_list") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "compound_statement") parent c-ts-mode-indent-offset)
+       ((parent-is "if_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "for_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "while_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "switch_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "case_statement") parent-bol c-ts-mode-indent-offset)
+       ((match "while" "do_statement") parent 0)
+       ((parent-is "do_statement") parent-bol c-ts-mode-indent-offset)
+       (no-node parent-bol c-ts-mode-indent-offset))
+      (k&r
+       ,@common
+       ((node-is "}") grand-parent 0)
+       ((parent-is "enumerator_list") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "field_declaration_list") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "initializer_list") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "compound_statement") grand-parent c-ts-mode-indent-offset)
+       ((parent-is "if_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "for_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "while_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "switch_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "case_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "do_statement") parent-bol c-ts-mode-indent-offset)
+       (no-node parent-bol c-ts-mode-indent-offset))
+      (linux
+       ,@common
+       ((node-is "}") grand-parent 0)
+       ((parent-is "enumerator_list") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "field_declaration_list") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "initializer_list") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "compound_statement") grand-parent c-ts-mode-indent-offset)
+       ((parent-is "if_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "for_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "while_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "switch_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "case_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "do_statement") parent-bol c-ts-mode-indent-offset)
+       (no-node parent-bol c-ts-mode-indent-offset))
+      (bsd
+       ,@common
+       ((node-is "}") parent-bol 0)
+       ((parent-is "enumerator_list") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "field_declaration_list") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "initializer_list") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "compound_statement") parent 0)
+       ((parent-is "if_statement") parent-bol 0)
+       ((parent-is "for_statement") parent-bol 0)
+       ((parent-is "while_statement") parent-bol 0)
+       ((parent-is "switch_statement") parent-bol 0)
+       ((parent-is "case_statement") parent-bol 0)
+       ((parent-is "do_statement") parent-bol 0)
+       (no-node parent-bol c-ts-mode-indent-offset))))
+  "Indent rules supported by `c-ts-mode'.")
+
+(defun c-ts-mode--set-indent-style ()
+  "Helper function to set indentation style."
+  (let ((style
+         (if (functionp c-ts-mode-indent-style)
+             (funcall c-ts-mode-indent-style)
+           (pcase c-ts-mode-indent-style
+             ('gnu   (alist-get 'gnu c-ts-mode--indent-styles))
+             ('k&r   (alist-get 'k&r c-ts-mode--indent-styles))
+             ('bsd   (alist-get 'bsd c-ts-mode--indent-styles))
+             ('linux (alist-get 'linux c-ts-mode--indent-styles))))))
+    `((c ,@style))))
+
+(defvar c-ts-mode--keywords
+  '("const" "default" "enum" "extern" "inline" "static"
+    "struct" "typedef" "union" "volatile" "goto" "register"
+    "sizeof" "return"
+    "while" "for" "do" "continue" "break"
+    "if" "else" "case" "switch")
+  "C keywords for tree-sitter font-locking.")
+
+(defvar c-ts-mode--preproc-keywords
+  '("#define" "#if" "#ifdef" "#ifndef" "#else" "#elif" "#endif"
+    "#include")
+  "C preprocessor keywords for tree-sitter font-locking.")
+
+(defvar c-ts-mode--operators
+  '("=" "-" "*" "/" "+" "%" "~" "|" "&" "^" "<<" ">>" "->"
+    "." "<" "<=" ">=" ">" "==" "!=" "!" "&&" "||" "-="
+    "+=" "*=" "/=" "%=" "|=" "&=" "^=" ">>=" "<<=" "--" "++")
+  "C operators for tree-sitter font-locking.")
+
+(defvar c-ts-mode--font-lock-settings
+  (treesit-font-lock-rules
+   :language 'c
+   :override t
+   :feature 'comment
+   `((comment) @font-lock-comment-face
+     (comment) @contextual)
+   :language 'c
+   :override t
+   :feature 'preprocessor
+   `((preproc_directive) @font-lock-preprocessor-face
+
+     (preproc_def
+      name: (identifier) @font-lock-variable-name-face)
+
+     (preproc_ifdef
+      name: (identifier) @font-lock-variable-name-face)
+
+     (preproc_function_def
+      name: (identifier) @font-lock-function-name-face)
+
+     (preproc_params
+      (identifier) @font-lock-variable-name-face)
+
+     (preproc_defined) @font-lock-preprocessor-face
+     (preproc_defined (identifier) @font-lock-variable-name-face)
+     [,@c-ts-mode--preproc-keywords] @font-lock-preprocessor-face)
+   :language 'c
+   :override t
+   :feature 'constant
+   `((true) @font-lock-constant-face
+     (false) @font-lock-constant-face
+     (null) @font-lock-constant-face)
+   :language 'c
+   :override t
+   :feature 'keyword
+   `([,@c-ts-mode--keywords] @font-lock-keyword-face)
+   :language 'c
+   :override t
+   :feature 'operator
+   `([,@c-ts-mode--operators] @font-lock-builtin-face)
+   :language 'c
+   :override t
+   :feature 'string
+   `((string_literal) @font-lock-string-face
+     ((string_literal)) @contextual
+     (system_lib_string) @font-lock-string-face
+     (escape_sequence) @font-lock-string-face)
+   :language 'c
+   :override t
+   :feature 'literal
+   `((number_literal) @font-lock-constant-face
+     (char_literal) @font-lock-constant-face)
+   :language 'c
+   :override t
+   :feature 'type
+   '((primitive_type) @font-lock-type-face)
+   :language 'c
+   :override t
+   :feature 'definition
+   `((declaration
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (declaration
+      type: (type_identifier) @font-lock-type-face)
+
+     (field_declaration
+      declarator: (field_identifier) @font-lock-variable-name-face)
+
+     (field_declaration
+      type: (type_identifier) @font-lock-type-face)
+
+     (parameter_declaration
+      type: (type_identifier) @font-lock-type-face)
+
+     (function_definition
+      type: (type_identifier) @font-lock-type-face)
+
+     (function_declarator
+      declarator: (identifier) @font-lock-function-name-face)
+
+     (array_declarator
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (init_declarator
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (struct_specifier
+      name: (type_identifier) @font-lock-type-face)
+
+     (sized_type_specifier) @font-lock-type-face
+
+     (enum_specifier
+      name: (type_identifier) @font-lock-type-face)
+
+     (enumerator
+      name: (identifier) @font-lock-variable-name-face)
+
+     (parameter_declaration
+      type: (_) @font-lock-type-face
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (pointer_declarator
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (pointer_declarator
+      declarator: (field_identifier) @font-lock-variable-name-face))
+   :language 'c
+   :override t
+   :feature 'expression
+   '((assignment_expression
+      left: (identifier) @font-lock-variable-name-face)
+
+     (call_expression
+      function: (identifier) @font-lock-function-name-face)
+
+     (field_expression
+      field: (field_identifier) @font-lock-variable-name-face)
+
+     (field_expression
+      argument: (identifier) @font-lock-variable-name-face
+      field: (field_identifier) @font-lock-variable-name-face)
+
+     (pointer_expression
+      argument: (identifier) @font-lock-variable-name-face))
+   :language 'c
+   :override t
+   :feature 'statement
+   '((expression_statement (identifier) @font-lock-variable-name-face))
+   :language 'c
+   :override t
+   :feature 'error
+   '((ERROR) @font-lock-warning-face))
+  "Tree-sitter font-lock settings.")
+
+(defun c-ts-mode--imenu-1 (node)
+  "Helper for `c-ts-mode--imenu'.
+Find string representation for NODE and set marker, then recurse
+the subtrees."
+  (let* ((ts-node (car node))
+         (subtrees (mapcan #'c-ts-mode--imenu-1 (cdr node)))
+         (name (when ts-node
+                 (or (treesit-node-text
+                      (or (treesit-node-child-by-field-name
+                           ts-node "declarator")
+                          (treesit-node-child-by-field-name
+                           ts-node "name"))
+                      t)
+                     "Unnamed node")))
+         (marker (when ts-node
+                   (set-marker (make-marker)
+                               (treesit-node-start ts-node)))))
+    ;; A struct_specifier could be inside a parameter list or another
+    ;; struct definition.  In those cases we don't include it.
+    (cond
+     ((string-match-p
+       (rx (or "parameter" "field") "_declaration")
+       (or (treesit-node-type (treesit-node-parent ts-node))
+           ""))
+      nil)
+     ((null ts-node) subtrees)
+     (subtrees
+      `((,name ,(cons name marker) ,@subtrees)))
+     (t
+      `((,name . ,marker))))))
+
+(defun c-ts-mode--imenu ()
+  "Return Imenu alist for the current buffer."
+  (let* ((node (treesit-buffer-root-node))
+         (tree (treesit-induce-sparse-tree
+                node (rx (or "function_definition"
+                             "struct_specifier")))))
+    (c-ts-mode--imenu-1 tree)))
+
+;;;###autoload
+(define-derived-mode c-ts-mode prog-mode "C"
+  "Major mode for editing C, powered by Tree Sitter."
+  :group 'c
+  :syntax-table c-ts-mode--syntax-table
+
+  (unless (treesit-ready-p nil 'c)
+    (error "Tree Sitter for C isn't available"))
+
+  (treesit-parser-create 'c)
+
+  ;; Comments.
+  (setq-local comment-start "// ")
+  (setq-local comment-start-skip "\\(?://+\\|/\\*+\\)\\s *")
+  (setq-local comment-end "")
+
+  ;; Indent.
+  (when (eq c-ts-mode-indent-style 'linux)
+    (setq-local indent-tabs-mode t))
+  (setq-local treesit-simple-indent-rules
+              (c-ts-mode--set-indent-style))
+
+  ;; Navigation.
+  (setq-local treesit-defun-type-regexp
+              (rx (or "specifier"
+                      "definition")))
+
+  ;; Electric
+  (setq-local electric-indent-chars
+	      (append "{}():;," electric-indent-chars))
+
+  ;; Font-lock.
+  (setq-local treesit-font-lock-settings c-ts-mode--font-lock-settings)
+  (setq-local treesit-font-lock-feature-list
+              '((comment preprocessor operator constant string literal keyword)
+                (type definition expression statement)
+                (error)))
+
+  ;; Imenu.
+  (setq-local imenu-create-index-function #'c-ts-mode--imenu)
+  (setq-local which-func-functions nil) ;; Piggyback on imenu
+  (treesit-major-mode-setup))
+
+(provide 'c-ts-mode)
+
+;;; c-ts-mode.el ends here
diff --git a/lisp/progmodes/css-ts-mode.el b/lisp/progmodes/css-ts-mode.el
new file mode 100644
index 0000000000..c1a8d4e94d
--- /dev/null
+++ b/lisp/progmodes/css-ts-mode.el
@@ -0,0 +1,131 @@
+;;; css-ts-mode.el --- tree sitter support for CSS  -*- lexical-binding: t; -*-
+
+;; Copyright (C) 2022 Free Software Foundation, Inc.
+
+;; Author     : Theodor Thornhill <theo@thornhill.no>
+;; Maintainer : Theodor Thornhill <theo@thornhill.no>
+;; Created    : November 2022
+;; Keywords   : css languages tree-sitter
+
+;; This file is part of GNU Emacs.
+
+;; This program is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; This program is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+
+;;; Commentary:
+;;
+
+;;; Code:
+
+(require 'treesit)
+(require 'rx)
+(require 'css-mode)
+
+(defcustom css-ts-mode-indent-offset 2
+  "Number of spaces for each indentation step in `css-ts-mode'."
+  :type 'integer
+  :safe 'integerp
+  :group 'css)
+
+(defvar css-ts-mode--indent-rules
+  `((css
+     ((node-is "}") parent-bol 0)
+     ((node-is ")") parent-bol 0)
+     ((node-is "]") parent-bol 0)
+
+     ((parent-is "block") parent-bol css-ts-mode-indent-offset)
+     ((parent-is "arguments") parent-bol css-ts-mode-indent-offset)
+     ((parent-is "declaration") parent-bol css-ts-mode-indent-offset))))
+
+(defvar css-ts-mode--settings
+  (treesit-font-lock-rules
+   :language 'css
+   :feature 'basic
+   :override t
+   `((unit) @font-lock-constant-face
+     (integer_value) @font-lock-builtin-face
+     (float_value) @font-lock-builtin-face
+     (plain_value) @font-lock-variable-name-face
+     (comment) @font-lock-comment-face
+     (class_selector) @css-selector
+     (child_selector) @css-selector
+     (id_selector) @css-selector
+     (tag_name) @css-selector
+     (property_name) @css-property
+     (class_name) @css-selector
+     (function_name) @font-lock-function-name-face)))
+
+(defun css-ts-mode--imenu-1 (node)
+  "Helper for `css-ts-mode--imenu'.
+Find string representation for NODE and set marker, then recurse
+the subtrees."
+  (let* ((ts-node (car node))
+         (subtrees (mapcan #'css-ts-mode--imenu-1 (cdr node)))
+         (name (when ts-node
+                 (if (equal (treesit-node-type ts-node) "tag_name")
+                     (treesit-node-text ts-node)
+                   (treesit-node-text (treesit-node-child ts-node 1) t))))
+         (marker (when ts-node
+                   (set-marker (make-marker)
+                               (treesit-node-start ts-node)))))
+    (cond
+     ((null ts-node) subtrees)
+     (subtrees
+      `((,name ,(cons name marker) ,@subtrees)))
+     (t
+      `((,name . ,marker))))))
+
+(defun css-ts-mode--imenu ()
+  "Return Imenu alist for the current buffer."
+  (let* ((node (treesit-buffer-root-node))
+         (tree (treesit-induce-sparse-tree
+                node (rx (or "class_selector"
+                             "id_selector"
+                             "tag_name")))))
+    (css-ts-mode--imenu-1 tree)))
+
+(define-derived-mode css-ts-mode prog-mode "CSS"
+  "Major mode for editing CSS."
+  :group 'css
+  :syntax-table css-mode-syntax-table
+
+  (unless (treesit-ready-p nil 'css)
+    (error "Tree Sitter for CSS isn't available"))
+
+  (treesit-parser-create 'css)
+
+  ;; Comments
+  (setq-local comment-start "/*")
+  (setq-local comment-start-skip "/\\*+[ \t]*")
+  (setq-local comment-end "*/")
+  (setq-local comment-end-skip "[ \t]*\\*+/")
+
+  ;; Indent.
+  (setq-local treesit-simple-indent-rules css-ts-mode--indent-rules)
+
+  ;; Navigation.
+  (setq-local treesit-defun-type-regexp "rule_set")
+  ;; Font-lock.
+  (setq-local treesit-font-lock-settings css-ts-mode--settings)
+  (setq-local treesit-font-lock-feature-list '((basic) () ()))
+
+  ;; Imenu.
+  (setq-local imenu-create-index-function #'css-ts-mode--imenu)
+  (setq-local which-func-functions nil) ;; Piggyback on imenu
+
+  (treesit-major-mode-setup))
+
+(provide 'css-ts-mode)
+
+;;; css-ts-mode.el ends here
diff --git a/lisp/progmodes/java-ts-mode.el b/lisp/progmodes/java-ts-mode.el
new file mode 100644
index 0000000000..734a8be471
--- /dev/null
+++ b/lisp/progmodes/java-ts-mode.el
@@ -0,0 +1,289 @@
+;;; java-ts-mode.el --- tree sitter support for Java  -*- lexical-binding: t; -*-
+
+;; Copyright (C) 2022 Free Software Foundation, Inc.
+
+;; Author     : Theodor Thornhill <theo@thornhill.no>
+;; Maintainer : Theodor Thornhill <theo@thornhill.no>
+;; Created    : November 2022
+;; Keywords   : java languages tree-sitter
+
+;; This file is part of GNU Emacs.
+
+;; This program is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; This program is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+
+;;; Commentary:
+;;
+
+;;; Code:
+
+(require 'treesit)
+(require 'rx)
+
+(defcustom java-ts-mode-indent-offset 4
+  "Number of spaces for each indentation step in `java-ts-mode'."
+  :type 'integer
+  :safe 'integerp
+  :group 'java)
+
+(defvar java-ts-mode--syntax-table
+  (let ((table (make-syntax-table)))
+    ;; Taken from the cc-langs version
+    (modify-syntax-entry ?_  "_"     table)
+    (modify-syntax-entry ?\\ "\\"    table)
+    (modify-syntax-entry ?+  "."     table)
+    (modify-syntax-entry ?-  "."     table)
+    (modify-syntax-entry ?=  "."     table)
+    (modify-syntax-entry ?%  "."     table)
+    (modify-syntax-entry ?<  "."     table)
+    (modify-syntax-entry ?>  "."     table)
+    (modify-syntax-entry ?&  "."     table)
+    (modify-syntax-entry ?|  "."     table)
+    (modify-syntax-entry ?\' "\""    table)
+    (modify-syntax-entry ?\240 "."   table)
+    table)
+  "Syntax table for `java-ts-mode'.")
+
+(defvar java-ts-mode--indent-rules
+  `((java
+     ((parent-is "program") parent-bol 0)
+     ((node-is "}") (and parent parent-bol) 0)
+     ((node-is ")") parent-bol 0)
+     ((node-is "]") parent-bol 0)
+     ((parent-is "class_body") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "interface_body") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "constructor_body") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "enum_body") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "switch_block") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "record_declaration_body") parent-bol java-ts-mode-indent-offset)
+     ((query "(method_declaration (block _ @indent))") parent-bol java-ts-mode-indent-offset)
+     ((query "(method_declaration (block (_) @indent))") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "variable_declarator") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "method_invocation") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "switch_rule") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "ternary_expression") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "element_value_array_initializer") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "function_definition") parent-bol 0)
+     ((parent-is "conditional_expression") first-sibling 0)
+     ((parent-is "assignment_expression") parent-bol 2)
+     ((parent-is "binary_expression") parent 0)
+     ((parent-is "parenthesized_expression") first-sibling 1)
+     ((parent-is "argument_list") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "annotation_argument_list") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "modifiers") parent-bol 0)
+     ((parent-is "formal_parameters") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "formal_parameter") parent-bol 0)
+     ((parent-is "init_declarator") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "if_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "for_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "while_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "switch_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "case_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "labeled_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "do_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "block") (and parent parent-bol) java-ts-mode-indent-offset)))
+  "Tree-sitter indent rules.")
+
+(defvar java-ts-mode--keywords
+  '("abstract" "assert" "break" "case" "catch"
+    "class" "continue" "default" "do" "else"
+    "enum" "exports" "extends" "final" "finally"
+    "for" "if" "implements" "import" "instanceof"
+    "interface" "module" "native" "new" "non-sealed"
+    "open" "opens" "package" "private" "protected"
+    "provides" "public" "requires" "return" "sealed"
+    "static" "strictfp" "switch" "synchronized"
+    "throw" "throws" "to" "transient" "transitive"
+    "try" "uses" "volatile" "while" "with" "record")
+  "Java keywords for tree-sitter font-locking.")
+
+(defvar java-ts-mode--operators
+  '("@" "+" ":" "++" "-" "--" "&" "&&" "|" "||"
+    "!=" "==" "*" "/" "%" "<" "<=" ">" ">=" "="
+    "-=" "+=" "*=" "/=" "%=" "->" "^" "^=" "&="
+    "|=" "~" ">>" ">>>" "<<" "::" "?")
+  "Java operators for tree-sitter font-locking.")
+
+(defvar java-ts-mode--font-lock-settings
+  (treesit-font-lock-rules
+   :language 'java
+   :override t
+   :feature 'basic
+   '((identifier) @font-lock-variable-name-face)
+   :language 'java
+   :override t
+   :feature 'comment
+   `((line_comment) @font-lock-comment-face
+     (block_comment) @font-lock-comment-face)
+   :language 'java
+   :override t
+   :feature 'constant
+   `(((identifier) @font-lock-constant-face
+      (:match "^[A-Z_][A-Z_0-9]*$" @font-lock-constant-face))
+     (true) @font-lock-constant-face
+     (false) @font-lock-constant-face)
+   :language 'java
+   :override t
+   :feature 'keyword
+   `([,@java-ts-mode--keywords] @font-lock-keyword-face
+     (labeled_statement
+      (identifier) @font-lock-keyword-face))
+   :language 'java
+   :override t
+   :feature 'operator
+   `([,@java-ts-mode--operators] @font-lock-builtin-face)
+   :language 'java
+   :override t
+   :feature 'annotation
+   `((annotation
+      name: (identifier) @font-lock-constant-face)
+
+     (marker_annotation
+      name: (identifier) @font-lock-constant-face))
+   :language 'java
+   :override t
+   :feature 'string
+   `((string_literal) @font-lock-string-face)
+   :language 'java
+   :override t
+   :feature 'literal
+   `((null_literal) @font-lock-constant-face
+     (decimal_floating_point_literal)  @font-lock-constant-face
+     (hex_floating_point_literal) @font-lock-constant-face)
+   :language 'java
+   :override t
+   :feature 'type
+   '((interface_declaration
+      name: (identifier) @font-lock-type-face)
+
+     (class_declaration
+      name: (identifier) @font-lock-type-face)
+
+     (record_declaration
+      name: (identifier) @font-lock-type-face)
+
+     (enum_declaration
+      name: (identifier) @font-lock-type-face)
+
+     (constructor_declaration
+      name: (identifier) @font-lock-type-face)
+
+     (field_access
+      object: (identifier) @font-lock-type-face)
+
+     (method_reference (identifier) @font-lock-type-face)
+
+     ((scoped_identifier name: (identifier) @font-lock-type-face)
+      (:match "^[A-Z]" @font-lock-type-face))
+
+     (type_identifier) @font-lock-type-face
+
+     [(boolean_type)
+      (integral_type)
+      (floating_point_type)
+      (void_type)] @font-lock-type-face)
+   :language 'java
+   :override t
+   :feature 'definition
+   `((method_declaration
+      name: (identifier) @font-lock-function-name-face)
+
+     (formal_parameter
+      name: (identifier) @font-lock-variable-name-face)
+
+     (catch_formal_parameter
+      name: (identifier) @font-lock-variable-name-face))
+   :language 'java
+   :override t
+   :feature 'expression
+   '((method_invocation
+      object: (identifier) @font-lock-variable-name-face)
+
+     (method_invocation
+      name: (identifier) @font-lock-function-name-face)
+
+     (argument_list (identifier) @font-lock-variable-name-face)))
+  "Tree-sitter font-lock settings.")
+
+(defun java-ts-mode--imenu-1 (node)
+  "Helper for `java-ts-mode--imenu'.
+Find string representation for NODE and set marker, then recurse
+the subtrees."
+  (let* ((ts-node (car node))
+         (subtrees (mapcan #'java-ts-mode--imenu-1 (cdr node)))
+         (name (when ts-node
+                 (or (treesit-node-text
+                      (treesit-node-child-by-field-name
+                       ts-node "name")
+                      t)
+                     "Unnamed node")))
+         (marker (when ts-node
+                   (set-marker (make-marker)
+                               (treesit-node-start ts-node)))))
+    (cond
+     ((null ts-node) subtrees)
+     (subtrees
+      `((,name ,(cons name marker) ,@subtrees)))
+     (t
+      `((,name . ,marker))))))
+
+(defun java-ts-mode--imenu ()
+  "Return Imenu alist for the current buffer."
+  (let* ((node (treesit-buffer-root-node))
+         (tree (treesit-induce-sparse-tree
+                node (rx (or "class_declaration"
+                             "interface_declaration"
+                             "enum_declaration"
+                             "record_declaration"
+                             "method_declaration")))))
+    (java-ts-mode--imenu-1 tree)))
+
+;;;###autoload
+(define-derived-mode java-ts-mode prog-mode "Java"
+  "Major mode for editing Java, powered by Tree Sitter."
+  :group 'java
+  :syntax-table java-ts-mode--syntax-table
+
+  (unless (treesit-ready-p nil 'java)
+    (error "Tree-sitter for Java isn't available"))
+
+  (treesit-parser-create 'java)
+
+  ;; Comments.
+  (setq-local comment-start "// ")
+  (setq-local comment-start-skip "\\(?://+\\|/\\*+\\)\\s *")
+  (setq-local comment-end "")
+
+  ;; Indent.
+  (setq-local treesit-simple-indent-rules java-ts-mode--indent-rules)
+
+  ;; Navigation.
+  (setq-local treesit-defun-type-regexp
+              (rx (or "declaration")))
+
+  ;; Font-lock.
+  (setq-local treesit-font-lock-settings java-ts-mode--font-lock-settings)
+  (setq-local treesit-font-lock-feature-list
+              '((basic comment keyword constant string operator)
+                (type definition expression literal annotation)
+                ()))
+
+  ;; Imenu.
+  (setq-local imenu-create-index-function #'java-ts-mode--imenu)
+  (setq-local which-func-functions nil) ;; Piggyback on imenu
+  (treesit-major-mode-setup))
+
+(provide 'java-ts-mode)
+
+;;; java-ts-mode.el ends here
diff --git a/lisp/progmodes/json-ts-mode.el b/lisp/progmodes/json-ts-mode.el
new file mode 100644
index 0000000000..13eb5b78a9
--- /dev/null
+++ b/lisp/progmodes/json-ts-mode.el
@@ -0,0 +1,150 @@
+;;; json-ts-mode.el --- tree sitter support for JSON  -*- lexical-binding: t; -*-
+
+;; Copyright (C) 2022 Free Software Foundation, Inc.
+
+;; Author     : Theodor Thornhill <theo@thornhill.no>
+;; Maintainer : Theodor Thornhill <theo@thornhill.no>
+;; Created    : November 2022
+;; Keywords   : json languages tree-sitter
+
+;; This file is part of GNU Emacs.
+
+;; This program is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; This program is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+
+;;; Commentary:
+;;
+
+;;; Code:
+
+(require 'treesit)
+(require 'rx)
+
+(defcustom json-ts-mode-indent-offset 2
+  "Number of spaces for each indentation step in `json-ts-mode'."
+  :type 'integer
+  :safe 'integerp
+  :group 'json)
+
+(defvar json-ts-mode--syntax-table
+  (let ((table (make-syntax-table)))
+    ;; Taken from the cc-langs version
+    (modify-syntax-entry ?_  "_"     table)
+    (modify-syntax-entry ?$ "_"      table)
+    (modify-syntax-entry ?\\ "\\"    table)
+    (modify-syntax-entry ?+  "."     table)
+    (modify-syntax-entry ?-  "."     table)
+    (modify-syntax-entry ?=  "."     table)
+    (modify-syntax-entry ?%  "."     table)
+    (modify-syntax-entry ?<  "."     table)
+    (modify-syntax-entry ?>  "."     table)
+    (modify-syntax-entry ?&  "."     table)
+    (modify-syntax-entry ?|  "."     table)
+    (modify-syntax-entry ?` "\""     table)
+    (modify-syntax-entry ?\240 "."   table)
+    table)
+  "Syntax table for `json-ts-mode'.")
+
+
+(defvar json-ts--indent-rules
+  `((json
+     ((node-is "}") parent-bol 0)
+     ((node-is ")") parent-bol 0)
+     ((node-is "]") parent-bol 0)
+     ((parent-is "object") parent-bol json-ts-mode-indent-offset))))
+
+(defvar json-ts-mode--font-lock-settings
+  (treesit-font-lock-rules
+   :language 'json
+   :feature 'minimal
+   :override t
+   `((pair
+      key: (_) @font-lock-string-face)
+
+     (string) @font-lock-string-face
+
+     (number) @font-lock-constant-face
+
+     [(null) (true) (false)] @font-lock-constant-face
+
+     (escape_sequence) @font-lock-constant-face
+
+     (comment) @font-lock-comment-face))
+  "Font-lock settings for JSON.")
+
+(defun json-ts-mode--imenu-1 (node)
+  "Helper for `json-ts-mode--imenu'.
+Find string representation for NODE and set marker, then recurse
+the subtrees."
+  (let* ((ts-node (car node))
+         (subtrees (mapcan #'json-ts-mode--imenu-1 (cdr node)))
+         (name (when ts-node
+                 (treesit-node-text
+                  (treesit-node-child-by-field-name
+                   ts-node "key")
+                  t)))
+         (marker (when ts-node
+                   (set-marker (make-marker)
+                               (treesit-node-start ts-node)))))
+    (cond
+     ((null ts-node) subtrees)
+     (subtrees
+      `((,name ,(cons name marker) ,@subtrees)))
+     (t
+      `((,name . ,marker))))))
+
+(defun json-ts-mode--imenu ()
+  "Return Imenu alist for the current buffer."
+  (let* ((node (treesit-buffer-root-node))
+         (tree (treesit-induce-sparse-tree
+                node "pair")))
+    (json-ts-mode--imenu-1 tree)))
+
+;;;###autoload
+(define-derived-mode json-ts-mode prog-mode "JSON"
+  "Major mode for editing JSON, powered by Tree Sitter."
+  :group 'json
+  :syntax-table json-ts-mode--syntax-table
+
+  (unless (treesit-ready-p nil 'json)
+    (error "Tree Sitter for JSON isn't available"))
+
+  (treesit-parser-create 'json)
+
+  ;; Comments.
+  (setq-local comment-start "// ")
+  (setq-local comment-start-skip "\\(?://+\\|/\\*+\\)\\s *")
+  (setq-local comment-end "")
+
+  ;; Indent.
+  (setq-local treesit-simple-indent-rules json-ts--indent-rules)
+
+  ;; Navigation.
+  (setq-local treesit-defun-type-regexp
+              (rx (or "pair" "object")))
+
+  ;; Font-lock.
+  (setq-local treesit-font-lock-settings json-ts-mode--font-lock-settings)
+  (setq-local treesit-font-lock-feature-list
+              '((minimal) () ()))
+
+  ;; Imenu.
+  (setq-local imenu-create-index-function #'json-ts-mode--imenu)
+  (setq-local which-func-functions nil) ;; Piggyback on imenu
+
+  (treesit-major-mode-setup))
+
+(provide 'json-ts-mode)
+
+;;; json-ts-mode.el ends here
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 83+ messages in thread

* Re: Tree sitter support for C-like languages
  2022-11-11  5:50   ` Theodor Thornhill
@ 2022-11-11 13:37     ` Stefan Monnier
  2022-11-11 15:09       ` Theodor Thornhill
  2022-11-11 15:54     ` Randy Taylor
  1 sibling, 1 reply; 83+ messages in thread
From: Stefan Monnier @ 2022-11-11 13:37 UTC (permalink / raw)
  To: Theodor Thornhill; +Cc: Randy Taylor, emacs-devel, casouri

> Yeah, I'm interested in reducing duplication, but not for that reason
> only.  But I'm thinking of ways to make one inherit the other.

I like inheritance between modes, but I recommend that you never inherit
from a "normal mode": modes should be either designed for inheritance
(`prog-mode`, `text-mode`, `special-mode`, `tex-mode`, ...) or be
actually used in buffers, but preferably not both.

So better create a "c-like" parent mode from which both `c-ts-mode` and
`c++-ts-mode` inherit than have one inherit from the other.
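
The layout described above could look roughly like the following.  This
is only a minimal sketch: the names `my-c-like-ts-base-mode`,
`my-c-ts-mode` and `my-c++-ts-mode` are hypothetical and not part of the
patch, and the shared settings are simply the comment variables that the
patch currently sets in each mode.  Tree-sitter setup is left out so the
sketch runs without any grammar installed.

```elisp
;; Hypothetical sketch; none of these mode names exist in the patch.

;; Abstract parent: holds the settings common to the C-like modes.
;; It is designed for inheritance and not meant to be enabled directly.
(define-derived-mode my-c-like-ts-base-mode prog-mode "C-like"
  "Parent mode holding settings shared by C-like modes."
  (setq-local comment-start "// ")
  (setq-local comment-start-skip "\\(?://+\\|/\\*+\\)\\s *")
  (setq-local comment-end ""))

;; Concrete modes: these are the ones actually used in buffers.
;; Language-specific setup (parser, indent rules, font-lock) would
;; go in their bodies.
(define-derived-mode my-c-ts-mode my-c-like-ts-base-mode "C"
  "Illustrative C mode inheriting from the shared parent.")

(define-derived-mode my-c++-ts-mode my-c-like-ts-base-mode "C++"
  "Illustrative C++ mode inheriting from the shared parent.")
```

With this arrangement neither concrete mode inherits from the other, and
code that needs to recognize either one can test
`(derived-mode-p 'my-c-like-ts-base-mode)`.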


        Stefan




^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Tree sitter support for C-like languages
  2022-11-11 13:37     ` Stefan Monnier
@ 2022-11-11 15:09       ` Theodor Thornhill
  0 siblings, 0 replies; 83+ messages in thread
From: Theodor Thornhill @ 2022-11-11 15:09 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Randy Taylor, emacs-devel, casouri



On 11 November 2022 14:37:02 CET, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
>> Yeah, I'm interested in reducing duplication, but not for that reason
>> only.  But I'm thinking of ways to make one inherit the other.
>
>I like inheritance between modes, but I recommend that you never inherit
>from a "normal mode": modes should be either designed for inheritance
>(`prog-mode`, `text-mode`, `special-mode`, `tex-mode`, ...) or be
>actually used in buffers, but preferably not both.
>
>So better create a "c-like" parent mode from which both `c-ts-mode` and
>`c++-ts-mode` inherit than have one inherit from the other.
>
>
>        Stefan
>

Sounds like a plan!
Theo




* Re: Tree sitter support for C-like languages
  2022-11-11  5:50   ` Theodor Thornhill
  2022-11-11 13:37     ` Stefan Monnier
@ 2022-11-11 15:54     ` Randy Taylor
  2022-11-13  8:37       ` Theodor Thornhill
  1 sibling, 1 reply; 83+ messages in thread
From: Randy Taylor @ 2022-11-11 15:54 UTC (permalink / raw)
  To: Theodor Thornhill; +Cc: emacs-devel, casouri

On Friday, November 11th, 2022 at 00:50, Theodor Thornhill <theo@thornhill.no> wrote:
>
> Yeah, I'm interested in reducing duplication, but not for that reason only. But I'm thinking of ways to make one inherit the other.
> 
> How about we first get this merged, then new faces on top of that?

Sounds good.




* Re: Tree sitter support for C-like languages
  2022-11-11  6:01   ` Theodor Thornhill via Emacs development discussions.
@ 2022-11-12  5:43     ` Yuan Fu
  2022-11-12  6:13       ` Po Lu
                         ` (3 more replies)
  2022-11-12 12:21     ` Eli Zaretskii
  1 sibling, 4 replies; 83+ messages in thread
From: Yuan Fu @ 2022-11-12  5:43 UTC (permalink / raw)
  To: Theodor Thornhill; +Cc: emacs-devel



> On Nov 10, 2022, at 10:01 PM, Theodor Thornhill <theo@thornhill.no> wrote:
> 
> Yuan Fu <casouri@gmail.com> writes:
> 
>>> On Nov 10, 2022, at 9:45 AM, Theodor Thornhill <theo@thornhill.no> wrote:
>>> 
>>> 
>>> 
>>> Hi all,
>>> 
>>> See the attached patch for support for several C-like languages.
>>> 
>>> They all support:
>>> - Font locking
>>> - Indentation (with styles for c/c++)
>>> - Movement
>>> - Imenu
>>> - Which-func
>>> 
>>> These modes are meant as a supplement to tree-sitter.
>>> 
>>> I'm hopeful for some constructive criticism, and some testing.  This
>>> patch needs to be applied to the feature/tree-sitter branch, and should
>>> hopefully be applied there before we merge the branch to master, well
>>> before Emacs 29 is cut.
>>> 
>>> I hope you like it,
>>> 
>>> Theo
>> 
>> This is fantastic! I’m trying them out right now :-)
>> 
>> Some things I noticed:
>> 
>> The indentation for the closing bracket of a struct is off:
>> 
>> struct regexp_cache
>> {
>>  struct regexp_cache *next;
>>  };
>> 
> 
> Yeah.
> 
>> Imenu has some duplicate entries, the patch below should fix that.
>> 
>> I also added the new contextual thingy to font-lock settings.
> 
> Thanks, added!
> 
> See below patch.  The struct issue was an ordering problem when I
> extracted the styles.  Should be good now
> 
> Theo
> 
> <0001-Add-Tree-Sitter-modes-for-C-like-languages.patch>

I noticed that with the default indent style, Emacs indents like this:

int main () {
              for (int j = 0; j < 5; j++)
                a[j] = 3;
              int i = 1;
              swap(i, a[i+1]);
              Point p = {0, 1};
}

int main ()
{
  for (int j = 0; j < 5; j++)
    a[j] = 3;
  int i = 1;
  swap(i, a[i+1]);
  Point p = {0, 1};
}

Is this expected?

Yuan



* Re: Tree sitter support for C-like languages
  2022-11-12  5:43     ` Yuan Fu
@ 2022-11-12  6:13       ` Po Lu
  2022-11-12  6:17         ` Yuan Fu
  2022-11-12  6:16       ` Theodor Thornhill
                         ` (2 subsequent siblings)
  3 siblings, 1 reply; 83+ messages in thread
From: Po Lu @ 2022-11-12  6:13 UTC (permalink / raw)
  To: Yuan Fu; +Cc: Theodor Thornhill, emacs-devel

Yuan Fu <casouri@gmail.com> writes:

> int main ()
> {
>   for (int j = 0; j < 5; j++)
>     a[j] = 3;
>   int i = 1;
>   swap(i, a[i+1]);
>   Point p = {0, 1};
> }
>
> Is this expected?

Yes? c-basic-offset is 2 by default.




* Re: Tree sitter support for C-like languages
  2022-11-12  5:43     ` Yuan Fu
  2022-11-12  6:13       ` Po Lu
@ 2022-11-12  6:16       ` Theodor Thornhill
  2022-11-12  6:25         ` Yuan Fu
  2022-11-12  8:08         ` Eli Zaretskii
  2022-11-12  7:22       ` Theodor Thornhill via Emacs development discussions.
  2022-11-12  8:05       ` Eli Zaretskii
  3 siblings, 2 replies; 83+ messages in thread
From: Theodor Thornhill @ 2022-11-12  6:16 UTC (permalink / raw)
  To: Yuan Fu; +Cc: emacs-devel



On 12 November 2022 06:43:21 CET, Yuan Fu <casouri@gmail.com> wrote:
>
>
>> On Nov 10, 2022, at 10:01 PM, Theodor Thornhill <theo@thornhill.no> wrote:
>> 
>> Yuan Fu <casouri@gmail.com> writes:
>> 
>>>> On Nov 10, 2022, at 9:45 AM, Theodor Thornhill <theo@thornhill.no> wrote:
>>>> 
>>>> 
>>>> 
>>>> Hi all,
>>>> 
>>>> See the attached patch for support for several C-like languages.
>>>> 
>>>> They all support:
>>>> - Font locking
>>>> - Indentation (with styles for c/c++)
>>>> - Movement
>>>> - Imenu
>>>> - Which-func
>>>> 
>>>> These modes are meant as a supplement to tree-sitter.
>>>> 
>>>> I'm hopeful for some constructive criticism, and some testing.  This
>>>> patch needs to be applied to the feature/tree-sitter branch, and should
>>>> hopefully be applied there before we merge the branch to master, well
>>>> before Emacs 29 is cut.
>>>> 
>>>> I hope you like it,
>>>> 
>>>> Theo
>>> 
>>> This is fantastic! I’m trying them out right now :-)
>>> 
>>> Some things I noticed:
>>> 
>>> The indentation for the closing bracket of a struct is off:
>>> 
>>> struct regexp_cache
>>> {
>>>  struct regexp_cache *next;
>>>  };
>>> 
>> 
>> Yeah.
>> 
>>> Imenu has some duplicate entries, the patch below should fix that.
>>> 
>>> I also added the new contextual thingy to font-lock settings.
>> 
>> Thanks, added!
>> 
>> See below patch.  The struct issue was an ordering problem when I
>> extracted the styles.  Should be good now
>> 
>> Theo
>> 
>> <0001-Add-Tree-Sitter-modes-for-C-like-languages.patch>
>
>I noticed that with the default indent style, Emacs indents like this:
>
>int main () {
>              for (int j = 0; j < 5; j++)
>                a[j] = 3;
>              int i = 1;
>              swap(i, a[i+1]);
>              Point p = {0, 1};
>}
>
>int main ()
>{
>  for (int j = 0; j < 5; j++)
>    a[j] = 3;
>  int i = 1;
>  swap(i, a[i+1]);
>  Point p = {0, 1};
>}
>
>Is this expected?
>
>Yuan

The first one isn't gnu style, right? But it's an easy fix :)




* Re: Tree sitter support for C-like languages
  2022-11-12  6:13       ` Po Lu
@ 2022-11-12  6:17         ` Yuan Fu
  2022-11-12  6:43           ` Po Lu
  0 siblings, 1 reply; 83+ messages in thread
From: Yuan Fu @ 2022-11-12  6:17 UTC (permalink / raw)
  To: Po Lu; +Cc: Theodor Thornhill, emacs-devel



> On Nov 11, 2022, at 10:13 PM, Po Lu <luangruo@yahoo.com> wrote:
> 
> Yuan Fu <casouri@gmail.com> writes:
> 
>> int main ()
>> {
>>  for (int j = 0; j < 5; j++)
>>    a[j] = 3;
>>  int i = 1;
>>  swap(i, a[i+1]);
>>  Point p = {0, 1};
>> }
>> 
>> Is this expected?
> 
> Yes? c-basic-offset is 2 by default.

I should clarify: the indentation of the second snippet is not surprising; I just wanted to show the difference between the two. The first snippet is what I didn’t expect.

Yuan



* Re: Tree sitter support for C-like languages
  2022-11-12  6:16       ` Theodor Thornhill
@ 2022-11-12  6:25         ` Yuan Fu
  2022-11-12  6:37           ` Theodor Thornhill
  2022-11-12  8:08         ` Eli Zaretskii
  1 sibling, 1 reply; 83+ messages in thread
From: Yuan Fu @ 2022-11-12  6:25 UTC (permalink / raw)
  To: Theodor Thornhill; +Cc: emacs-devel

>> 
>> I noticed that with the default indent style, Emacs indents like this:
>> 
>> int main () {
>>             for (int j = 0; j < 5; j++)
>>               a[j] = 3;
>>             int i = 1;
>>             swap(i, a[i+1]);
>>             Point p = {0, 1};
>> }
>> 
>> int main ()
>> {
>>   for (int j = 0; j < 5; j++)
>>     a[j] = 3;
>>   int i = 1;
>>   swap(i, a[i+1]);
>>   Point p = {0, 1};
>> }
>> 
>> Is this expected?
>> 
>> Yuan
> 
> The first one isn't gnu style, right? But it's an easy fix :)

Thanks. It isn’t gnu style, but it would be nice if the first snippet still indents normally. (So that I don’t need to set the indent style for different projects.)
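
For context, the patch in this thread selects the style through
`c-ts-mode-indent-style`, whose docstring says it may also be a function
returning rules in the form of `treesit-simple-indent-rules`.  A rough
sketch of that route follows.  The helper `my-c-ts-indent-rules` is
hypothetical, the rule list is illustrative rather than a complete style,
and it has not been tested against the grammar.  It also assumes
`c-ts-mode` reads the option while the mode is being set up, as
`c++-ts-mode` does in the patch:

```elisp
;; Hypothetical helper; not part of the patch.
(defun my-c-ts-indent-rules ()
  "Return a small, illustrative rule list for `c-ts-mode-indent-style'.
Each rule is (MATCHER ANCHOR OFFSET); the matchers and anchors used
here are ones that also appear in the patch."
  '(((parent-is "translation_unit") parent-bol 0)
    ;; Anchor at the line holding the opening brace, not at the brace.
    ((node-is "}") parent-bol 0)
    ((parent-is "compound_statement") parent-bol 4)
    (no-node parent-bol 4)))

;; Assumption: this runs before `c-ts-mode' is enabled in a buffer.
(setq c-ts-mode-indent-style #'my-c-ts-indent-rules)
```

Anchoring block contents at `parent-bol` is intended to give the same
relative indentation whether the opening brace ends the declaration line
or sits on a line of its own.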

Yuan





* Re: Tree sitter support for C-like languages
  2022-11-12  6:25         ` Yuan Fu
@ 2022-11-12  6:37           ` Theodor Thornhill
  0 siblings, 0 replies; 83+ messages in thread
From: Theodor Thornhill @ 2022-11-12  6:37 UTC (permalink / raw)
  To: Yuan Fu; +Cc: emacs-devel



On 12 November 2022 07:25:13 CET, Yuan Fu <casouri@gmail.com> wrote:
>>> 
>>> I noticed that with the default indent style, Emacs indents like this:
>>> 
>>> int main () {
>>>             for (int j = 0; j < 5; j++)
>>>               a[j] = 3;
>>>             int i = 1;
>>>             swap(i, a[i+1]);
>>>             Point p = {0, 1};
>>> }
>>> 
>>> int main ()
>>> {
>>>   for (int j = 0; j < 5; j++)
>>>     a[j] = 3;
>>>   int i = 1;
>>>   swap(i, a[i+1]);
>>>   Point p = {0, 1};
>>> }
>>> 
>>> Is this expected?
>>> 
>>> Yuan
>> 
>> The first one isn't gnu style, right? But it's an easy fix :)
>
>Thanks. It isn’t gnu style, but it would be nice if the first snippet still indents normally. (So that I don’t need to set the indent style for different projects.)
>
>Yuan
>

I agree! I'll fix it :)




* Re: Tree sitter support for C-like languages
  2022-11-12  6:17         ` Yuan Fu
@ 2022-11-12  6:43           ` Po Lu
  0 siblings, 0 replies; 83+ messages in thread
From: Po Lu @ 2022-11-12  6:43 UTC (permalink / raw)
  To: Yuan Fu; +Cc: Theodor Thornhill, emacs-devel

Yuan Fu <casouri@gmail.com> writes:

> I should clarify, indentation of the second snippet is not surprising,
> I just wanted to show the difference between the two. The first
> snippet is what I didn’t expect.

Ah.  Then, sorry for the noise.




* Re: Tree sitter support for C-like languages
  2022-11-12  5:43     ` Yuan Fu
  2022-11-12  6:13       ` Po Lu
  2022-11-12  6:16       ` Theodor Thornhill
@ 2022-11-12  7:22       ` Theodor Thornhill via Emacs development discussions.
  2022-11-12  8:05       ` Eli Zaretskii
  3 siblings, 0 replies; 83+ messages in thread
From: Theodor Thornhill via Emacs development discussions. @ 2022-11-12  7:22 UTC (permalink / raw)
  To: Yuan Fu; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 440 bytes --]

>
> I noticed that with the default indent style, Emacs indents like this:
>
> int main () {
>               for (int j = 0; j < 5; j++)
>                 a[j] = 3;
>               int i = 1;
>               swap(i, a[i+1]);
>               Point p = {0, 1};
> }
>
> int main ()
> {
>   for (int j = 0; j < 5; j++)
>     a[j] = 3;
>   int i = 1;
>   swap(i, a[i+1]);
>   Point p = {0, 1};
> }
>
> Is this expected?

Try this one instead:



[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Add-Tree-Sitter-modes-for-C-like-languages.patch --]
[-- Type: text/x-diff, Size: 56651 bytes --]

From dbcd64db88a665393297996fba165b8adc66ba50 Mon Sep 17 00:00:00 2001
From: Theodor Thornhill <theo@thornhill.no>
Date: Thu, 10 Nov 2022 17:15:49 +0100
Subject: [PATCH] Add Tree Sitter modes for C-like languages

* etc/NEWS: Mention the new modes
* lisp/progmodes/c++-ts-mode.el: New major mode with Tree Sitter support.
* lisp/progmodes/c-ts-mode.el: New major mode with Tree Sitter support.
* lisp/progmodes/java-ts-mode.el: New major mode with Tree Sitter support.
* lisp/progmodes/json-ts-mode.el: New major mode with Tree Sitter support.
* lisp/progmodes/css-ts-mode.el: New major mode with Tree Sitter support.
---
 etc/NEWS                       |  28 ++-
 lisp/progmodes/c++-ts-mode.el  | 424 +++++++++++++++++++++++++++++++++
 lisp/progmodes/c-ts-mode.el    | 414 ++++++++++++++++++++++++++++++++
 lisp/progmodes/css-ts-mode.el  | 131 ++++++++++
 lisp/progmodes/java-ts-mode.el | 289 ++++++++++++++++++++++
 lisp/progmodes/json-ts-mode.el | 150 ++++++++++++
 6 files changed, 1434 insertions(+), 2 deletions(-)
 create mode 100644 lisp/progmodes/c++-ts-mode.el
 create mode 100644 lisp/progmodes/c-ts-mode.el
 create mode 100644 lisp/progmodes/css-ts-mode.el
 create mode 100644 lisp/progmodes/java-ts-mode.el
 create mode 100644 lisp/progmodes/json-ts-mode.el

diff --git a/etc/NEWS b/etc/NEWS
index 9ed78bc6b3..3ce9810ece 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -2786,8 +2786,32 @@ when visiting JSON files.
 ** New mode ts-mode'.
 Support is added for TypeScript, based on the new integration with
 Tree-Sitter. There's support for font-locking, indentation and
-navigation.  Tree-Sitter is required for this mode to function, but if
-it is not available, we will default to use 'js-mode'.
+navigation.  Tree-Sitter is required for this mode to function.
+
+** New mode 'c-ts-mode'.
+Support is added for C, based on the new integration with
+Tree-Sitter. There's support for font-locking, indentation and
+navigation.  Tree-Sitter is required for this mode to function.
+
+** New mode 'c++-ts-mode'.
+Support is added for C++, based on the new integration with
+Tree-Sitter. There's support for font-locking, indentation and
+navigation.  Tree-Sitter is required for this mode to function.
+
+** New mode 'java-ts-mode'.
+Support is added for Java, based on the new integration with
+Tree-Sitter. There's support for font-locking, indentation and
+navigation.  Tree-Sitter is required for this mode to function.
+
+** New mode 'css-ts-mode'.
+Support is added for CSS, based on the new integration with
+Tree-Sitter. There's support for font-locking, indentation and
+navigation.  Tree-Sitter is required for this mode to function.
+
+** New mode 'json-ts-mode'.
+Support is added for JSON, based on the new integration with
+Tree-Sitter. There's support for font-locking, indentation and
+navigation.  Tree-Sitter is required for this mode to function.
 
 \f
 * Incompatible Lisp Changes in Emacs 29.1
diff --git a/lisp/progmodes/c++-ts-mode.el b/lisp/progmodes/c++-ts-mode.el
new file mode 100644
index 0000000000..7e3bb7d141
--- /dev/null
+++ b/lisp/progmodes/c++-ts-mode.el
@@ -0,0 +1,424 @@
+;;; c++-ts-mode.el --- tree sitter support for C++  -*- lexical-binding: t; -*-
+
+;; Copyright (C) 2022 Free Software Foundation, Inc.
+
+;; Author     : Theodor Thornhill <theo@thornhill.no>
+;; Maintainer : Theodor Thornhill <theo@thornhill.no>
+;; Created    : November 2022
+;; Keywords   : c++ languages tree-sitter
+
+;; This file is part of GNU Emacs.
+
+;; This program is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; This program is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+
+;;; Commentary:
+;;
+
+;;; Code:
+
+(require 'treesit)
+(require 'rx)
+
+(defcustom c++-ts-mode-indent-offset 2
+  "Number of spaces for each indentation step in `c++-ts-mode'."
+  :type 'integer
+  :safe 'integerp
+  :group 'cpp)
+
+(defcustom c++-ts-mode-indent-style 'gnu
+  "Style used for indentation.
+
+The selected style can be one of GNU, K&R, LINUX or BSD.  If none
+of the supplied styles suffices, a function can be set instead.
+This function is expected to return a list that follows the form
+of `treesit-simple-indent-rules'."
+  :type '(choice (const :tag "Gnu" gnu)
+                 (const :tag "K&R" k&r)
+                 (const :tag "Linux" linux)
+                 (const :tag "BSD" bsd)
+                 (function :tag "A function for user customized style" ignore))
+  :group 'cpp)
+
+(defvar c++-ts-mode--syntax-table
+  (let ((table (make-syntax-table)))
+    ;; Taken from the cc-langs version
+    (modify-syntax-entry ?_  "_"     table)
+    (modify-syntax-entry ?\\ "\\"    table)
+    (modify-syntax-entry ?+  "."     table)
+    (modify-syntax-entry ?-  "."     table)
+    (modify-syntax-entry ?=  "."     table)
+    (modify-syntax-entry ?%  "."     table)
+    (modify-syntax-entry ?<  "."     table)
+    (modify-syntax-entry ?>  "."     table)
+    (modify-syntax-entry ?&  "."     table)
+    (modify-syntax-entry ?|  "."     table)
+    (modify-syntax-entry ?\' "\""    table)
+    (modify-syntax-entry ?\240 "."   table)
+    (modify-syntax-entry ?/  ". 124b" table)
+    (modify-syntax-entry ?*  ". 23"   table)
+    table)
+  "Syntax table for `c++-ts-mode'.")
+
+(defvar c++-ts-mode--indent-styles
+  (let ((common
+         `(((parent-is "translation_unit") parent-bol 0)
+           ((node-is ")") parent 1)
+           ((node-is "]") parent-bol 0)
+           ((node-is "else") parent-bol 0)
+           ((node-is "case") parent-bol 0)
+           ((node-is "comment") no-indent)
+           ((parent-is "comment") no-indent)
+           ((node-is "labeled_statement") parent-bol 0)
+           ((parent-is "labeled_statement") parent-bol c++-ts-mode-indent-offset)
+           ((match "preproc_ifdef" "compound_statement") point-min 0)
+           ((match "#endif" "preproc_ifdef") point-min 0)
+           ((match "preproc_if" "compound_statement") point-min 0)
+           ((match "#endif" "preproc_if") point-min 0)
+           ((match "preproc_function_def" "compound_statement") point-min 0)
+           ((match "preproc_call" "compound_statement") point-min 0)
+           ((parent-is "field_declaration_list") parent-bol c++-ts-mode-indent-offset)
+           ((node-is "field_initializer_list") parent-bol ,(* c++-ts-mode-indent-offset 2))
+           ((parent-is "function_definition") parent-bol 0)
+           ((parent-is "conditional_expression") first-sibling 0)
+           ((parent-is "assignment_expression") parent-bol c++-ts-mode-indent-offset)
+           ((parent-is "comma_expression") first-sibling 0)
+           ((parent-is "init_declarator") parent-bol c++-ts-mode-indent-offset)
+           ((parent-is "parenthesized_expression") first-sibling 1)
+           ((parent-is "argument_list") first-sibling 1)
+           ((parent-is "parameter_list") first-sibling 1)
+           ((parent-is "binary_expression") parent 0)
+           ((query "(for_statement initializer: (_) @indent)") parent-bol 5)
+           ((query "(for_statement condition: (_) @indent)") parent-bol 5)
+           ((query "(for_statement update: (_) @indent)") parent-bol 5)
+           ((query "(call_expression arguments: (_) @indent)") parent c++-ts-mode-indent-offset))))
+    `((gnu
+       ,@common
+       ((node-is "}") parent-bol 0)
+       ((parent-is "enumerator_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "field_declaration_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "initializer_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "compound_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "if_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "for_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "while_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "switch_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "case_statement") parent-bol c++-ts-mode-indent-offset)
+       ((match "while" "do_statement") parent 0)
+       ((parent-is "do_statement") parent-bol c++-ts-mode-indent-offset)
+       (no-node parent-bol c++-ts-mode-indent-offset))
+      (k&r
+       ,@common
+       ((node-is "}") grand-parent 0)
+       ((parent-is "enumerator_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "field_declaration_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "initializer_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "compound_statement") grand-parent c++-ts-mode-indent-offset)
+       ((parent-is "if_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "for_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "while_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "switch_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "case_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "do_statement") parent-bol c++-ts-mode-indent-offset)
+       (no-node parent-bol c++-ts-mode-indent-offset))
+      (linux
+       ,@common
+       ((node-is "}") grand-parent 0)
+       ((parent-is "enumerator_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "field_declaration_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "initializer_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "compound_statement") grand-parent c++-ts-mode-indent-offset)
+       ((parent-is "if_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "for_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "while_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "switch_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "case_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "do_statement") parent-bol c++-ts-mode-indent-offset)
+       (no-node parent-bol c++-ts-mode-indent-offset))
+      (bsd
+       ,@common
+       ((node-is "}") parent-bol 0)
+       ((parent-is "enumerator_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "field_declaration_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "initializer_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "compound_statement") parent 0)
+       ((parent-is "if_statement") parent-bol 0)
+       ((parent-is "for_statement") parent-bol 0)
+       ((parent-is "while_statement") parent-bol 0)
+       ((parent-is "switch_statement") parent-bol 0)
+       ((parent-is "case_statement") parent-bol 0)
+       ((parent-is "do_statement") parent-bol 0)
+       (no-node parent-bol c++-ts-mode-indent-offset))))
+  "Indent rules supported by `c++-ts-mode'.")
+
+(defun c++-ts-mode--set-indent-style ()
+  "Helper function to set indentation style."
+  (let ((style
+         (if (functionp c++-ts-mode-indent-style)
+             (funcall c++-ts-mode-indent-style)
+           (pcase c++-ts-mode-indent-style
+             ('gnu   (alist-get 'gnu c++-ts-mode--indent-styles))
+             ('k&r   (alist-get 'k&r c++-ts-mode--indent-styles))
+             ('bsd   (alist-get 'bsd c++-ts-mode--indent-styles))
+             ('linux (alist-get 'linux c++-ts-mode--indent-styles))))))
+    `((cpp ,@style))))
+
+(defvar c++-ts-mode--keywords
+  '("and" "and_eq" "bitand" "bitor" "catch" "class" "co_await"
+    "co_return" "co_yield" "compl" "concept" "consteval" "constexpr"
+    "constinit" "decltype" "delete" "else" "explicit" "final" "for"
+    "friend" "if" "mutable" "namespace" "new" "noexcept"
+    "not" "not_eq" "operator" "or" "or_eq" "override" "private"
+    "protected" "public" "requires" "return" "static" "struct"
+    "template" "throw" "try" "typename" "using" "virtual" "xor" "xor_eq"
+    "switch" "case")
+  "C++ keywords for tree-sitter font-locking.")
+
+(defvar c++-ts-mode--preproc-keywords
+  '("#define" "#if" "#ifdef" "#ifndef" "#else" "#elif" "#endif" "#include")
+  "C++ preprocessor keywords for tree-sitter font-locking.")
+
+(defvar c++-ts-mode--operators
+  '("=" "-" "*" "/" "+" "%" "~" "|" "&" "^" "<<" ">>" "->"
+    "." "<" "<=" ">=" ">" "==" "!=" "!" "&&" "||" "-="
+    "+=" "*=" "/=" "%=" "|=" "&=" "^=" ">>=" "<<=" "--" "++")
+  "C++ operators for tree-sitter font-locking.")
+
+(defvar c++-ts-mode--font-lock-settings
+  (treesit-font-lock-rules
+   :language 'cpp
+   :override t
+   :feature 'comment
+   `((comment) @font-lock-comment-face
+     (comment) @contextual)
+   :language 'cpp
+   :override t
+   :feature 'preprocessor
+   `((preproc_directive) @font-lock-preprocessor-face
+
+     (preproc_def
+      name: (identifier) @font-lock-variable-name-face)
+
+     (preproc_ifdef
+      name: (identifier) @font-lock-variable-name-face)
+
+     (preproc_function_def
+      name: (identifier) @font-lock-function-name-face)
+
+     (preproc_params
+      (identifier) @font-lock-variable-name-face)
+
+     (preproc_defined) @font-lock-preprocessor-face
+     (preproc_defined (identifier) @font-lock-variable-name-face)
+     [,@c++-ts-mode--preproc-keywords] @font-lock-preprocessor-face)
+   :language 'cpp
+   :override t
+   :feature 'constant
+   `((true) @font-lock-constant-face
+     (false) @font-lock-constant-face
+     (null) @font-lock-constant-face
+     (this) @font-lock-constant-face)
+   :language 'cpp
+   :override t
+   :feature 'keyword
+   `([,@c++-ts-mode--keywords] @font-lock-keyword-face
+     (auto) @font-lock-keyword-face)
+   :language 'cpp
+   :override t
+   :feature 'operator
+   `([,@c++-ts-mode--operators] @font-lock-builtin-face)
+   :language 'cpp
+   :override t
+   :feature 'string
+   `((string_literal) @font-lock-string-face
+     ((string_literal)) @contextual
+     (system_lib_string) @font-lock-string-face
+     (escape_sequence) @font-lock-string-face)
+   :language 'cpp
+   :override t
+   :feature 'literal
+   `((number_literal) @font-lock-constant-face
+     (char_literal) @font-lock-constant-face)
+   :language 'cpp
+   :override t
+   :feature 'type
+   '((primitive_type) @font-lock-type-face
+     (type_qualifier) @font-lock-type-face
+
+     (qualified_identifier
+      scope: (namespace_identifier) @font-lock-type-face)
+
+     (operator_cast type: (type_identifier) @font-lock-type-face))
+   :language 'cpp
+   :override t
+   :feature 'definition
+   `((declaration
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (declaration
+      type: (type_identifier) @font-lock-type-face)
+
+     (field_declaration
+      declarator: (field_identifier) @font-lock-variable-name-face)
+
+     (parameter_declaration
+      type: (type_identifier) @font-lock-type-face)
+
+     (function_definition
+      type: (type_identifier) @font-lock-function-name-face)
+
+     (function_declarator
+      declarator: (identifier) @font-lock-function-name-face)
+
+     (array_declarator
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (init_declarator
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (struct_specifier
+      name: (type_identifier) @font-lock-type-face)
+
+     (sized_type_specifier) @font-lock-type-face
+
+     (enum_specifier
+      name: (type_identifier) @font-lock-type-face)
+
+     (enumerator
+      name: (identifier) @font-lock-variable-name-face)
+
+     (parameter_declaration
+      type: (_) @font-lock-type-face
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (pointer_declarator
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (pointer_declarator
+      declarator: (field_identifier) @font-lock-variable-name-face))
+   :language 'cpp
+   :override t
+   :feature 'expression
+   '((assignment_expression
+      left: (identifier) @font-lock-variable-name-face)
+
+     (call_expression
+      function: (identifier) @font-lock-function-name-face)
+
+     (field_expression
+      field: (field_identifier) @font-lock-variable-name-face)
+
+     (field_expression
+      argument: (identifier) @font-lock-variable-name-face
+      field: (field_identifier) @font-lock-variable-name-face)
+
+     (pointer_expression
+      argument: (identifier) @font-lock-variable-name-face))
+   :language 'cpp
+   :override t
+   :feature 'statement
+   '((expression_statement (identifier) @font-lock-variable-name-face)
+     (labeled_statement
+      label: (statement_identifier) @font-lock-type-face))
+   :language 'cpp
+   :override t
+   :feature 'error
+   '((ERROR) @font-lock-warning-face))
+  "Tree-sitter font-lock settings.")
+
+(defun c++-ts-mode--imenu-1 (node)
+  "Helper for `c++-ts-mode--imenu'.
+Find string representation for NODE and set marker, then recurse
+the subtrees."
+  (let* ((ts-node (car node))
+         (subtrees (mapcan #'c++-ts-mode--imenu-1 (cdr node)))
+         (name (when ts-node
+                 (or (treesit-node-text
+                      (or (treesit-node-child-by-field-name
+                           ts-node "declarator")
+                          (treesit-node-child-by-field-name
+                           ts-node "name"))
+                      t)
+                     "Unnamed node")))
+         (marker (when ts-node
+                   (set-marker (make-marker)
+                               (treesit-node-start ts-node)))))
+    ;; A struct_specifier could be inside a parameter list or another
+    ;; struct definition.  In those cases we don't include it.
+    (cond
+     ((string-match-p
+       (rx (or "parameter" "field") "_declaration")
+       (or (treesit-node-type (treesit-node-parent ts-node))
+           ""))
+      nil)
+     ((null ts-node) subtrees)
+     (subtrees
+      `((,name ,(cons name marker) ,@subtrees)))
+     (t
+      `((,name . ,marker))))))
+
+(defun c++-ts-mode--imenu ()
+  "Return Imenu alist for the current buffer."
+  (let* ((node (treesit-buffer-root-node))
+         (tree (treesit-induce-sparse-tree
+                node (rx (or "function_definition"
+                             "struct_specifier")))))
+    (c++-ts-mode--imenu-1 tree)))
+
+;;;###autoload
+(define-derived-mode c++-ts-mode prog-mode "C++"
+  "Major mode for editing C++, powered by Tree Sitter."
+  :group 'cpp
+  :syntax-table c++-ts-mode--syntax-table
+
+  (unless (treesit-ready-p nil 'cpp)
+    (error "Tree Sitter for C++ isn't available"))
+
+  (treesit-parser-create 'cpp)
+
+  ;; Comments.
+  (setq-local comment-start "// ")
+  (setq-local comment-start-skip "\\(?://+\\|/\\*+\\)\\s *")
+  (setq-local comment-end "")
+
+  ;; Indent.
+  (when (eq c++-ts-mode-indent-style 'linux)
+    (setq-local indent-tabs-mode t))
+  (setq-local treesit-simple-indent-rules
+              (c++-ts-mode--set-indent-style))
+
+  ;; Navigation.
+  (setq-local treesit-defun-type-regexp
+              (rx (or "specifier"
+                      "definition")))
+
+  ;; Electric
+  (setq-local electric-indent-chars
+	      (append "{}():;," electric-indent-chars))
+
+  ;; Font-lock.
+  (setq-local treesit-font-lock-settings c++-ts-mode--font-lock-settings)
+  (setq-local treesit-font-lock-feature-list
+              '((comment preprocessor operator constant string literal keyword)
+                (type definition expression statement)
+                (error)))
+
+  ;; Imenu.
+  (setq-local imenu-create-index-function #'c++-ts-mode--imenu)
+  (setq-local which-func-functions nil) ;; Piggyback on imenu
+  (treesit-major-mode-setup))
+
+(provide 'c++-ts-mode)
+
+;;; c++-ts-mode.el ends here
diff --git a/lisp/progmodes/c-ts-mode.el b/lisp/progmodes/c-ts-mode.el
new file mode 100644
index 0000000000..d99d863d60
--- /dev/null
+++ b/lisp/progmodes/c-ts-mode.el
@@ -0,0 +1,414 @@
+;;; c-ts-mode.el --- tree sitter support for C  -*- lexical-binding: t; -*-
+
+;; Copyright (C) 2022 Free Software Foundation, Inc.
+
+;; Author     : Theodor Thornhill <theo@thornhill.no>
+;; Maintainer : Theodor Thornhill <theo@thornhill.no>
+;; Created    : November 2022
+;; Keywords   : c languages tree-sitter
+
+;; This file is part of GNU Emacs.
+
+;; This program is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; This program is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+
+;;; Commentary:
+;;
+
+;;; Code:
+
+(require 'treesit)
+(require 'rx)
+
+(defcustom c-ts-mode-indent-offset 2
+  "Number of spaces for each indentation step in `c-ts-mode'."
+  :type 'integer
+  :safe 'integerp
+  :group 'c)
+
+(defcustom c-ts-mode-indent-style 'gnu
+  "Style used for indentation.
+
+The selected style can be one of GNU, K&R, LINUX or BSD.  If
+none of the supplied styles suffices, a function can be set
+instead.  This function is expected to return a list that
+follows the form of `treesit-simple-indent-rules'."
+  :type '(choice (const :tag "GNU" gnu)
+                 (const :tag "K&R" k&r)
+                 (const :tag "Linux" linux)
+                 (const :tag "BSD" bsd)
+                 (function :tag "A function for user customized style" ignore))
+  :group 'c)
+
+(defvar c-ts-mode--syntax-table
+  (let ((table (make-syntax-table)))
+    ;; Taken from the cc-langs version
+    (modify-syntax-entry ?_  "_"     table)
+    (modify-syntax-entry ?\\ "\\"    table)
+    (modify-syntax-entry ?+  "."     table)
+    (modify-syntax-entry ?-  "."     table)
+    (modify-syntax-entry ?=  "."     table)
+    (modify-syntax-entry ?%  "."     table)
+    (modify-syntax-entry ?<  "."     table)
+    (modify-syntax-entry ?>  "."     table)
+    (modify-syntax-entry ?&  "."     table)
+    (modify-syntax-entry ?|  "."     table)
+    (modify-syntax-entry ?\' "\""    table)
+    (modify-syntax-entry ?\240 "."   table)
+    (modify-syntax-entry ?/  ". 124b" table)
+    (modify-syntax-entry ?*  ". 23"   table)
+    table)
+  "Syntax table for `c-ts-mode'.")
+
+(defvar c-ts-mode--indent-styles
+  (let ((common
+         `(((parent-is "translation_unit") parent-bol 0)
+           ((node-is ")") parent 1)
+           ((node-is "]") parent-bol 0)
+           ((node-is "else") parent-bol 0)
+           ((node-is "case") parent-bol 0)
+           ((node-is "comment") no-indent)
+           ((parent-is "comment") no-indent)
+           ((node-is "labeled_statement") parent-bol 0)
+           ((parent-is "labeled_statement") parent-bol c-ts-mode-indent-offset)
+           ((match "preproc_ifdef" "compound_statement") point-min 0)
+           ((match "#endif" "preproc_ifdef") point-min 0)
+           ((match "preproc_if" "compound_statement") point-min 0)
+           ((match "#endif" "preproc_if") point-min 0)
+           ((match "preproc_function_def" "compound_statement") point-min 0)
+           ((match "preproc_call" "compound_statement") point-min 0)
+           ((parent-is "function_definition") parent-bol 0)
+           ((parent-is "conditional_expression") first-sibling 0)
+           ((parent-is "assignment_expression") parent-bol c-ts-mode-indent-offset)
+           ((parent-is "comma_expression") first-sibling 0)
+           ((parent-is "init_declarator") parent-bol c-ts-mode-indent-offset)
+           ((parent-is "parenthesized_expression") first-sibling 1)
+           ((parent-is "argument_list") first-sibling 1)
+           ((parent-is "parameter_list") first-sibling 1)
+           ((parent-is "binary_expression") parent 0)
+           ((query "(for_statement initializer: (_) @indent)") parent-bol 5)
+           ((query "(for_statement condition: (_) @indent)") parent-bol 5)
+           ((query "(for_statement update: (_) @indent)") parent-bol 5)
+           ((query "(call_expression arguments: (_) @indent)") parent c-ts-mode-indent-offset)
+           ((parent-is "call_expression") parent 0))))
+    `((gnu
+       ,@common
+       ((node-is "}") parent-bol 0)
+       ((parent-is "enumerator_list") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "field_declaration_list") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "initializer_list") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "compound_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "if_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "for_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "while_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "switch_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "case_statement") parent-bol c-ts-mode-indent-offset)
+       ((match "while" "do_statement") parent 0)
+       ((parent-is "do_statement") parent-bol c-ts-mode-indent-offset)
+       (no-node parent-bol c-ts-mode-indent-offset))
+      (k&r
+       ,@common
+       ((node-is "}") grand-parent 0)
+       ((parent-is "enumerator_list") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "field_declaration_list") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "initializer_list") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "compound_statement") grand-parent c-ts-mode-indent-offset)
+       ((parent-is "if_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "for_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "while_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "switch_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "case_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "do_statement") parent-bol c-ts-mode-indent-offset)
+       (no-node parent-bol c-ts-mode-indent-offset))
+      (linux
+       ,@common
+       ((node-is "}") grand-parent 0)
+       ((parent-is "enumerator_list") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "field_declaration_list") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "initializer_list") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "compound_statement") grand-parent c-ts-mode-indent-offset)
+       ((parent-is "if_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "for_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "while_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "switch_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "case_statement") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "do_statement") parent-bol c-ts-mode-indent-offset)
+       (no-node parent-bol c-ts-mode-indent-offset))
+      (bsd
+       ,@common
+       ((node-is "}") parent-bol 0)
+       ((parent-is "enumerator_list") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "field_declaration_list") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "initializer_list") parent-bol c-ts-mode-indent-offset)
+       ((parent-is "compound_statement") parent 0)
+       ((parent-is "if_statement") parent-bol 0)
+       ((parent-is "for_statement") parent-bol 0)
+       ((parent-is "while_statement") parent-bol 0)
+       ((parent-is "switch_statement") parent-bol 0)
+       ((parent-is "case_statement") parent-bol 0)
+       ((parent-is "do_statement") parent-bol 0)
+       (no-node parent-bol c-ts-mode-indent-offset))))
+  "Indent rules supported by `c-ts-mode'.")
+
+(defun c-ts-mode--set-indent-style ()
+  "Helper function to set indentation style."
+  (let ((style
+         (if (functionp c-ts-mode-indent-style)
+             (funcall c-ts-mode-indent-style)
+           (pcase c-ts-mode-indent-style
+             ('gnu   (alist-get 'gnu c-ts-mode--indent-styles))
+             ('k&r   (alist-get 'k&r c-ts-mode--indent-styles))
+             ('bsd   (alist-get 'bsd c-ts-mode--indent-styles))
+             ('linux (alist-get 'linux c-ts-mode--indent-styles))))))
+    `((c ,@style))))
+
+(defvar c-ts-mode--keywords
+  '("const" "default" "enum" "extern" "inline" "static"
+    "struct" "typedef" "union" "volatile" "goto" "register"
+    "sizeof" "return"
+    "while" "for" "do" "continue" "break"
+    "if" "else" "case" "switch")
+  "C keywords for tree-sitter font-locking.")
+
+(defvar c-ts-mode--preproc-keywords
+  '("#define" "#if" "#ifdef" "#ifndef" "#else" "#elif" "#endif"
+    "#include")
+  "C preprocessor keywords for tree-sitter font-locking.")
+
+(defvar c-ts-mode--operators
+  '("=" "-" "*" "/" "+" "%" "~" "|" "&" "^" "<<" ">>" "->"
+    "." "<" "<=" ">=" ">" "==" "!=" "!" "&&" "||" "-="
+    "+=" "*=" "/=" "%=" "|=" "&=" "^=" ">>=" "<<=" "--" "++")
+  "C operators for tree-sitter font-locking.")
+
+(defvar c-ts-mode--font-lock-settings
+  (treesit-font-lock-rules
+   :language 'c
+   :override t
+   :feature 'comment
+   `((comment) @font-lock-comment-face
+     (comment) @contextual)
+   :language 'c
+   :override t
+   :feature 'preprocessor
+   `((preproc_directive) @font-lock-preprocessor-face
+
+     (preproc_def
+      name: (identifier) @font-lock-variable-name-face)
+
+     (preproc_ifdef
+      name: (identifier) @font-lock-variable-name-face)
+
+     (preproc_function_def
+      name: (identifier) @font-lock-function-name-face)
+
+     (preproc_params
+      (identifier) @font-lock-variable-name-face)
+
+     (preproc_defined) @font-lock-preprocessor-face
+     (preproc_defined (identifier) @font-lock-variable-name-face)
+     [,@c-ts-mode--preproc-keywords] @font-lock-preprocessor-face)
+   :language 'c
+   :override t
+   :feature 'constant
+   `((true) @font-lock-constant-face
+     (false) @font-lock-constant-face
+     (null) @font-lock-constant-face)
+   :language 'c
+   :override t
+   :feature 'keyword
+   `([,@c-ts-mode--keywords] @font-lock-keyword-face)
+   :language 'c
+   :override t
+   :feature 'operator
+   `([,@c-ts-mode--operators] @font-lock-builtin-face)
+   :language 'c
+   :override t
+   :feature 'string
+   `((string_literal) @font-lock-string-face
+     ((string_literal)) @contextual
+     (system_lib_string) @font-lock-string-face
+     (escape_sequence) @font-lock-string-face)
+   :language 'c
+   :override t
+   :feature 'literal
+   `((number_literal) @font-lock-constant-face
+     (char_literal) @font-lock-constant-face)
+   :language 'c
+   :override t
+   :feature 'type
+   '((primitive_type) @font-lock-type-face)
+   :language 'c
+   :override t
+   :feature 'definition
+   `((declaration
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (declaration
+      type: (type_identifier) @font-lock-type-face)
+
+     (field_declaration
+      declarator: (field_identifier) @font-lock-variable-name-face)
+
+     (field_declaration
+      type: (type_identifier) @font-lock-type-face)
+
+     (parameter_declaration
+      type: (type_identifier) @font-lock-type-face)
+
+     (function_definition
+      type: (type_identifier) @font-lock-type-face)
+
+     (function_declarator
+      declarator: (identifier) @font-lock-function-name-face)
+
+     (array_declarator
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (init_declarator
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (struct_specifier
+      name: (type_identifier) @font-lock-type-face)
+
+     (sized_type_specifier) @font-lock-type-face
+
+     (enum_specifier
+      name: (type_identifier) @font-lock-type-face)
+
+     (enumerator
+      name: (identifier) @font-lock-variable-name-face)
+
+     (parameter_declaration
+      type: (_) @font-lock-type-face
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (pointer_declarator
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (pointer_declarator
+      declarator: (field_identifier) @font-lock-variable-name-face))
+   :language 'c
+   :override t
+   :feature 'expression
+   '((assignment_expression
+      left: (identifier) @font-lock-variable-name-face)
+
+     (call_expression
+      function: (identifier) @font-lock-function-name-face)
+
+     (field_expression
+      field: (field_identifier) @font-lock-variable-name-face)
+
+     (field_expression
+      argument: (identifier) @font-lock-variable-name-face
+      field: (field_identifier) @font-lock-variable-name-face)
+
+     (pointer_expression
+      argument: (identifier) @font-lock-variable-name-face))
+   :language 'c
+   :override t
+   :feature 'statement
+   '((expression_statement (identifier) @font-lock-variable-name-face))
+   :language 'c
+   :override t
+   :feature 'error
+   '((ERROR) @font-lock-warning-face))
+  "Tree-sitter font-lock settings.")
+
+(defun c-ts-mode--imenu-1 (node)
+  "Helper for `c-ts-mode--imenu'.
+Find string representation for NODE and set marker, then recurse
+the subtrees."
+  (let* ((ts-node (car node))
+         (subtrees (mapcan #'c-ts-mode--imenu-1 (cdr node)))
+         (name (when ts-node
+                 (or (treesit-node-text
+                      (or (treesit-node-child-by-field-name
+                           ts-node "declarator")
+                          (treesit-node-child-by-field-name
+                           ts-node "name"))
+                      t)
+                     "Unnamed node")))
+         (marker (when ts-node
+                   (set-marker (make-marker)
+                               (treesit-node-start ts-node)))))
+    ;; A struct_specifier could be inside a parameter list or another
+    ;; struct definition.  In those cases we don't include it.
+    (cond
+     ((string-match-p
+       (rx (or "parameter" "field") "_declaration")
+       (or (treesit-node-type (treesit-node-parent ts-node))
+           ""))
+      nil)
+     ((null ts-node) subtrees)
+     (subtrees
+      `((,name ,(cons name marker) ,@subtrees)))
+     (t
+      `((,name . ,marker))))))
+
+(defun c-ts-mode--imenu ()
+  "Return Imenu alist for the current buffer."
+  (let* ((node (treesit-buffer-root-node))
+         (tree (treesit-induce-sparse-tree
+                node (rx (or "function_definition"
+                             "struct_specifier")))))
+    (c-ts-mode--imenu-1 tree)))
+
+;;;###autoload
+(define-derived-mode c-ts-mode prog-mode "C"
+  "Major mode for editing C, powered by Tree Sitter."
+  :group 'c
+  :syntax-table c-ts-mode--syntax-table
+
+  (unless (treesit-ready-p nil 'c)
+    (error "Tree Sitter for C isn't available"))
+
+  (treesit-parser-create 'c)
+
+  ;; Comments.
+  (setq-local comment-start "// ")
+  (setq-local comment-start-skip "\\(?://+\\|/\\*+\\)\\s *")
+  (setq-local comment-end "")
+
+  ;; Indent.
+  (when (eq c-ts-mode-indent-style 'linux)
+    (setq-local indent-tabs-mode t))
+  (setq-local treesit-simple-indent-rules
+              (c-ts-mode--set-indent-style))
+
+  ;; Navigation.
+  (setq-local treesit-defun-type-regexp
+              (rx (or "specifier"
+                      "definition")))
+
+  ;; Electric
+  (setq-local electric-indent-chars
+	      (append "{}():;," electric-indent-chars))
+
+  ;; Font-lock.
+  (setq-local treesit-font-lock-settings c-ts-mode--font-lock-settings)
+  (setq-local treesit-font-lock-feature-list
+              '((comment preprocessor operator constant string literal keyword)
+                (type definition expression statement)
+                (error)))
+
+  ;; Imenu.
+  (setq-local imenu-create-index-function #'c-ts-mode--imenu)
+  (setq-local which-func-functions nil) ;; Piggyback on imenu
+  (treesit-major-mode-setup))
+
+(provide 'c-ts-mode)
+
+;;; c-ts-mode.el ends here
diff --git a/lisp/progmodes/css-ts-mode.el b/lisp/progmodes/css-ts-mode.el
new file mode 100644
index 0000000000..c1a8d4e94d
--- /dev/null
+++ b/lisp/progmodes/css-ts-mode.el
@@ -0,0 +1,131 @@
+;;; css-ts-mode.el --- tree sitter support for CSS  -*- lexical-binding: t; -*-
+
+;; Copyright (C) 2022 Free Software Foundation, Inc.
+
+;; Author     : Theodor Thornhill <theo@thornhill.no>
+;; Maintainer : Theodor Thornhill <theo@thornhill.no>
+;; Created    : November 2022
+;; Keywords   : css languages tree-sitter
+
+;; This file is part of GNU Emacs.
+
+;; This program is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; This program is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+
+;;; Commentary:
+;;
+
+;;; Code:
+
+(require 'treesit)
+(require 'rx)
+(require 'css-mode)
+
+(defcustom css-ts-mode-indent-offset 2
+  "Number of spaces for each indentation step in `css-ts-mode'."
+  :type 'integer
+  :safe 'integerp
+  :group 'css)
+
+(defvar css-ts-mode--indent-rules
+  `((css
+     ((node-is "}") parent-bol 0)
+     ((node-is ")") parent-bol 0)
+     ((node-is "]") parent-bol 0)
+
+     ((parent-is "block") parent-bol css-ts-mode-indent-offset)
+     ((parent-is "arguments") parent-bol css-ts-mode-indent-offset)
+     ((parent-is "declaration") parent-bol css-ts-mode-indent-offset))))
+
+(defvar css-ts-mode--settings
+  (treesit-font-lock-rules
+   :language 'css
+   :feature 'basic
+   :override t
+   `((unit) @font-lock-constant-face
+     (integer_value) @font-lock-builtin-face
+     (float_value) @font-lock-builtin-face
+     (plain_value) @font-lock-variable-name-face
+     (comment) @font-lock-comment-face
+     (class_selector) @css-selector
+     (child_selector) @css-selector
+     (id_selector) @css-selector
+     (tag_name) @css-selector
+     (property_name) @css-property
+     (class_name) @css-selector
+     (function_name) @font-lock-function-name-face)))
+
+(defun css-ts-mode--imenu-1 (node)
+  "Helper for `css-ts-mode--imenu'.
+Find string representation for NODE and set marker, then recurse
+the subtrees."
+  (let* ((ts-node (car node))
+         (subtrees (mapcan #'css-ts-mode--imenu-1 (cdr node)))
+         (name (when ts-node
+                 (if (equal (treesit-node-type ts-node) "tag_name")
+                     (treesit-node-text ts-node)
+                   (treesit-node-text (treesit-node-child ts-node 1) t))))
+         (marker (when ts-node
+                   (set-marker (make-marker)
+                               (treesit-node-start ts-node)))))
+    (cond
+     ((null ts-node) subtrees)
+     (subtrees
+      `((,name ,(cons name marker) ,@subtrees)))
+     (t
+      `((,name . ,marker))))))
+
+(defun css-ts-mode--imenu ()
+  "Return Imenu alist for the current buffer."
+  (let* ((node (treesit-buffer-root-node))
+         (tree (treesit-induce-sparse-tree
+                node (rx (or "class_selector"
+                             "id_selector"
+                             "tag_name")))))
+    (css-ts-mode--imenu-1 tree)))
+
+(define-derived-mode css-ts-mode prog-mode "CSS"
+  "Major mode for editing CSS."
+  :group 'css
+  :syntax-table css-mode-syntax-table
+
+  (unless (treesit-ready-p nil 'css)
+    (error "Tree Sitter for CSS isn't available"))
+
+  (treesit-parser-create 'css)
+
+  ;; Comments
+  (setq-local comment-start "/*")
+  (setq-local comment-start-skip "/\\*+[ \t]*")
+  (setq-local comment-end "*/")
+  (setq-local comment-end-skip "[ \t]*\\*+/")
+
+  ;; Indent.
+  (setq-local treesit-simple-indent-rules css-ts-mode--indent-rules)
+
+  ;; Navigation.
+  (setq-local treesit-defun-type-regexp "rule_set")
+  ;; Font-lock.
+  (setq-local treesit-font-lock-settings css-ts-mode--settings)
+  (setq-local treesit-font-lock-feature-list '((basic) () ()))
+
+  ;; Imenu.
+  (setq-local imenu-create-index-function #'css-ts-mode--imenu)
+  (setq-local which-func-functions nil) ;; Piggyback on imenu
+
+  (treesit-major-mode-setup))
+
+(provide 'css-ts-mode)
+
+;;; css-ts-mode.el ends here
diff --git a/lisp/progmodes/java-ts-mode.el b/lisp/progmodes/java-ts-mode.el
new file mode 100644
index 0000000000..734a8be471
--- /dev/null
+++ b/lisp/progmodes/java-ts-mode.el
@@ -0,0 +1,289 @@
+;;; java-ts-mode.el --- tree sitter support for Java  -*- lexical-binding: t; -*-
+
+;; Copyright (C) 2022 Free Software Foundation, Inc.
+
+;; Author     : Theodor Thornhill <theo@thornhill.no>
+;; Maintainer : Theodor Thornhill <theo@thornhill.no>
+;; Created    : November 2022
+;; Keywords   : java languages tree-sitter
+
+;; This file is part of GNU Emacs.
+
+;; This program is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; This program is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+
+;;; Commentary:
+;;
+
+;;; Code:
+
+(require 'treesit)
+(require 'rx)
+
+(defcustom java-ts-mode-indent-offset 4
+  "Number of spaces for each indentation step in `java-ts-mode'."
+  :type 'integer
+  :safe 'integerp
+  :group 'java)
+
+(defvar java-ts-mode--syntax-table
+  (let ((table (make-syntax-table)))
+    ;; Taken from the cc-langs version
+    (modify-syntax-entry ?_  "_"     table)
+    (modify-syntax-entry ?\\ "\\"    table)
+    (modify-syntax-entry ?+  "."     table)
+    (modify-syntax-entry ?-  "."     table)
+    (modify-syntax-entry ?=  "."     table)
+    (modify-syntax-entry ?%  "."     table)
+    (modify-syntax-entry ?<  "."     table)
+    (modify-syntax-entry ?>  "."     table)
+    (modify-syntax-entry ?&  "."     table)
+    (modify-syntax-entry ?|  "."     table)
+    (modify-syntax-entry ?\' "\""    table)
+    (modify-syntax-entry ?\240 "."   table)
+    table)
+  "Syntax table for `java-ts-mode'.")
+
+(defvar java-ts-mode--indent-rules
+  `((java
+     ((parent-is "program") parent-bol 0)
+     ((node-is "}") (and parent parent-bol) 0)
+     ((node-is ")") parent-bol 0)
+     ((node-is "]") parent-bol 0)
+     ((parent-is "class_body") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "interface_body") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "constructor_body") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "enum_body") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "switch_block") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "record_declaration_body") parent-bol java-ts-mode-indent-offset)
+     ((query "(method_declaration (block _ @indent))") parent-bol java-ts-mode-indent-offset)
+     ((query "(method_declaration (block (_) @indent))") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "variable_declarator") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "method_invocation") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "switch_rule") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "ternary_expression") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "element_value_array_initializer") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "function_definition") parent-bol 0)
+     ((parent-is "conditional_expression") first-sibling 0)
+     ((parent-is "assignment_expression") parent-bol 2)
+     ((parent-is "binary_expression") parent 0)
+     ((parent-is "parenthesized_expression") first-sibling 1)
+     ((parent-is "argument_list") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "annotation_argument_list") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "modifiers") parent-bol 0)
+     ((parent-is "formal_parameters") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "formal_parameter") parent-bol 0)
+     ((parent-is "init_declarator") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "if_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "for_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "while_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "switch_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "case_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "labeled_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "do_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "block") (and parent parent-bol) java-ts-mode-indent-offset)))
+  "Tree-sitter indent rules.")
+
+(defvar java-ts-mode--keywords
+  '("abstract" "assert" "break" "case" "catch"
+    "class" "continue" "default" "do" "else"
+    "enum" "exports" "extends" "final" "finally"
+    "for" "if" "implements" "import" "instanceof"
+    "interface" "module" "native" "new" "non-sealed"
+    "open" "opens" "package" "private" "protected"
+    "provides" "public" "requires" "return" "sealed"
+    "static" "strictfp" "switch" "synchronized"
+    "throw" "throws" "to" "transient" "transitive"
+    "try" "uses" "volatile" "while" "with" "record")
+  "Java keywords for tree-sitter font-locking.")
+
+(defvar java-ts-mode--operators
+  '("@" "+" ":" "++" "-" "--" "&" "&&" "|" "||"
+    "!=" "==" "*" "/" "%" "<" "<=" ">" ">=" "="
+    "-=" "+=" "*=" "/=" "%=" "->" "^" "^=" "&="
+    "|=" "~" ">>" ">>>" "<<" "::" "?")
+  "Java operators for tree-sitter font-locking.")
+
+(defvar java-ts-mode--font-lock-settings
+  (treesit-font-lock-rules
+   :language 'java
+   :override t
+   :feature 'basic
+   '((identifier) @font-lock-variable-name-face)
+   :language 'java
+   :override t
+   :feature 'comment
+   `((line_comment) @font-lock-comment-face
+     (block_comment) @font-lock-comment-face)
+   :language 'java
+   :override t
+   :feature 'constant
+   `(((identifier) @font-lock-constant-face
+      (:match "^[A-Z_][A-Z_0-9]*$" @font-lock-constant-face))
+     (true) @font-lock-constant-face
+     (false) @font-lock-constant-face)
+   :language 'java
+   :override t
+   :feature 'keyword
+   `([,@java-ts-mode--keywords] @font-lock-keyword-face
+     (labeled_statement
+      (identifier) @font-lock-keyword-face))
+   :language 'java
+   :override t
+   :feature 'operator
+   `([,@java-ts-mode--operators] @font-lock-builtin-face)
+   :language 'java
+   :override t
+   :feature 'annotation
+   `((annotation
+      name: (identifier) @font-lock-constant-face)
+
+     (marker_annotation
+      name: (identifier) @font-lock-constant-face))
+   :language 'java
+   :override t
+   :feature 'string
+   `((string_literal) @font-lock-string-face)
+   :language 'java
+   :override t
+   :feature 'literal
+   `((null_literal) @font-lock-constant-face
+     (decimal_floating_point_literal)  @font-lock-constant-face
+     (hex_floating_point_literal) @font-lock-constant-face)
+   :language 'java
+   :override t
+   :feature 'type
+   '((interface_declaration
+      name: (identifier) @font-lock-type-face)
+
+     (class_declaration
+      name: (identifier) @font-lock-type-face)
+
+     (record_declaration
+      name: (identifier) @font-lock-type-face)
+
+     (enum_declaration
+      name: (identifier) @font-lock-type-face)
+
+     (constructor_declaration
+      name: (identifier) @font-lock-type-face)
+
+     (field_access
+      object: (identifier) @font-lock-type-face)
+
+     (method_reference (identifier) @font-lock-type-face)
+
+     ((scoped_identifier name: (identifier) @font-lock-type-face)
+      (:match "^[A-Z]" @font-lock-type-face))
+
+     (type_identifier) @font-lock-type-face
+
+     [(boolean_type)
+      (integral_type)
+      (floating_point_type)
+      (void_type)] @font-lock-type-face)
+   :language 'java
+   :override t
+   :feature 'definition
+   `((method_declaration
+      name: (identifier) @font-lock-function-name-face)
+
+     (formal_parameter
+      name: (identifier) @font-lock-variable-name-face)
+
+     (catch_formal_parameter
+      name: (identifier) @font-lock-variable-name-face))
+   :language 'java
+   :override t
+   :feature 'expression
+   '((method_invocation
+      object: (identifier) @font-lock-variable-name-face)
+
+     (method_invocation
+      name: (identifier) @font-lock-function-name-face)
+
+     (argument_list (identifier) @font-lock-variable-name-face)))
+  "Tree-sitter font-lock settings.")
+
+(defun java-ts-mode--imenu-1 (node)
+  "Helper for `java-ts-mode--imenu'.
+Find string representation for NODE and set marker, then recurse
+the subtrees."
+  (let* ((ts-node (car node))
+         (subtrees (mapcan #'java-ts-mode--imenu-1 (cdr node)))
+         (name (when ts-node
+                 (or (treesit-node-text
+                      (or (treesit-node-child-by-field-name
+                           ts-node "name"))
+                      t)
+                     "Unnamed node")))
+         (marker (when ts-node
+                   (set-marker (make-marker)
+                               (treesit-node-start ts-node)))))
+    (cond
+     ((null ts-node) subtrees)
+     (subtrees
+      `((,name ,(cons name marker) ,@subtrees)))
+     (t
+      `((,name . ,marker))))))
+
+(defun java-ts-mode--imenu ()
+  "Return Imenu alist for the current buffer."
+  (let* ((node (treesit-buffer-root-node))
+         (tree (treesit-induce-sparse-tree
+                node (rx (or "class_declaration"
+                             "interface_declaration"
+                             "enum_declaration"
+                             "record_declaration"
+                             "method_declaration")))))
+    (java-ts-mode--imenu-1 tree)))
+
+;;;###autoload
+(define-derived-mode java-ts-mode prog-mode "Java"
+  "Major mode for editing Java, powered by Tree Sitter."
+  :group 'java
+  :syntax-table java-ts-mode--syntax-table
+
+  (unless (treesit-ready-p nil 'java)
+    (error "Tree-sitter for Java isn't available"))
+
+  (treesit-parser-create 'java)
+
+  ;; Comments.
+  (setq-local comment-start "// ")
+  (setq-local comment-start-skip "\\(?://+\\|/\\*+\\)\\s *")
+  (setq-local comment-end "")
+
+  ;; Indent.
+  (setq-local treesit-simple-indent-rules java-ts-mode--indent-rules)
+
+  ;; Navigation.
+  (setq-local treesit-defun-type-regexp
+              (rx (or "declaration")))
+
+  ;; Font-lock.
+  (setq-local treesit-font-lock-settings java-ts-mode--font-lock-settings)
+  (setq-local treesit-font-lock-feature-list
+              '((basic comment keyword constant string operator)
+                (type definition expression literal annotation)
+                ()))
+
+  ;; Imenu.
+  (setq-local imenu-create-index-function #'java-ts-mode--imenu)
+  (setq-local which-func-functions nil) ;; Piggyback on imenu
+  (treesit-major-mode-setup))
+
+(provide 'java-ts-mode)
+
+;;; java-ts-mode.el ends here
diff --git a/lisp/progmodes/json-ts-mode.el b/lisp/progmodes/json-ts-mode.el
new file mode 100644
index 0000000000..13eb5b78a9
--- /dev/null
+++ b/lisp/progmodes/json-ts-mode.el
@@ -0,0 +1,150 @@
+;;; json-ts-mode.el --- tree sitter support for JSON  -*- lexical-binding: t; -*-
+
+;; Copyright (C) 2022 Free Software Foundation, Inc.
+
+;; Author     : Theodor Thornhill <theo@thornhill.no>
+;; Maintainer : Theodor Thornhill <theo@thornhill.no>
+;; Created    : November 2022
+;; Keywords   : json languages tree-sitter
+
+;; This file is part of GNU Emacs.
+
+;; This program is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; This program is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+
+;;; Commentary:
+;;
+
+;;; Code:
+
+(require 'treesit)
+(require 'rx)
+
+(defcustom json-ts-mode-indent-offset 2
+  "Number of spaces for each indentation step in `json-ts-mode'."
+  :type 'integer
+  :safe 'integerp
+  :group 'json)
+
+(defvar json-ts-mode--syntax-table
+  (let ((table (make-syntax-table)))
+    ;; Taken from the cc-langs version
+    (modify-syntax-entry ?_  "_"     table)
+    (modify-syntax-entry ?$ "_"      table)
+    (modify-syntax-entry ?\\ "\\"    table)
+    (modify-syntax-entry ?+  "."     table)
+    (modify-syntax-entry ?-  "."     table)
+    (modify-syntax-entry ?=  "."     table)
+    (modify-syntax-entry ?%  "."     table)
+    (modify-syntax-entry ?<  "."     table)
+    (modify-syntax-entry ?>  "."     table)
+    (modify-syntax-entry ?&  "."     table)
+    (modify-syntax-entry ?|  "."     table)
+    (modify-syntax-entry ?` "\""     table)
+    (modify-syntax-entry ?\240 "."   table)
+    table)
+  "Syntax table for `json-ts-mode'.")
+
+
+(defvar json-ts--indent-rules
+  `((json
+     ((node-is "}") parent-bol 0)
+     ((node-is ")") parent-bol 0)
+     ((node-is "]") parent-bol 0)
+     ((parent-is "object") parent-bol json-ts-mode-indent-offset))))
+
+(defvar json-ts-mode--font-lock-settings
+  (treesit-font-lock-rules
+   :language 'json
+   :feature 'minimal
+   :override t
+   `((pair
+      key: (_) @font-lock-string-face)
+
+     (string) @font-lock-string-face
+
+     (number) @font-lock-constant-face
+
+     [(null) (true) (false)] @font-lock-constant-face
+
+     (escape_sequence) @font-lock-constant-face
+
+     (comment) @font-lock-comment-face))
+  "Font-lock settings for JSON.")
+
+(defun json-ts-mode--imenu-1 (node)
+  "Helper for `json-ts-mode--imenu'.
+Find string representation for NODE and set marker, then recurse
+the subtrees."
+  (let* ((ts-node (car node))
+         (subtrees (mapcan #'json-ts-mode--imenu-1 (cdr node)))
+         (name (when ts-node
+                 (treesit-node-text
+                  (treesit-node-child-by-field-name
+                   ts-node "key")
+                  t)))
+         (marker (when ts-node
+                   (set-marker (make-marker)
+                               (treesit-node-start ts-node)))))
+    (cond
+     ((null ts-node) subtrees)
+     (subtrees
+      `((,name ,(cons name marker) ,@subtrees)))
+     (t
+      `((,name . ,marker))))))
+
+(defun json-ts-mode--imenu ()
+  "Return Imenu alist for the current buffer."
+  (let* ((node (treesit-buffer-root-node))
+         (tree (treesit-induce-sparse-tree
+                node "pair")))
+    (json-ts-mode--imenu-1 tree)))
+
+;;;###autoload
+(define-derived-mode json-ts-mode prog-mode "JSON"
+  "Major mode for editing JSON, powered by Tree Sitter."
+  :group 'json
+  :syntax-table json-ts-mode--syntax-table
+
+  (unless (treesit-ready-p nil 'json)
+    (error "Tree Sitter for JSON isn't available"))
+
+  (treesit-parser-create 'json)
+
+  ;; Comments.
+  (setq-local comment-start "// ")
+  (setq-local comment-start-skip "\\(?://+\\|/\\*+\\)\\s *")
+  (setq-local comment-end "")
+
+  ;; Indent.
+  (setq-local treesit-simple-indent-rules json-ts--indent-rules)
+
+  ;; Navigation.
+  (setq-local treesit-defun-type-regexp
+              (rx (or "pair" "object")))
+
+  ;; Font-lock.
+  (setq-local treesit-font-lock-settings json-ts-mode--font-lock-settings)
+  (setq-local treesit-font-lock-feature-list
+              '((minimal) () ()))
+
+  ;; Imenu.
+  (setq-local imenu-create-index-function #'json-ts-mode--imenu)
+  (setq-local which-func-functions nil) ;; Piggyback on imenu
+
+  (treesit-major-mode-setup))
+
+(provide 'json-ts-mode)
+
+;;; json-ts-mode.el ends here
-- 
2.34.1



* Re: Tree sitter support for C-like languages
  2022-11-12  5:43     ` Yuan Fu
                         ` (2 preceding siblings ...)
  2022-11-12  7:22       ` Theodor Thornhill via Emacs development discussions.
@ 2022-11-12  8:05       ` Eli Zaretskii
  2022-11-12  8:43         ` Theodor Thornhill
  3 siblings, 1 reply; 83+ messages in thread
From: Eli Zaretskii @ 2022-11-12  8:05 UTC (permalink / raw)
  To: Yuan Fu; +Cc: theo, emacs-devel

> From: Yuan Fu <casouri@gmail.com>
> Date: Fri, 11 Nov 2022 21:43:21 -0800
> Cc: emacs-devel@gnu.org
> 
> I noticed that with the default indent style, Emacs indents like this:
> 
> int main () {
>               for (int j = 0; j < 5; j++)
>                 a[j] = 3;
>               int i = 1;
>               swap(i, a[i+1]);
>               Point p = {0, 1};
> }
> 
> int main ()
> {
>   for (int j = 0; j < 5; j++)
>     a[j] = 3;
>   int i = 1;
>   swap(i, a[i+1]);
>   Point p = {0, 1};
> }
> 
> Is this expected?

With which indentation style?




* Re: Tree sitter support for C-like languages
  2022-11-12  6:16       ` Theodor Thornhill
  2022-11-12  6:25         ` Yuan Fu
@ 2022-11-12  8:08         ` Eli Zaretskii
  2022-11-12  8:42           ` Theodor Thornhill
  1 sibling, 1 reply; 83+ messages in thread
From: Eli Zaretskii @ 2022-11-12  8:08 UTC (permalink / raw)
  To: Theodor Thornhill; +Cc: casouri, emacs-devel

> Date: Sat, 12 Nov 2022 07:16:40 +0100
> From: Theodor Thornhill <theo@thornhill.no>
> CC: emacs-devel@gnu.org
> 
> The first one isn't gnu style, right? But it's an easy fix :)

We should support more than a single style.  Preferably all of them,
but IMO at the very least 'gnu', 'k&r', 'bsd', and 'linux'.




* Re: Tree sitter support for C-like languages
  2022-11-12  8:08         ` Eli Zaretskii
@ 2022-11-12  8:42           ` Theodor Thornhill
  0 siblings, 0 replies; 83+ messages in thread
From: Theodor Thornhill @ 2022-11-12  8:42 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: casouri, emacs-devel



On 12 November 2022 09:08:02 CET, Eli Zaretskii <eliz@gnu.org> wrote:
>> Date: Sat, 12 Nov 2022 07:16:40 +0100
>> From: Theodor Thornhill <theo@thornhill.no>
>> CC: emacs-devel@gnu.org
>> 
>> The first one isn't gnu style, right? But it's an easy fix :)
>
>We should support more than a single style.  Preferably all of them,
>but IMO at the very least 'gnu', 'k&r', 'bsd', and 'linux'.

Yeah they should all be there. And it's easy to expand either by a function or with more styles before the merge.

I'm no style expert but it seems like most of the style differences are covered by what's included now.
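
As a rough sketch of the "by a function" route, assuming the
`c++-ts-mode-indent-style' option from the patch in this thread: the
function name below is made up, and the rules are only an illustration,
not a complete style.  The function returns (MATCHER ANCHOR OFFSET)
rules without the language symbol, since the mode's helper adds that
itself.

```elisp
;; Hypothetical user style.  `my-c++-ts-indent-style' is an invented
;; name; the node names are borrowed from the rules in the patch and
;; the offsets are arbitrary.
(defun my-c++-ts-indent-style ()
  "Return a small list of indent rules for `c++-ts-mode'."
  '(((parent-is "translation_unit") parent-bol 0)
    ((node-is "}") parent-bol 0)
    ((parent-is "compound_statement") parent-bol 4)
    (no-node parent-bol 4)))

;; Set this before the mode is enabled: the mode reads the option
;; when it sets up `treesit-simple-indent-rules'.
(setq c++-ts-mode-indent-style #'my-c++-ts-indent-style)
```

Buffers already in the mode would need the mode re-enabled to pick up
a changed style.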

Theo




* Re: Tree sitter support for C-like languages
  2022-11-12  8:05       ` Eli Zaretskii
@ 2022-11-12  8:43         ` Theodor Thornhill
  0 siblings, 0 replies; 83+ messages in thread
From: Theodor Thornhill @ 2022-11-12  8:43 UTC (permalink / raw)
  To: Eli Zaretskii, Yuan Fu; +Cc: emacs-devel



On 12 November 2022 09:05:25 CET, Eli Zaretskii <eliz@gnu.org> wrote:
>> From: Yuan Fu <casouri@gmail.com>
>> Date: Fri, 11 Nov 2022 21:43:21 -0800
>> Cc: emacs-devel@gnu.org
>> 
>> I noticed that with the default indent style, Emacs indents like this:
>> 
>> int main () {
>>               for (int j = 0; j < 5; j++)
>>                 a[j] = 3;
>>               int i = 1;
>>               swap(i, a[i+1]);
>>               Point p = {0, 1};
>> }
>> 
>> int main ()
>> {
>>   for (int j = 0; j < 5; j++)
>>     a[j] = 3;
>>   int i = 1;
>>   swap(i, a[i+1]);
>>   Point p = {0, 1};
>> }
>> 
>> Is this expected?
>
>With which indentation style?

Gnu, but I made this work in the gnu style as well.
Theo




* Re: Tree sitter support for C-like languages
  2022-11-11  6:01   ` Theodor Thornhill via Emacs development discussions.
  2022-11-12  5:43     ` Yuan Fu
@ 2022-11-12 12:21     ` Eli Zaretskii
  2022-11-12 19:38       ` Theodor Thornhill via Emacs development discussions.
  1 sibling, 1 reply; 83+ messages in thread
From: Eli Zaretskii @ 2022-11-12 12:21 UTC (permalink / raw)
  To: Theodor Thornhill; +Cc: casouri, emacs-devel

> Cc: emacs-devel@gnu.org
> Date: Fri, 11 Nov 2022 07:01:02 +0100
> From:  Theodor Thornhill via "Emacs development discussions." <emacs-devel@gnu.org>
> 
> See below patch.  The struct issue was an ordering problem when I
> extracted the styles.  Should be good now

Thanks, a few minor comments below.

> diff --git a/etc/NEWS b/etc/NEWS
> index 9ed78bc6b3..3ce9810ece 100644
> --- a/etc/NEWS
> +++ b/etc/NEWS
> @@ -2786,8 +2786,32 @@ when visiting JSON files.
>  ** New mode 'ts-mode'.
>  Support is added for TypeScript, based on the new integration with
>  Tree-Sitter. There's support for font-locking, indentation and
> -navigation.  Tree-Sitter is required for this mode to function, but if
> -it is not available, we will default to use 'js-mode'.
> +navigation.  Tree-Sitter is required for this mode to function.
> +
> +** New mode 'c-ts-mode'.
> +Support is added for C, based on the new integration with
> +Tree-Sitter. There's support for font-locking, indentation and
> +navigation.  Tree-Sitter is required for this mode to function.

First, please use 2 spaces between sentences.

Next, I suggest a slightly different boilerplate for these items:

  ** New mode FOO-ts-mode.
  A major mode based on the Tree-sitter library for editing programs
  in the FOO language.  It includes support for font-locking,
  indentation, and navigation.

> +(defvar c++-ts-mode--preproc-keywords
> +  '("#define" "#if" "#ifdef" "#ifndef" "#else" "#elif" "#endif" "#include")
> +  "C++ keywords for tree-sitter font-locking.")

This list seems to be incomplete: what about the following?

  #error
  #warning
  #include_next
  #line
  #pragma

Doesn't Tree-sitter's C++ grammar support those?

> +(defvar c-ts-mode--keywords
> +  '("const" "default" "enum" "extern" "inline" "static"
> +    "struct" "typedef" "union" "volatile" "goto" "register"
> +    "sizeof" "return"
> +    "while" "for" "do" "continue" "break"
> +    "if" "else" "case" "switch")
> +  "C keywords for tree-sitter font-locking.")

Are these all the keywords in C?  Or is it just what Tree-sitter's C
parser supports for now?

> +(defvar c-ts-mode--preproc-keywords
> +  '("#define" "#if" "#ifdef" "#ifndef" "#else" "#elif" "#endif"
> +    "#include")
> +  "C keywords for tree-sitter font-locking.")

Same comment here about additional preprocessor directives.




* Re: Tree sitter support for C-like languages
  2022-11-12 12:21     ` Eli Zaretskii
@ 2022-11-12 19:38       ` Theodor Thornhill via Emacs development discussions.
  2022-11-12 19:46         ` Stefan Kangas
  2022-11-12 19:51         ` Eli Zaretskii
  0 siblings, 2 replies; 83+ messages in thread
From: Theodor Thornhill via Emacs development discussions. @ 2022-11-12 19:38 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: casouri, emacs-devel, Stefan Monnier

[-- Attachment #1: Type: text/plain, Size: 3004 bytes --]

Eli Zaretskii <eliz@gnu.org> writes:

>> Cc: emacs-devel@gnu.org
>> Date: Fri, 11 Nov 2022 07:01:02 +0100
>> From:  Theodor Thornhill via "Emacs development discussions." <emacs-devel@gnu.org>
>> 
>> See below patch.  The struct issue was an ordering problem when I
>> extracted the styles.  Should be good now
>
> Thanks, a few minor comments below.
>

Thanks!

>> diff --git a/etc/NEWS b/etc/NEWS
>> index 9ed78bc6b3..3ce9810ece 100644
>> --- a/etc/NEWS
>> +++ b/etc/NEWS
>> @@ -2786,8 +2786,32 @@ when visiting JSON files.
>>  ** New mode 'ts-mode'.
>>  Support is added for TypeScript, based on the new integration with
>>  Tree-Sitter. There's support for font-locking, indentation and
>> -navigation.  Tree-Sitter is required for this mode to function, but if
>> -it is not available, we will default to use 'js-mode'.
>> +navigation.  Tree-Sitter is required for this mode to function.
>> +
>> +** New mode 'c-ts-mode'.
>> +Support is added for C, based on the new integration with
>> +Tree-Sitter. There's support for font-locking, indentation and
>> +navigation.  Tree-Sitter is required for this mode to function.
>
> First, please use 2 spaces between sentences.
>

Yeah - my slipup.

> Next, I suggest a slightly different boilerplate for these items:
>
>   ** New mode FOO-ts-mode.
>   A major mode based on the Tree-sitter library for editing programs
>   in the FOO language.  It includes support for font-locking,
>   indentation, and navigation.
>

Sure - adopted.

>> +(defvar c++-ts-mode--preproc-keywords
>> +  '("#define" "#if" "#ifdef" "#ifndef" "#else" "#elif" "#endif" "#include")
>> +  "C++ keywords for tree-sitter font-locking.")
>
> This list seems to be incomplete: what about the following?
>
>   #error
>   #warning
>   #include_next
>   #line
>   #pragma
>
> Doesn't Tree-sitter's C++ grammar support those?
>

Doesn't seem like it does, no.


>> +(defvar c-ts-mode--keywords
>> +  '("const" "default" "enum" "extern" "inline" "static"
>> +    "struct" "typedef" "union" "volatile" "goto" "register"
>> +    "sizeof" "return"
>> +    "while" "for" "do" "continue" "break"
>> +    "if" "else" "case" "switch")
>> +  "C keywords for tree-sitter font-locking.")
>
> Are these all the keywords in C?  Or is it just what Tree-sitter's C
> parser supports for now?
>

I added a few more.  Some were missing, that's true.

>> +(defvar c-ts-mode--preproc-keywords
>> +  '("#define" "#if" "#ifdef" "#ifndef" "#else" "#elif" "#endif"
>> +    "#include")
>> +  "C keywords for tree-sitter font-locking.")
>
> Same comment here about additional preprocessor directives.

Same answer here :)


See below patch.  Please note that I now have merged C and C++ into one
file, and both inherit from a new parent mode: 'c-ts-mode--base-mode'.
I simplified some of the indentation rules as well.
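
For anyone who hasn't opened the file yet, the shape is roughly the
following.  This is only an illustration of the inheritance with
made-up `my-' names, not the code from the patch:

```elisp
;; Hypothetical sketch: one parent mode holds the shared setup and
;; the two language modes derive from it.  All `my-' names are
;; invented for illustration.
(define-derived-mode my-c-base-mode prog-mode "C-base"
  "Parent mode holding the setup shared by the C and C++ modes."
  (setq-local comment-start "// ")
  (setq-local comment-end ""))

(define-derived-mode my-c-mode my-c-base-mode "C"
  "Child mode for C; language-specific setup would go here.")

(define-derived-mode my-c++-mode my-c-base-mode "C++"
  "Child mode for C++; language-specific setup would go here.")
```

Enabling either child runs the parent's body first, so the comment
variables and similar shared settings only have to be written once.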

Unless any of you have strong objections, I think now is a good time to
apply them to feature/tree-sitter, so that other people can more easily
give them a test run.

What do you think?

Theo


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Add-Tree-Sitter-modes-for-C-like-languages.patch --]
[-- Type: text/x-diff, Size: 56960 bytes --]

From 0d6e144b9a90cfd72f5cdc8480d5bbe70c281fc5 Mon Sep 17 00:00:00 2001
From: Theodor Thornhill <theo@thornhill.no>
Date: Thu, 10 Nov 2022 17:15:49 +0100
Subject: [PATCH] Add Tree Sitter modes for C-like languages

* etc/NEWS: Mention the new modes
* lisp/progmodes/c-ts-mode.el: New major mode with Tree Sitter support
for C and C++.
* lisp/progmodes/java-ts-mode.el: New major mode with Tree Sitter support.
* lisp/progmodes/json-ts-mode.el: New major mode with Tree Sitter support.
* lisp/progmodes/css-ts-mode.el: New major mode with Tree Sitter support.
---
 etc/NEWS                       |  32 ++-
 lisp/progmodes/c++-ts-mode.el  | 424 +++++++++++++++++++++++++++++++
 lisp/progmodes/c-ts-mode.el    | 445 +++++++++++++++++++++++++++++++++
 lisp/progmodes/css-ts-mode.el  | 131 ++++++++++
 lisp/progmodes/java-ts-mode.el | 289 +++++++++++++++++++++
 lisp/progmodes/json-ts-mode.el | 150 +++++++++++
 6 files changed, 1467 insertions(+), 4 deletions(-)
 create mode 100644 lisp/progmodes/c++-ts-mode.el
 create mode 100644 lisp/progmodes/c-ts-mode.el
 create mode 100644 lisp/progmodes/css-ts-mode.el
 create mode 100644 lisp/progmodes/java-ts-mode.el
 create mode 100644 lisp/progmodes/json-ts-mode.el

diff --git a/etc/NEWS b/etc/NEWS
index 9ed78bc6b3..5e655a953d 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -2784,10 +2784,34 @@ when visiting JSON files.
 
 \f
 ** New mode 'ts-mode'.
-Support is added for TypeScript, based on the new integration with
-Tree-Sitter. There's support for font-locking, indentation and
-navigation.  Tree-Sitter is required for this mode to function, but if
-it is not available, we will default to use 'js-mode'.
+A major mode based on the Tree-sitter library for editing programs
+in the TypeScript language.  It includes support for font-locking,
+indentation, and navigation.
+
+** New mode 'c-ts-mode'.
+A major mode based on the Tree-sitter library for editing programs
+in the C language.  It includes support for font-locking,
+indentation, Imenu, which-func, and navigation.
+
+** New mode 'c++-ts-mode'.
+A major mode based on the Tree-sitter library for editing programs
+in the C++ language.  It includes support for font-locking,
+indentation, Imenu, which-func, and navigation.
+
+** New mode 'java-ts-mode'.
+A major mode based on the Tree-sitter library for editing programs
+in the Java language.  It includes support for font-locking,
+indentation, Imenu, which-func, and navigation.
+
+** New mode 'css-ts-mode'.
+A major mode based on the Tree-sitter library for editing programs
+in the CSS language.  It includes support for font-locking,
+indentation, Imenu, which-func, and navigation.
+
+** New mode 'json-ts-mode'.
+A major mode based on the Tree-sitter library for editing programs
+in the JSON language.  It includes support for font-locking,
+indentation, Imenu, which-func, and navigation.
 
 \f
 * Incompatible Lisp Changes in Emacs 29.1
diff --git a/lisp/progmodes/c++-ts-mode.el b/lisp/progmodes/c++-ts-mode.el
new file mode 100644
index 0000000000..7e3bb7d141
--- /dev/null
+++ b/lisp/progmodes/c++-ts-mode.el
@@ -0,0 +1,424 @@
+;;; c++-ts-mode.el --- tree sitter support for C++  -*- lexical-binding: t; -*-
+
+;; Copyright (C) 2022 Free Software Foundation, Inc.
+
+;; Author     : Theodor Thornhill <theo@thornhill.no>
+;; Maintainer : Theodor Thornhill <theo@thornhill.no>
+;; Created    : November 2022
+;; Keywords   : c++ languages tree-sitter
+
+;; This file is part of GNU Emacs.
+
+;; This program is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; This program is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+
+;;; Commentary:
+;;
+
+;;; Code:
+
+(require 'treesit)
+(require 'rx)
+
+(defcustom c++-ts-mode-indent-offset 2
+  "Number of spaces for each indentation step in `c++-ts-mode'."
+  :type 'integer
+  :safe 'integerp
+  :group 'cpp)
+
+(defcustom c++-ts-mode-indent-style 'gnu
+  "Style used for indentation.
+
+The selected style could be one of GNU, K&R, LINUX or BSD.  If
+one of the supplied styles doesn't suffice a function could be
+set instead.  This function is expected to return a list that
+follows the form of `treesit-simple-indent-rules'."
+  :type '(choice (const :tag "Gnu" gnu)
+                 (const :tag "K&R" k&r)
+                 (const :tag "Linux" linux)
+                 (const :tag "BSD" bsd)
+                 (function :tag "A function for user customized style" ignore))
+  :group 'cpp)
+
+(defvar c++-ts-mode--syntax-table
+  (let ((table (make-syntax-table)))
+    ;; Taken from the cc-langs version
+    (modify-syntax-entry ?_  "_"     table)
+    (modify-syntax-entry ?\\ "\\"    table)
+    (modify-syntax-entry ?+  "."     table)
+    (modify-syntax-entry ?-  "."     table)
+    (modify-syntax-entry ?=  "."     table)
+    (modify-syntax-entry ?%  "."     table)
+    (modify-syntax-entry ?<  "."     table)
+    (modify-syntax-entry ?>  "."     table)
+    (modify-syntax-entry ?&  "."     table)
+    (modify-syntax-entry ?|  "."     table)
+    (modify-syntax-entry ?\' "\""    table)
+    (modify-syntax-entry ?\240 "."   table)
+    (modify-syntax-entry ?/  ". 124b" table)
+    (modify-syntax-entry ?*  ". 23"   table)
+    table)
+  "Syntax table for `c++-ts-mode'.")
+
+(defvar c++-ts-mode--indent-styles
+  (let ((common
+         `(((parent-is "translation_unit") parent-bol 0)
+           ((node-is ")") parent 1)
+           ((node-is "]") parent-bol 0)
+           ((node-is "else") parent-bol 0)
+           ((node-is "case") parent-bol 0)
+           ((node-is "comment") no-indent)
+           ((parent-is "comment") no-indent)
+           ((node-is "labeled_statement") parent-bol 0)
+           ((parent-is "labeled_statement") parent-bol c++-ts-mode-indent-offset)
+           ((match "preproc_ifdef" "compound_statement") point-min 0)
+           ((match "#endif" "preproc_ifdef") point-min 0)
+           ((match "preproc_if" "compound_statement") point-min 0)
+           ((match "#endif" "preproc_if") point-min 0)
+           ((match "preproc_function_def" "compound_statement") point-min 0)
+           ((match "preproc_call" "compound_statement") point-min 0)
+           ((parent-is "field_declaration_list") parent-bol c++-ts-mode-indent-offset)
+           ((node-is "field_initializer_list") parent-bol ,(* c++-ts-mode-indent-offset 2))
+           ((parent-is "function_definition") parent-bol 0)
+           ((parent-is "conditional_expression") first-sibling 0)
+           ((parent-is "assignment_expression") parent-bol c++-ts-mode-indent-offset)
+           ((parent-is "comma_expression") first-sibling 0)
+           ((parent-is "init_declarator") parent-bol c++-ts-mode-indent-offset)
+           ((parent-is "parenthesized_expression") first-sibling 1)
+           ((parent-is "argument_list") first-sibling 1)
+           ((parent-is "parameter_list") first-sibling 1)
+           ((parent-is "binary_expression") parent 0)
+           ((query "(for_statement initializer: (_) @indent)") parent-bol 5)
+           ((query "(for_statement condition: (_) @indent)") parent-bol 5)
+           ((query "(for_statement update: (_) @indent)") parent-bol 5)
+           ((query "(call_expression arguments: (_) @indent)") parent c++-ts-mode-indent-offset))))
+    `((gnu
+       ,@common
+       ((node-is "}") parent-bol 0)
+       ((parent-is "enumerator_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "field_declaration_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "initializer_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "compound_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "if_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "for_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "while_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "switch_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "case_statement") parent-bol c++-ts-mode-indent-offset)
+       ((match "while" "do_statement") parent 0)
+       ((parent-is "do_statement") parent-bol c++-ts-mode-indent-offset)
+       (no-node parent-bol c++-ts-mode-indent-offset))
+      (k&r
+       ,@common
+       ((node-is "}") grand-parent 0)
+       ((parent-is "enumerator_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "field_declaration_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "initializer_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "compound_statement") grand-parent c++-ts-mode-indent-offset)
+       ((parent-is "if_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "for_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "while_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "switch_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "case_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "do_statement") parent-bol c++-ts-mode-indent-offset)
+       (no-node parent-bol c++-ts-mode-indent-offset))
+      (linux
+       ,@common
+       ((node-is "}") grand-parent 0)
+       ((parent-is "enumerator_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "field_declaration_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "initializer_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "compound_statement") grand-parent c++-ts-mode-indent-offset)
+       ((parent-is "if_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "for_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "while_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "switch_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "case_statement") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "do_statement") parent-bol c++-ts-mode-indent-offset)
+       (no-node parent-bol c++-ts-mode-indent-offset))
+      (bsd
+       ,@common
+       ((node-is "}") parent-bol 0)
+       ((parent-is "enumerator_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "field_declaration_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "initializer_list") parent-bol c++-ts-mode-indent-offset)
+       ((parent-is "compound_statement") parent 0)
+       ((parent-is "if_statement") parent-bol 0)
+       ((parent-is "for_statement") parent-bol 0)
+       ((parent-is "while_statement") parent-bol 0)
+       ((parent-is "switch_statement") parent-bol 0)
+       ((parent-is "case_statement") parent-bol 0)
+       ((parent-is "do_statement") parent-bol 0)
+       (no-node parent-bol c++-ts-mode-indent-offset))))
+  "Indent rules supported by `c++-ts-mode'.")
+
+(defun c++-ts-mode--set-indent-style ()
+  "Helper function to set indentation style."
+  (let ((style
+         (if (functionp c++-ts-mode-indent-style)
+             (funcall c++-ts-mode-indent-style)
+           (pcase c++-ts-mode-indent-style
+             ('gnu   (alist-get 'gnu c++-ts-mode--indent-styles))
+             ('k&r   (alist-get 'k&r c++-ts-mode--indent-styles))
+             ('bsd   (alist-get 'bsd c++-ts-mode--indent-styles))
+             ('linux (alist-get 'linux c++-ts-mode--indent-styles))))))
+    `((cpp ,@style))))
+
+(defvar c++-ts-mode--keywords
+  '("and" "and_eq" "bitand" "bitor" "catch" "class" "co_await"
+    "co_return" "co_yield" "compl" "concept" "consteval" "constexpr"
+    "constinit" "decltype" "delete" "else" "explicit" "final" "for"
+    "friend" "if" "mutable" "namespace" "new" "noexcept"
+    "not" "not_eq" "operator" "or" "or_eq" "override" "private"
+    "protected" "public" "requires" "return" "static" "struct"
+    "template" "throw" "try" "typename" "using" "virtual" "xor" "xor_eq"
+    "switch" "case")
+  "C++ keywords for tree-sitter font-locking.")
+
+(defvar c++-ts-mode--preproc-keywords
+  '("#define" "#if" "#ifdef" "#ifndef" "#else" "#elif" "#endif" "#include")
+  "C++ preprocessor keywords for tree-sitter font-locking.")
+
+(defvar c++-ts-mode--operators
+  '("=" "-" "*" "/" "+" "%" "~" "|" "&" "^" "<<" ">>" "->"
+    "." "<" "<=" ">=" ">" "==" "!=" "!" "&&" "||" "-="
+    "+=" "*=" "/=" "%=" "|=" "&=" "^=" ">>=" "<<=" "--" "++")
+  "C++ operators for tree-sitter font-locking.")
+
+(defvar c++-ts-mode--font-lock-settings
+  (treesit-font-lock-rules
+   :language 'cpp
+   :override t
+   :feature 'comment
+   `((comment) @font-lock-comment-face
+     (comment) @contextual)
+   :language 'cpp
+   :override t
+   :feature 'preprocessor
+   `((preproc_directive) @font-lock-preprocessor-face
+
+     (preproc_def
+      name: (identifier) @font-lock-variable-name-face)
+
+     (preproc_ifdef
+      name: (identifier) @font-lock-variable-name-face)
+
+     (preproc_function_def
+      name: (identifier) @font-lock-function-name-face)
+
+     (preproc_params
+      (identifier) @font-lock-variable-name-face)
+
+     (preproc_defined) @font-lock-preprocessor-face
+     (preproc_defined (identifier) @font-lock-variable-name-face)
+     [,@c++-ts-mode--preproc-keywords] @font-lock-preprocessor-face)
+   :language 'cpp
+   :override t
+   :feature 'constant
+   `((true) @font-lock-constant-face
+     (false) @font-lock-constant-face
+     (null) @font-lock-constant-face
+     (this) @font-lock-constant-face)
+   :language 'cpp
+   :override t
+   :feature 'keyword
+   `([,@c++-ts-mode--keywords] @font-lock-keyword-face
+     (auto) @font-lock-keyword-face)
+   :language 'cpp
+   :override t
+   :feature 'operator
+   `([,@c++-ts-mode--operators] @font-lock-builtin-face)
+   :language 'cpp
+   :override t
+   :feature 'string
+   `((string_literal) @font-lock-string-face
+     ((string_literal)) @contextual
+     (system_lib_string) @font-lock-string-face
+     (escape_sequence) @font-lock-string-face)
+   :language 'cpp
+   :override t
+   :feature 'literal
+   `((number_literal) @font-lock-constant-face
+     (char_literal) @font-lock-constant-face)
+   :language 'cpp
+   :override t
+   :feature 'type
+   '((primitive_type) @font-lock-type-face
+     (type_qualifier) @font-lock-type-face
+
+     (qualified_identifier
+      scope: (namespace_identifier) @font-lock-type-face)
+
+     (operator_cast type: (type_identifier) @font-lock-type-face))
+   :language 'cpp
+   :override t
+   :feature 'definition
+   `((declaration
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (declaration
+      type: (type_identifier) @font-lock-type-face)
+
+     (field_declaration
+      declarator: (field_identifier) @font-lock-variable-name-face)
+
+     (parameter_declaration
+      type: (type_identifier) @font-lock-type-face)
+
+     (function_definition
+      type: (type_identifier) @font-lock-type-face)
+
+     (function_declarator
+      declarator: (identifier) @font-lock-function-name-face)
+
+     (array_declarator
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (init_declarator
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (struct_specifier
+      name: (type_identifier) @font-lock-type-face)
+
+     (sized_type_specifier) @font-lock-type-face
+
+     (enum_specifier
+      name: (type_identifier) @font-lock-type-face)
+
+     (enumerator
+      name: (identifier) @font-lock-variable-name-face)
+
+     (parameter_declaration
+      type: (_) @font-lock-type-face
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (pointer_declarator
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (pointer_declarator
+      declarator: (field_identifier) @font-lock-variable-name-face))
+   :language 'cpp
+   :override t
+   :feature 'expression
+   '((assignment_expression
+      left: (identifier) @font-lock-variable-name-face)
+
+     (call_expression
+      function: (identifier) @font-lock-function-name-face)
+
+     (field_expression
+      field: (field_identifier) @font-lock-variable-name-face)
+
+     (field_expression
+      argument: (identifier) @font-lock-variable-name-face
+      field: (field_identifier) @font-lock-variable-name-face)
+
+     (pointer_expression
+      argument: (identifier) @font-lock-variable-name-face))
+   :language 'cpp
+   :override t
+   :feature 'statement
+   '((expression_statement (identifier) @font-lock-variable-name-face)
+     (labeled_statement
+      label: (statement_identifier) @font-lock-type-face))
+   :language 'cpp
+   :override t
+   :feature 'error
+   '((ERROR) @font-lock-warning-face))
+  "Tree-sitter font-lock settings.")
+
+(defun c++-ts-mode--imenu-1 (node)
+  "Helper for `c++-ts-mode--imenu'.
+Find string representation for NODE and set marker, then recurse
+the subtrees."
+  (let* ((ts-node (car node))
+         (subtrees (mapcan #'c++-ts-mode--imenu-1 (cdr node)))
+         (name (when ts-node
+                 (or (treesit-node-text
+                      (or (treesit-node-child-by-field-name
+                           ts-node "declarator")
+                          (treesit-node-child-by-field-name
+                           ts-node "name"))
+                      t)
+                     "Unnamed node")))
+         (marker (when ts-node
+                   (set-marker (make-marker)
+                               (treesit-node-start ts-node)))))
+    ;; A struct_specifier could be inside a parameter list or another
+    ;; struct definition.  In those cases we don't include it.
+    (cond
+     ((string-match-p
+       (rx (or "parameter" "field") "_declaration")
+       (or (treesit-node-type (treesit-node-parent ts-node))
+           ""))
+      nil)
+     ((null ts-node) subtrees)
+     (subtrees
+      `((,name ,(cons name marker) ,@subtrees)))
+     (t
+      `((,name . ,marker))))))
+
+(defun c++-ts-mode--imenu ()
+  "Return Imenu alist for the current buffer."
+  (let* ((node (treesit-buffer-root-node))
+         (tree (treesit-induce-sparse-tree
+                node (rx (or "function_definition"
+                             "struct_specifier")))))
+    (c++-ts-mode--imenu-1 tree)))
+
+;;;###autoload
+(define-derived-mode c++-ts-mode prog-mode "C++"
+  "Major mode for editing C++, powered by Tree Sitter."
+  :group 'cpp
+  :syntax-table c++-ts-mode--syntax-table
+
+  (unless (treesit-ready-p nil 'cpp)
+    (error "Tree Sitter for C++ isn't available"))
+
+  (treesit-parser-create 'cpp)
+
+  ;; Comments.
+  (setq-local comment-start "// ")
+  (setq-local comment-start-skip "\\(?://+\\|/\\*+\\)\\s *")
+  (setq-local comment-end "")
+
+  ;; Indent.
+  (when (eq c++-ts-mode-indent-style 'linux)
+    (setq-local indent-tabs-mode t))
+  (setq-local treesit-simple-indent-rules
+              (c++-ts-mode--set-indent-style))
+
+  ;; Navigation.
+  (setq-local treesit-defun-type-regexp
+              (rx (or "specifier"
+                      "definition")))
+
+  ;; Electric
+  (setq-local electric-indent-chars
+	      (append "{}():;," electric-indent-chars))
+
+  ;; Font-lock.
+  (setq-local treesit-font-lock-settings c++-ts-mode--font-lock-settings)
+  (setq-local treesit-font-lock-feature-list
+              '((comment preprocessor operator constant string literal keyword)
+                (type definition expression statement)
+                (error)))
+
+  ;; Imenu.
+  (setq-local imenu-create-index-function #'c++-ts-mode--imenu)
+  (setq-local which-func-functions nil) ;; Piggyback on imenu
+  (treesit-major-mode-setup))
+
+(provide 'c++-ts-mode)
+
+;;; c++-ts-mode.el ends here
diff --git a/lisp/progmodes/c-ts-mode.el b/lisp/progmodes/c-ts-mode.el
new file mode 100644
index 0000000000..ab3e1ad6d6
--- /dev/null
+++ b/lisp/progmodes/c-ts-mode.el
@@ -0,0 +1,445 @@
+;;; c-ts-mode.el --- tree sitter support for C  -*- lexical-binding: t; -*-
+
+;; Copyright (C) 2022 Free Software Foundation, Inc.
+
+;; Author     : Theodor Thornhill <theo@thornhill.no>
+;; Maintainer : Theodor Thornhill <theo@thornhill.no>
+;; Created    : November 2022
+;; Keywords   : c languages tree-sitter
+
+;; This file is part of GNU Emacs.
+
+;; This program is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; This program is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+
+;;; Commentary:
+;;
+
+;;; Code:
+
+(require 'treesit)
+(require 'rx)
+
+(defcustom c-ts-mode-indent-offset 2
+  "Number of spaces for each indentation step in `c-ts-mode'."
+  :type 'integer
+  :safe 'integerp
+  :group 'c)
+
+(defcustom c-ts-mode-indent-style 'gnu
+  "Style used for indentation.
+
+The selected style can be one of GNU, K&R, LINUX or BSD.  If
+none of the supplied styles suffices, a function can be set
+instead.  This function is expected to return a list that
+follows the form of `treesit-simple-indent-rules'."
+  :type '(choice (const :tag "Gnu" gnu)
+                 (const :tag "K&R" k&r)
+                 (const :tag "Linux" linux)
+                 (const :tag "BSD" bsd)
+                 (function :tag "A function for user customized style" ignore))
+  :group 'c)
+
+(defvar c-ts-mode--syntax-table
+  (let ((table (make-syntax-table)))
+    ;; Taken from the cc-langs version
+    (modify-syntax-entry ?_  "_"     table)
+    (modify-syntax-entry ?\\ "\\"    table)
+    (modify-syntax-entry ?+  "."     table)
+    (modify-syntax-entry ?-  "."     table)
+    (modify-syntax-entry ?=  "."     table)
+    (modify-syntax-entry ?%  "."     table)
+    (modify-syntax-entry ?<  "."     table)
+    (modify-syntax-entry ?>  "."     table)
+    (modify-syntax-entry ?&  "."     table)
+    (modify-syntax-entry ?|  "."     table)
+    (modify-syntax-entry ?\' "\""    table)
+    (modify-syntax-entry ?\240 "."   table)
+    (modify-syntax-entry ?/  ". 124b" table)
+    (modify-syntax-entry ?*  ". 23"   table)
+    table)
+  "Syntax table for `c-ts-mode'.")
+
+(defun c-ts-mode--indent-styles (mode)
+  "Indent rules supported by `c-ts-mode'.
+MODE is either `c' or `cpp'."
+  (let ((common
+         `(((parent-is "translation_unit") parent-bol 0)
+           ((node-is ")") parent 1)
+           ((node-is "]") parent-bol 0)
+           ((node-is "}") (and parent parent-bol) 0)
+           ((node-is "else") parent-bol 0)
+           ((node-is "case") parent-bol 0)
+           ((node-is "comment") no-indent)
+           ((parent-is "comment") no-indent)
+           ((node-is "labeled_statement") parent-bol 0)
+           ((parent-is "labeled_statement") parent-bol c-ts-mode-indent-offset)
+           ((match "preproc_ifdef" "compound_statement") point-min 0)
+           ((match "#endif" "preproc_ifdef") point-min 0)
+           ((match "preproc_if" "compound_statement") point-min 0)
+           ((match "#endif" "preproc_if") point-min 0)
+           ((match "preproc_function_def" "compound_statement") point-min 0)
+           ((match "preproc_call" "compound_statement") point-min 0)
+           ((parent-is "compound_statement") (and parent parent-bol) c-ts-mode-indent-offset)
+           ((parent-is "function_definition") parent-bol 0)
+           ((parent-is "conditional_expression") first-sibling 0)
+           ((parent-is "assignment_expression") parent-bol c-ts-mode-indent-offset)
+           ((parent-is "comma_expression") first-sibling 0)
+           ((parent-is "init_declarator") parent-bol c-ts-mode-indent-offset)
+           ((parent-is "parenthesized_expression") first-sibling 1)
+           ((parent-is "argument_list") first-sibling 1)
+           ((parent-is "parameter_list") first-sibling 1)
+           ((parent-is "binary_expression") parent 0)
+           ((query "(for_statement initializer: (_) @indent)") parent-bol 5)
+           ((query "(for_statement condition: (_) @indent)") parent-bol 5)
+           ((query "(for_statement update: (_) @indent)") parent-bol 5)
+           ((query "(call_expression arguments: (_) @indent)") parent c-ts-mode-indent-offset)
+           ((parent-is "call_expression") parent 0)
+           ((parent-is "enumerator_list") parent-bol c-ts-mode-indent-offset)
+           ((parent-is "field_declaration_list") parent-bol c-ts-mode-indent-offset)
+           ((parent-is "initializer_list") parent-bol c-ts-mode-indent-offset)
+           ((parent-is "if_statement") parent-bol c-ts-mode-indent-offset)
+           ((parent-is "for_statement") parent-bol c-ts-mode-indent-offset)
+           ((parent-is "while_statement") parent-bol c-ts-mode-indent-offset)
+           ((parent-is "switch_statement") parent-bol c-ts-mode-indent-offset)
+           ((parent-is "case_statement") parent-bol c-ts-mode-indent-offset)
+           ((parent-is "do_statement") parent-bol c-ts-mode-indent-offset)
+           ,@(when (eq mode 'cpp)
+               `(((node-is "field_initializer_list") parent-bol ,(* c-ts-mode-indent-offset 2))))
+           (no-node parent-bol c-ts-mode-indent-offset))))
+    `((gnu
+       ;; Prepend rules to set highest priority
+       ((match "while" "do_statement") parent 0)
+       ,@common)
+      (k&r ,@common)
+      (linux ,@common)
+      (bsd
+       ((parent-is "if_statement") parent-bol 0)
+       ((parent-is "for_statement") parent-bol 0)
+       ((parent-is "while_statement") parent-bol 0)
+       ((parent-is "switch_statement") parent-bol 0)
+       ((parent-is "case_statement") parent-bol 0)
+       ((parent-is "do_statement") parent-bol 0)
+       ,@common))))
+
+(defun c-ts-mode--set-indent-style (mode)
+  "Helper function to set indentation style.
+MODE is either `c' or `cpp'."
+  (let ((style
+         (if (functionp c-ts-mode-indent-style)
+             (funcall c-ts-mode-indent-style)
+           (pcase c-ts-mode-indent-style
+             ('gnu   (alist-get 'gnu (c-ts-mode--indent-styles mode)))
+             ('k&r   (alist-get 'k&r (c-ts-mode--indent-styles mode)))
+             ('bsd   (alist-get 'bsd (c-ts-mode--indent-styles mode)))
+             ('linux (alist-get 'linux (c-ts-mode--indent-styles mode)))))))
+    `((,mode ,@style))))
+
+(defvar c-ts-mode--preproc-keywords
+  '("#define" "#if" "#ifdef" "#ifndef"
+    "#else" "#elif" "#endif" "#include")
+  "C/C++ preprocessor keywords for tree-sitter font-locking.")
+
+(defun c-ts-mode--keywords (mode)
+  "C/C++ keywords for tree-sitter font-locking.
+MODE is either `c' or `cpp'."
+  (let ((c-keywords
+         '("break" "case" "const" "continue"
+           "default" "do" "else" "enum"
+           "extern" "for" "goto" "if"
+           "long" "register" "return" "short"
+           "signed" "sizeof" "static" "struct"
+           "switch" "typedef" "union" "unsigned"
+           "volatile" "while")))
+    (if (eq mode 'cpp)
+        (append c-keywords
+                '("and" "and_eq" "bitand" "bitor"
+                  "catch" "class" "co_await" "co_return"
+                  "co_yield" "compl" "concept" "consteval"
+                  "constexpr" "constinit" "decltype" "delete"
+                  "explicit" "final" "friend"
+                  "mutable" "namespace" "new" "noexcept"
+                  "not" "not_eq" "operator" "or"
+                  "or_eq" "override" "private" "protected"
+                  "public" "requires" "template" "throw"
+                  "try" "typename" "using" "virtual"
+                  "xor" "xor_eq"))
+      (append '("auto") c-keywords))))
+
+(defvar c-ts-mode--operators
+  '("=" "-" "*" "/" "+" "%" "~" "|" "&" "^" "<<" ">>" "->"
+    "." "<" "<=" ">=" ">" "==" "!=" "!" "&&" "||" "-="
+    "+=" "*=" "/=" "%=" "|=" "&=" "^=" ">>=" "<<=" "--" "++")
+  "C/C++ operators for tree-sitter font-locking.")
+
+(defun c-ts-mode--font-lock-settings (mode)
+  "Tree-sitter font-lock settings.
+MODE is either `c' or `cpp'."
+  (treesit-font-lock-rules
+   :language mode
+   :override t
+   :feature 'comment
+   `((comment) @font-lock-comment-face
+     (comment) @contextual)
+   :language mode
+   :override t
+   :feature 'preprocessor
+   `((preproc_directive) @font-lock-preprocessor-face
+
+     (preproc_def
+      name: (identifier) @font-lock-variable-name-face)
+
+     (preproc_ifdef
+      name: (identifier) @font-lock-variable-name-face)
+
+     (preproc_function_def
+      name: (identifier) @font-lock-function-name-face)
+
+     (preproc_params
+      (identifier) @font-lock-variable-name-face)
+
+     (preproc_defined) @font-lock-preprocessor-face
+     (preproc_defined (identifier) @font-lock-variable-name-face)
+     [,@c-ts-mode--preproc-keywords] @font-lock-preprocessor-face)
+   :language mode
+   :override t
+   :feature 'constant
+   `((true) @font-lock-constant-face
+     (false) @font-lock-constant-face
+     (null) @font-lock-constant-face
+     ,@(when (eq mode 'cpp)
+         '((this) @font-lock-constant-face)))
+   :language mode
+   :override t
+   :feature 'keyword
+   `([,@(c-ts-mode--keywords mode)] @font-lock-keyword-face
+     ,@(when (eq mode 'cpp)
+         '((auto) @font-lock-keyword-face)))
+   :language mode
+   :override t
+   :feature 'operator
+   `([,@c-ts-mode--operators] @font-lock-builtin-face)
+   :language mode
+   :override t
+   :feature 'string
+   `((string_literal) @font-lock-string-face
+     ((string_literal)) @contextual
+     (system_lib_string) @font-lock-string-face
+     (escape_sequence) @font-lock-string-face)
+   :language mode
+   :override t
+   :feature 'literal
+   `((number_literal) @font-lock-constant-face
+     (char_literal) @font-lock-constant-face)
+   :language mode
+   :override t
+   :feature 'type
+   `((primitive_type) @font-lock-type-face
+     ,@(when (eq mode 'cpp)
+         '((type_qualifier) @font-lock-type-face
+
+           (qualified_identifier
+            scope: (namespace_identifier) @font-lock-type-face)
+
+           (operator_cast) type: (type_identifier) @font-lock-type-face)))
+   :language mode
+   :override t
+   :feature 'definition
+   `((declaration
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (declaration
+      type: (type_identifier) @font-lock-type-face)
+
+     (field_declaration
+      declarator: (field_identifier) @font-lock-variable-name-face)
+
+     (field_declaration
+      type: (type_identifier) @font-lock-type-face)
+
+     (parameter_declaration
+      type: (type_identifier) @font-lock-type-face)
+
+     (function_definition
+      type: (type_identifier) @font-lock-type-face)
+
+     (function_declarator
+      declarator: (identifier) @font-lock-function-name-face)
+
+     (array_declarator
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (init_declarator
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (struct_specifier
+      name: (type_identifier) @font-lock-type-face)
+
+     (sized_type_specifier) @font-lock-type-face
+
+     (enum_specifier
+      name: (type_identifier) @font-lock-type-face)
+
+     (enumerator
+      name: (identifier) @font-lock-variable-name-face)
+
+     (parameter_declaration
+      type: (_) @font-lock-type-face
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (pointer_declarator
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (pointer_declarator
+      declarator: (field_identifier) @font-lock-variable-name-face))
+   :language mode
+   :override t
+   :feature 'expression
+   '((assignment_expression
+      left: (identifier) @font-lock-variable-name-face)
+
+     (call_expression
+      function: (identifier) @font-lock-function-name-face)
+
+     (field_expression
+      field: (field_identifier) @font-lock-variable-name-face)
+
+     (field_expression
+      argument: (identifier) @font-lock-variable-name-face
+      field: (field_identifier) @font-lock-variable-name-face)
+
+     (pointer_expression
+      argument: (identifier) @font-lock-variable-name-face))
+   :language mode
+   :override t
+   :feature 'statement
+   '((expression_statement (identifier) @font-lock-variable-name-face)
+     (labeled_statement
+      label: (statement_identifier) @font-lock-type-face))
+   :language mode
+   :override t
+   :feature 'error
+   '((ERROR) @font-lock-warning-face)))
+
+(defun c-ts-mode--imenu-1 (node)
+  "Helper for `c-ts-mode--imenu'.
+Find string representation for NODE and set marker, then recurse
+the subtrees."
+  (let* ((ts-node (car node))
+         (subtrees (mapcan #'c-ts-mode--imenu-1 (cdr node)))
+         (name (when ts-node
+                 (or (treesit-node-text
+                      (or (treesit-node-child-by-field-name
+                           ts-node "declarator")
+                          (treesit-node-child-by-field-name
+                           ts-node "name"))
+                      t)
+                     "Unnamed node")))
+         (marker (when ts-node
+                   (set-marker (make-marker)
+                               (treesit-node-start ts-node)))))
+    ;; A struct_specifier could be inside a parameter list or another
+    ;; struct definition.  In those cases we don't include it.
+    (cond
+     ((string-match-p
+       (rx (or "parameter" "field") "_declaration")
+       (or (treesit-node-type (treesit-node-parent ts-node))
+           ""))
+      nil)
+     ((null ts-node) subtrees)
+     (subtrees
+      `((,name ,(cons name marker) ,@subtrees)))
+     (t
+      `((,name . ,marker))))))
+
+(defun c-ts-mode--imenu ()
+  "Return Imenu alist for the current buffer."
+  (let* ((node (treesit-buffer-root-node))
+         (tree (treesit-induce-sparse-tree
+                node (rx (or "function_definition"
+                             "struct_specifier")))))
+    (c-ts-mode--imenu-1 tree)))
+
+;;;###autoload
+(define-derived-mode c-ts-mode--base-mode prog-mode "C"
+  "Major mode for editing C, powered by Tree Sitter."
+  :group 'c
+  :syntax-table c-ts-mode--syntax-table
+
+  ;; Comments.
+  (setq-local comment-start "// ")
+  (setq-local comment-start-skip "\\(?://+\\|/\\*+\\)\\s *")
+  (setq-local comment-end "")
+
+  ;; Navigation.
+  (setq-local treesit-defun-type-regexp
+              (rx (or "specifier"
+                      "definition")))
+
+  ;; Indent.
+  (when (eq c-ts-mode-indent-style 'linux)
+    (setq-local indent-tabs-mode t))
+
+  ;; Electric
+  (setq-local electric-indent-chars
+	      (append "{}():;," electric-indent-chars))
+
+  ;; Imenu.
+  (setq-local imenu-create-index-function #'c-ts-mode--imenu)
+  (setq-local which-func-functions nil)
+
+  (setq-local treesit-font-lock-feature-list
+              '((comment preprocessor operator constant string literal keyword)
+                (type definition expression statement)
+                (error))))
+
+;;;###autoload
+(define-derived-mode c-ts-mode c-ts-mode--base-mode "C"
+  "Major mode for editing C, powered by Tree Sitter."
+  :group 'c
+
+  (unless (treesit-ready-p nil 'c)
+    (error "Tree Sitter for C isn't available"))
+
+  (treesit-parser-create 'c)
+
+  (setq-local treesit-simple-indent-rules
+              (c-ts-mode--set-indent-style 'c))
+
+  ;; Font-lock.
+  (setq-local treesit-font-lock-settings (c-ts-mode--font-lock-settings 'c))
+
+  (treesit-major-mode-setup))
+
+;;;###autoload
+(define-derived-mode c++-ts-mode c-ts-mode--base-mode "C++"
+  "Major mode for editing C++, powered by Tree Sitter."
+  :group 'c++
+
+  (unless (treesit-ready-p nil 'cpp)
+    (error "Tree Sitter for C++ isn't available"))
+
+  (treesit-parser-create 'cpp)
+
+  (setq-local treesit-simple-indent-rules
+              (c-ts-mode--set-indent-style 'cpp))
+
+  ;; Font-lock.
+  (setq-local treesit-font-lock-settings (c-ts-mode--font-lock-settings 'cpp))
+
+  (treesit-major-mode-setup))
+
+(provide 'c-ts-mode)
+
+;;; c-ts-mode.el ends here
diff --git a/lisp/progmodes/css-ts-mode.el b/lisp/progmodes/css-ts-mode.el
new file mode 100644
index 0000000000..c1a8d4e94d
--- /dev/null
+++ b/lisp/progmodes/css-ts-mode.el
@@ -0,0 +1,131 @@
+;;; css-ts-mode.el --- tree sitter support for CSS  -*- lexical-binding: t; -*-
+
+;; Copyright (C) 2022 Free Software Foundation, Inc.
+
+;; Author     : Theodor Thornhill <theo@thornhill.no>
+;; Maintainer : Theodor Thornhill <theo@thornhill.no>
+;; Created    : November 2022
+;; Keywords   : css languages tree-sitter
+
+;; This file is part of GNU Emacs.
+
+;; This program is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; This program is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+
+;;; Commentary:
+;;
+
+;;; Code:
+
+(require 'treesit)
+(require 'rx)
+(require 'css-mode)
+
+(defcustom css-ts-mode-indent-offset 2
+  "Number of spaces for each indentation step in `css-ts-mode'."
+  :type 'integer
+  :safe 'integerp
+  :group 'css)
+
+(defvar css-ts-mode--indent-rules
+  `((css
+     ((node-is "}") parent-bol 0)
+     ((node-is ")") parent-bol 0)
+     ((node-is "]") parent-bol 0)
+
+     ((parent-is "block") parent-bol css-ts-mode-indent-offset)
+     ((parent-is "arguments") parent-bol css-ts-mode-indent-offset)
+     ((parent-is "declaration") parent-bol css-ts-mode-indent-offset))))
+
+(defvar css-ts-mode--settings
+  (treesit-font-lock-rules
+   :language 'css
+   :feature 'basic
+   :override t
+   `((unit) @font-lock-constant-face
+     (integer_value) @font-lock-builtin-face
+     (float_value) @font-lock-builtin-face
+     (plain_value) @font-lock-variable-name-face
+     (comment) @font-lock-comment-face
+     (class_selector) @css-selector
+     (child_selector) @css-selector
+     (id_selector) @css-selector
+     (tag_name) @css-selector
+     (property_name) @css-property
+     (class_name) @css-selector
+     (function_name) @font-lock-function-name-face)))
+
+(defun css-ts-mode--imenu-1 (node)
+  "Helper for `css-ts-mode--imenu'.
+Find string representation for NODE and set marker, then recurse
+the subtrees."
+  (let* ((ts-node (car node))
+         (subtrees (mapcan #'css-ts-mode--imenu-1 (cdr node)))
+         (name (when ts-node
+                 (if (equal (treesit-node-type ts-node) "tag_name")
+                     (treesit-node-text ts-node)
+                   (treesit-node-text (treesit-node-child ts-node 1) t))))
+         (marker (when ts-node
+                   (set-marker (make-marker)
+                               (treesit-node-start ts-node)))))
+    (cond
+     ((null ts-node) subtrees)
+     (subtrees
+      `((,name ,(cons name marker) ,@subtrees)))
+     (t
+      `((,name . ,marker))))))
+
+(defun css-ts-mode--imenu ()
+  "Return Imenu alist for the current buffer."
+  (let* ((node (treesit-buffer-root-node))
+         (tree (treesit-induce-sparse-tree
+                node (rx (or "class_selector"
+                             "id_selector"
+                             "tag_name")))))
+    (css-ts-mode--imenu-1 tree)))
+
+(define-derived-mode css-ts-mode prog-mode "CSS"
+  "Major mode for editing CSS."
+  :group 'css
+  :syntax-table css-mode-syntax-table
+
+  (unless (treesit-ready-p nil 'css)
+    (error "Tree Sitter for CSS isn't available"))
+
+  (treesit-parser-create 'css)
+
+  ;; Comments
+  (setq-local comment-start "/*")
+  (setq-local comment-start-skip "/\\*+[ \t]*")
+  (setq-local comment-end "*/")
+  (setq-local comment-end-skip "[ \t]*\\*+/")
+
+  ;; Indent.
+  (setq-local treesit-simple-indent-rules css-ts-mode--indent-rules)
+
+  ;; Navigation.
+  (setq-local treesit-defun-type-regexp "rule_set")
+  ;; Font-lock.
+  (setq-local treesit-font-lock-settings css-ts-mode--settings)
+  (setq-local treesit-font-lock-feature-list '((basic) () ()))
+
+  ;; Imenu.
+  (setq-local imenu-create-index-function #'css-ts-mode--imenu)
+  (setq-local which-func-functions nil) ;; Piggyback on imenu
+
+  (treesit-major-mode-setup))
+
+(provide 'css-ts-mode)
+
+;;; css-ts-mode.el ends here
diff --git a/lisp/progmodes/java-ts-mode.el b/lisp/progmodes/java-ts-mode.el
new file mode 100644
index 0000000000..734a8be471
--- /dev/null
+++ b/lisp/progmodes/java-ts-mode.el
@@ -0,0 +1,289 @@
+;;; java-ts-mode.el --- tree sitter support for Java  -*- lexical-binding: t; -*-
+
+;; Copyright (C) 2022 Free Software Foundation, Inc.
+
+;; Author     : Theodor Thornhill <theo@thornhill.no>
+;; Maintainer : Theodor Thornhill <theo@thornhill.no>
+;; Created    : November 2022
+;; Keywords   : java languages tree-sitter
+
+;; This file is part of GNU Emacs.
+
+;; This program is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; This program is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+
+;;; Commentary:
+;;
+
+;;; Code:
+
+(require 'treesit)
+(require 'rx)
+
+(defcustom java-ts-mode-indent-offset 4
+  "Number of spaces for each indentation step in `java-ts-mode'."
+  :type 'integer
+  :safe 'integerp
+  :group 'java)
+
+(defvar java-ts-mode--syntax-table
+  (let ((table (make-syntax-table)))
+    ;; Taken from the cc-langs version
+    (modify-syntax-entry ?_  "_"     table)
+    (modify-syntax-entry ?\\ "\\"    table)
+    (modify-syntax-entry ?+  "."     table)
+    (modify-syntax-entry ?-  "."     table)
+    (modify-syntax-entry ?=  "."     table)
+    (modify-syntax-entry ?%  "."     table)
+    (modify-syntax-entry ?<  "."     table)
+    (modify-syntax-entry ?>  "."     table)
+    (modify-syntax-entry ?&  "."     table)
+    (modify-syntax-entry ?|  "."     table)
+    (modify-syntax-entry ?\' "\""    table)
+    (modify-syntax-entry ?\240 "."   table)
+    table)
+  "Syntax table for `java-ts-mode'.")
+
+(defvar java-ts-mode--indent-rules
+  `((java
+     ((parent-is "program") parent-bol 0)
+     ((node-is "}") (and parent parent-bol) 0)
+     ((node-is ")") parent-bol 0)
+     ((node-is "]") parent-bol 0)
+     ((parent-is "class_body") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "interface_body") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "constructor_body") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "enum_body") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "switch_block") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "record_declaration_body") parent-bol java-ts-mode-indent-offset)
+     ((query "(method_declaration (block _ @indent))") parent-bol java-ts-mode-indent-offset)
+     ((query "(method_declaration (block (_) @indent))") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "variable_declarator") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "method_invocation") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "switch_rule") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "ternary_expression") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "element_value_array_initializer") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "function_definition") parent-bol 0)
+     ((parent-is "conditional_expression") first-sibling 0)
+     ((parent-is "assignment_expression") parent-bol 2)
+     ((parent-is "binary_expression") parent 0)
+     ((parent-is "parenthesized_expression") first-sibling 1)
+     ((parent-is "argument_list") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "annotation_argument_list") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "modifiers") parent-bol 0)
+     ((parent-is "formal_parameters") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "formal_parameter") parent-bol 0)
+     ((parent-is "init_declarator") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "if_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "for_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "while_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "switch_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "case_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "labeled_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "do_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "block") (and parent parent-bol) java-ts-mode-indent-offset)))
+  "Tree-sitter indent rules.")
+
+(defvar java-ts-mode--keywords
+  '("abstract" "assert" "break" "case" "catch"
+    "class" "continue" "default" "do" "else"
+    "enum" "exports" "extends" "final" "finally"
+    "for" "if" "implements" "import" "instanceof"
+    "interface" "module" "native" "new" "non-sealed"
+    "open" "opens" "package" "private" "protected"
+    "provides" "public" "requires" "return" "sealed"
+    "static" "strictfp" "switch" "synchronized"
+    "throw" "throws" "to" "transient" "transitive"
+    "try" "uses" "volatile" "while" "with" "record")
+  "Java keywords for tree-sitter font-locking.")
+
+(defvar java-ts-mode--operators
+  '("@" "+" ":" "++" "-" "--" "&" "&&" "|" "||"
+    "!=" "==" "*" "/" "%" "<" "<=" ">" ">=" "="
+    "-=" "+=" "*=" "/=" "%=" "->" "^" "^=" "&="
+    "|=" "~" ">>" ">>>" "<<" "::" "?")
+  "Java operators for tree-sitter font-locking.")
+
+(defvar java-ts-mode--font-lock-settings
+  (treesit-font-lock-rules
+   :language 'java
+   :override t
+   :feature 'basic
+   '((identifier) @font-lock-variable-name-face)
+   :language 'java
+   :override t
+   :feature 'comment
+   `((line_comment) @font-lock-comment-face
+     (block_comment) @font-lock-comment-face)
+   :language 'java
+   :override t
+   :feature 'constant
+   `(((identifier) @font-lock-constant-face
+      (:match "^[A-Z_][A-Z_0-9]*$" @font-lock-constant-face))
+     (true) @font-lock-constant-face
+     (false) @font-lock-constant-face)
+   :language 'java
+   :override t
+   :feature 'keyword
+   `([,@java-ts-mode--keywords] @font-lock-keyword-face
+     (labeled_statement
+      (identifier) @font-lock-keyword-face))
+   :language 'java
+   :override t
+   :feature 'operator
+   `([,@java-ts-mode--operators] @font-lock-builtin-face)
+   :language 'java
+   :override t
+   :feature 'annotation
+   `((annotation
+      name: (identifier) @font-lock-constant-face)
+
+     (marker_annotation
+      name: (identifier) @font-lock-constant-face))
+   :language 'java
+   :override t
+   :feature 'string
+   `((string_literal) @font-lock-string-face)
+   :language 'java
+   :override t
+   :feature 'literal
+   `((null_literal) @font-lock-constant-face
+     (decimal_floating_point_literal) @font-lock-constant-face
+     (hex_floating_point_literal) @font-lock-constant-face)
+   :language 'java
+   :override t
+   :feature 'type
+   '((interface_declaration
+      name: (identifier) @font-lock-type-face)
+
+     (class_declaration
+      name: (identifier) @font-lock-type-face)
+
+     (record_declaration
+      name: (identifier) @font-lock-type-face)
+
+     (enum_declaration
+      name: (identifier) @font-lock-type-face)
+
+     (constructor_declaration
+      name: (identifier) @font-lock-type-face)
+
+     (field_access
+      object: (identifier) @font-lock-type-face)
+
+     (method_reference (identifier) @font-lock-type-face)
+
+     ((scoped_identifier name: (identifier) @font-lock-type-face)
+      (:match "^[A-Z]" @font-lock-type-face))
+
+     (type_identifier) @font-lock-type-face
+
+     [(boolean_type)
+      (integral_type)
+      (floating_point_type)
+      (void_type)] @font-lock-type-face)
+   :language 'java
+   :override t
+   :feature 'definition
+   `((method_declaration
+      name: (identifier) @font-lock-function-name-face)
+
+     (formal_parameter
+      name: (identifier) @font-lock-variable-name-face)
+
+     (catch_formal_parameter
+      name: (identifier) @font-lock-variable-name-face))
+   :language 'java
+   :override t
+   :feature 'expression
+   '((method_invocation
+      object: (identifier) @font-lock-variable-name-face)
+
+     (method_invocation
+      name: (identifier) @font-lock-function-name-face)
+
+     (argument_list (identifier) @font-lock-variable-name-face)))
+  "Tree-sitter font-lock settings.")
+
+(defun java-ts-mode--imenu-1 (node)
+  "Helper for `java-ts-mode--imenu'.
+Find string representation for NODE and set marker, then recurse
+the subtrees."
+  (let* ((ts-node (car node))
+         (subtrees (mapcan #'java-ts-mode--imenu-1 (cdr node)))
+         (name (when ts-node
+                 (or (treesit-node-text
+                      (or (treesit-node-child-by-field-name
+                           ts-node "name"))
+                      t)
+                     "Unnamed node")))
+         (marker (when ts-node
+                   (set-marker (make-marker)
+                               (treesit-node-start ts-node)))))
+    (cond
+     ((null ts-node) subtrees)
+     (subtrees
+      `((,name ,(cons name marker) ,@subtrees)))
+     (t
+      `((,name . ,marker))))))
+
+(defun java-ts-mode--imenu ()
+  "Return Imenu alist for the current buffer."
+  (let* ((node (treesit-buffer-root-node))
+         (tree (treesit-induce-sparse-tree
+                node (rx (or "class_declaration"
+                             "interface_declaration"
+                             "enum_declaration"
+                             "record_declaration"
+                             "method_declaration")))))
+    (java-ts-mode--imenu-1 tree)))
+
+;;;###autoload
+(define-derived-mode java-ts-mode prog-mode "Java"
+  "Major mode for editing Java, powered by Tree Sitter."
+  :group 'c
+  :syntax-table java-ts-mode--syntax-table
+
+  (unless (treesit-ready-p nil 'java)
+    (error "Tree-sitter for Java isn't available"))
+
+  (treesit-parser-create 'java)
+
+  ;; Comments.
+  (setq-local comment-start "// ")
+  (setq-local comment-start-skip "\\(?://+\\|/\\*+\\)\\s *")
+  (setq-local comment-end "")
+
+  ;; Indent.
+  (setq-local treesit-simple-indent-rules java-ts-mode--indent-rules)
+
+  ;; Navigation.
+  (setq-local treesit-defun-type-regexp
+              (rx (or "declaration")))
+
+  ;; Font-lock.
+  (setq-local treesit-font-lock-settings java-ts-mode--font-lock-settings)
+  (setq-local treesit-font-lock-feature-list
+              '((basic comment keyword constant string operator)
+                (type definition expression literal annotation)
+                ()))
+
+  ;; Imenu.
+  (setq-local imenu-create-index-function #'java-ts-mode--imenu)
+  (setq-local which-func-functions nil) ;; Piggyback on imenu
+  (treesit-major-mode-setup))
+
+(provide 'java-ts-mode)
+
+;;; java-ts-mode.el ends here
diff --git a/lisp/progmodes/json-ts-mode.el b/lisp/progmodes/json-ts-mode.el
new file mode 100644
index 0000000000..13eb5b78a9
--- /dev/null
+++ b/lisp/progmodes/json-ts-mode.el
@@ -0,0 +1,150 @@
+;;; json-ts-mode.el --- tree sitter support for JSON  -*- lexical-binding: t; -*-
+
+;; Copyright (C) 2022 Free Software Foundation, Inc.
+
+;; Author     : Theodor Thornhill <theo@thornhill.no>
+;; Maintainer : Theodor Thornhill <theo@thornhill.no>
+;; Created    : November 2022
+;; Keywords   : json languages tree-sitter
+
+;; This file is part of GNU Emacs.
+
+;; This program is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; This program is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+
+;;; Commentary:
+;;
+
+;;; Code:
+
+(require 'treesit)
+(require 'rx)
+
+(defcustom json-ts-mode-indent-offset 2
+  "Number of spaces for each indentation step in `json-ts-mode'."
+  :type 'integer
+  :safe 'integerp
+  :group 'json)
+
+(defvar json-ts-mode--syntax-table
+  (let ((table (make-syntax-table)))
+    ;; Taken from the cc-langs version
+    (modify-syntax-entry ?_  "_"     table)
+    (modify-syntax-entry ?$ "_"      table)
+    (modify-syntax-entry ?\\ "\\"    table)
+    (modify-syntax-entry ?+  "."     table)
+    (modify-syntax-entry ?-  "."     table)
+    (modify-syntax-entry ?=  "."     table)
+    (modify-syntax-entry ?%  "."     table)
+    (modify-syntax-entry ?<  "."     table)
+    (modify-syntax-entry ?>  "."     table)
+    (modify-syntax-entry ?&  "."     table)
+    (modify-syntax-entry ?|  "."     table)
+    (modify-syntax-entry ?` "\""     table)
+    (modify-syntax-entry ?\240 "."   table)
+    table)
+  "Syntax table for `json-ts-mode'.")
+
+
+(defvar json-ts--indent-rules
+  `((json
+     ((node-is "}") parent-bol 0)
+     ((node-is ")") parent-bol 0)
+     ((node-is "]") parent-bol 0)
+     ((parent-is "object") parent-bol json-ts-mode-indent-offset))))
+
+(defvar json-ts-mode--font-lock-settings
+  (treesit-font-lock-rules
+   :language 'json
+   :feature 'minimal
+   :override t
+   `((pair
+      key: (_) @font-lock-string-face)
+
+     (string) @font-lock-string-face
+
+     (number) @font-lock-constant-face
+
+     [(null) (true) (false)] @font-lock-constant-face
+
+     (escape_sequence) @font-lock-constant-face
+
+     (comment) @font-lock-comment-face))
+  "Font-lock settings for JSON.")
+
+(defun json-ts-mode--imenu-1 (node)
+  "Helper for `json-ts-mode--imenu'.
+Find string representation for NODE and set marker, then recurse
+the subtrees."
+  (let* ((ts-node (car node))
+         (subtrees (mapcan #'json-ts-mode--imenu-1 (cdr node)))
+         (name (when ts-node
+                 (treesit-node-text
+                  (treesit-node-child-by-field-name
+                   ts-node "key")
+                  t)))
+         (marker (when ts-node
+                   (set-marker (make-marker)
+                               (treesit-node-start ts-node)))))
+    (cond
+     ((null ts-node) subtrees)
+     (subtrees
+      `((,name ,(cons name marker) ,@subtrees)))
+     (t
+      `((,name . ,marker))))))
+
+(defun json-ts-mode--imenu ()
+  "Return Imenu alist for the current buffer."
+  (let* ((node (treesit-buffer-root-node))
+         (tree (treesit-induce-sparse-tree
+                node "pair")))
+    (json-ts-mode--imenu-1 tree)))
+
+;;;###autoload
+(define-derived-mode json-ts-mode prog-mode "JSON"
+  "Major mode for editing JSON, powered by Tree Sitter."
+  :group 'json
+  :syntax-table json-ts-mode--syntax-table
+
+  (unless (treesit-ready-p nil 'json)
+    (error "Tree Sitter for JSON isn't available"))
+
+  (treesit-parser-create 'json)
+
+  ;; Comments.
+  (setq-local comment-start "// ")
+  (setq-local comment-start-skip "\\(?://+\\|/\\*+\\)\\s *")
+  (setq-local comment-end "")
+
+  ;; Indent.
+  (setq-local treesit-simple-indent-rules json-ts--indent-rules)
+
+  ;; Navigation.
+  (setq-local treesit-defun-type-regexp
+              (rx (or "pair" "object")))
+
+  ;; Font-lock.
+  (setq-local treesit-font-lock-settings json-ts-mode--font-lock-settings)
+  (setq-local treesit-font-lock-feature-list
+              '((minimal) () ()))
+
+  ;; Imenu.
+  (setq-local imenu-create-index-function #'json-ts-mode--imenu)
+  (setq-local which-func-functions nil) ;; Piggyback on imenu
+
+  (treesit-major-mode-setup))
+
+(provide 'json-ts-mode)
+
+;;; json-ts-mode.el ends here
-- 
2.34.1



* Re: Tree sitter support for C-like languages
  2022-11-12 19:38       ` Theodor Thornhill via Emacs development discussions.
@ 2022-11-12 19:46         ` Stefan Kangas
  2022-11-12 20:03           ` Theodor Thornhill
  2022-11-12 19:51         ` Eli Zaretskii
  1 sibling, 1 reply; 83+ messages in thread
From: Stefan Kangas @ 2022-11-12 19:46 UTC (permalink / raw)
  To: Theodor Thornhill, Eli Zaretskii; +Cc: casouri, emacs-devel, Stefan Monnier

Theodor Thornhill via "Emacs development discussions."
<emacs-devel@gnu.org> writes:

> See below patch.  Please note that I now have merged C and C++ into one
> file, and both inherit from a new parent mode: 'c-ts-mode--base-mode'.
> I simplified some of the indentation rules as well.

Is it really a good idea to put C and C++ support in the same file?

Also, the patch seems to be adding two files, but the commit message
only mentions one of them?

> From 0d6e144b9a90cfd72f5cdc8480d5bbe70c281fc5 Mon Sep 17 00:00:00 2001
> From: Theodor Thornhill <theo@thornhill.no>
> Date: Thu, 10 Nov 2022 17:15:49 +0100
> Subject: [PATCH] Add Tree Sitter modes for C-like languages
>
> * etc/NEWS: Mention the new modes
> * lisp/progmodes/c-ts-mode.el: New major mode with Tree Sitter support
> for C and C++.
> * lisp/progmodes/java-ts-mode.el: New major mode with Tree Sitter support.
> * lisp/progmodes/json-ts-mode.el: New major mode with Tree Sitter support.
> * lisp/progmodes/css-ts-mode.el: New major mode with Tree Sitter support.
> ---
>  etc/NEWS                       |  32 ++-
>  lisp/progmodes/c++-ts-mode.el  | 424 +++++++++++++++++++++++++++++++
>  lisp/progmodes/c-ts-mode.el    | 445 +++++++++++++++++++++++++++++++++
>  lisp/progmodes/css-ts-mode.el  | 131 ++++++++++
>  lisp/progmodes/java-ts-mode.el | 289 +++++++++++++++++++++
>  lisp/progmodes/json-ts-mode.el | 150 +++++++++++
>  6 files changed, 1467 insertions(+), 4 deletions(-)
>  create mode 100644 lisp/progmodes/c++-ts-mode.el
>  create mode 100644 lisp/progmodes/c-ts-mode.el
>  create mode 100644 lisp/progmodes/css-ts-mode.el
>  create mode 100644 lisp/progmodes/java-ts-mode.el
>  create mode 100644 lisp/progmodes/json-ts-mode.el
[...]




* Re: Tree sitter support for C-like languages
  2022-11-12 19:38       ` Theodor Thornhill via Emacs development discussions.
  2022-11-12 19:46         ` Stefan Kangas
@ 2022-11-12 19:51         ` Eli Zaretskii
  2022-11-12 20:05           ` Theodor Thornhill via Emacs development discussions.
  1 sibling, 1 reply; 83+ messages in thread
From: Eli Zaretskii @ 2022-11-12 19:51 UTC (permalink / raw)
  To: Theodor Thornhill; +Cc: casouri, emacs-devel, monnier

> From: Theodor Thornhill <theo@thornhill.no>
> Cc: casouri@gmail.com, emacs-devel@gnu.org, Stefan Monnier
>  <monnier@iro.umontreal.ca>
> Date: Sat, 12 Nov 2022 20:38:58 +0100
> 
> Unless any of you have strong objections, I think now is a good time to
> apply them to feature/tree-sitter, so that other people can more easily
> give them a test run.
> 
> What do you think?

Fine with me, thanks.




* Re: Tree sitter support for C-like languages
  2022-11-12 19:46         ` Stefan Kangas
@ 2022-11-12 20:03           ` Theodor Thornhill
  0 siblings, 0 replies; 83+ messages in thread
From: Theodor Thornhill @ 2022-11-12 20:03 UTC (permalink / raw)
  To: Stefan Kangas, Eli Zaretskii; +Cc: casouri, emacs-devel, Stefan Monnier

Stefan Kangas <stefankangas@gmail.com> writes:

> Theodor Thornhill via "Emacs development discussions."
> <emacs-devel@gnu.org> writes:
>
>> See below patch.  Please note that I now have merged C and C++ into one
>> file, and both inherit from a new parent mode: 'c-ts-mode--base-mode'.
>> I simplified some of the indentation rules as well.
>
> Is it really a good idea to put C and C++ support in the same file?
>

Initially I thought no, but I have no strong opinions on this.
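
For reference, the parent-mode arrangement mentioned above can be sketched
roughly as follows.  This is only an illustration under stated assumptions:
every `demo-' name is hypothetical and not part of the patch, and the
tree-sitter calls are left out so the snippet runs on a plain Emacs.

```elisp
;;; -*- lexical-binding: t; -*-
;; Hypothetical sketch: two child modes sharing one parent mode.
;; None of the `demo-' names exist in the patch.

(defvar-local demo-language nil
  "Language symbol chosen by the child mode (illustration only).")

(define-derived-mode demo--base-mode prog-mode "Demo"
  "Parent mode holding the settings common to both children."
  (setq-local comment-start "// ")
  (setq-local comment-end ""))

(define-derived-mode demo-c-mode demo--base-mode "Demo-C"
  "Child mode that adds only the C-specific setup."
  (setq-local demo-language 'c))

(define-derived-mode demo-c++-mode demo--base-mode "Demo-C++"
  "Child mode that adds only the C++-specific setup."
  (setq-local demo-language 'cpp))
```

Enabling `demo-c++-mode' in a buffer runs the body of `demo--base-mode'
first, so the comment settings are in place before the child sets
`demo-language'.  Whether the two children live in one file or two does not
change this behavior; it is only a question of how the code is organized.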

> Also, the patch seems to be adding two files, but the commit message
> only mentions one of them?
>

You are correct.  I forgot to remove the old mode.  I'll fix it in the
upcoming patch.  Sorry, and thanks for the nice catch!

>> From 0d6e144b9a90cfd72f5cdc8480d5bbe70c281fc5 Mon Sep 17 00:00:00 2001
>> From: Theodor Thornhill <theo@thornhill.no>
>> Date: Thu, 10 Nov 2022 17:15:49 +0100
>> Subject: [PATCH] Add Tree Sitter modes for C-like languages
>>
>> * etc/NEWS: Mention the new modes
>> * lisp/progmodes/c-ts-mode.el: New major mode with Tree Sitter support
>> for C and C++.
>> * lisp/progmodes/java-ts-mode.el: New major mode with Tree Sitter support.
>> * lisp/progmodes/json-ts-mode.el: New major mode with Tree Sitter support.
>> * lisp/progmodes/css-ts-mode.el: New major mode with Tree Sitter support.
>> ---
>>  etc/NEWS                       |  32 ++-
>>  lisp/progmodes/c++-ts-mode.el  | 424 +++++++++++++++++++++++++++++++
>>  lisp/progmodes/c-ts-mode.el    | 445 +++++++++++++++++++++++++++++++++
>>  lisp/progmodes/css-ts-mode.el  | 131 ++++++++++
>>  lisp/progmodes/java-ts-mode.el | 289 +++++++++++++++++++++
>>  lisp/progmodes/json-ts-mode.el | 150 +++++++++++
>>  6 files changed, 1467 insertions(+), 4 deletions(-)
>>  create mode 100644 lisp/progmodes/c++-ts-mode.el
>>  create mode 100644 lisp/progmodes/c-ts-mode.el
>>  create mode 100644 lisp/progmodes/css-ts-mode.el
>>  create mode 100644 lisp/progmodes/java-ts-mode.el
>>  create mode 100644 lisp/progmodes/json-ts-mode.el
> [...]

-- 
Theo




* Re: Tree sitter support for C-like languages
  2022-11-12 19:51         ` Eli Zaretskii
@ 2022-11-12 20:05           ` Theodor Thornhill via Emacs development discussions.
  2022-11-12 20:08             ` Yuan Fu
  0 siblings, 1 reply; 83+ messages in thread
From: Theodor Thornhill via Emacs development discussions. @ 2022-11-12 20:05 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: casouri, emacs-devel, monnier

[-- Attachment #1: Type: text/plain, Size: 523 bytes --]

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Theodor Thornhill <theo@thornhill.no>
>> Cc: casouri@gmail.com, emacs-devel@gnu.org, Stefan Monnier
>>  <monnier@iro.umontreal.ca>
>> Date: Sat, 12 Nov 2022 20:38:58 +0100
>> 
>> Unless any of you have strong objections, I think now is a good time to
>> apply them to feature/tree-sitter, so that other people can more easily
>> give them a test run.
>> 
>> What do you think?
>
> Fine with me, thanks.

Great!

See the new patch here, following Stefan's keen eye ;-)

Theo


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Add-Tree-Sitter-modes-for-C-like-languages.patch --]
[-- Type: text/x-diff, Size: 39727 bytes --]

From b2ee52f2b43bd7ddd89bb3c2057478e76c8f6dbe Mon Sep 17 00:00:00 2001
From: Theodor Thornhill <theo@thornhill.no>
Date: Thu, 10 Nov 2022 17:15:49 +0100
Subject: [PATCH] Add Tree Sitter modes for C-like languages

* etc/NEWS: Mention the new modes
* lisp/progmodes/c-ts-mode.el: New major mode with Tree Sitter support
for C and C++.
* lisp/progmodes/java-ts-mode.el: New major mode with Tree Sitter support.
* lisp/progmodes/json-ts-mode.el: New major mode with Tree Sitter support.
* lisp/progmodes/css-ts-mode.el: New major mode with Tree Sitter support.
---
 etc/NEWS                       |  32 ++-
 lisp/progmodes/c-ts-mode.el    | 445 +++++++++++++++++++++++++++++++++
 lisp/progmodes/css-ts-mode.el  | 131 ++++++++++
 lisp/progmodes/java-ts-mode.el | 289 +++++++++++++++++++++
 lisp/progmodes/json-ts-mode.el | 150 +++++++++++
 5 files changed, 1043 insertions(+), 4 deletions(-)
 create mode 100644 lisp/progmodes/c-ts-mode.el
 create mode 100644 lisp/progmodes/css-ts-mode.el
 create mode 100644 lisp/progmodes/java-ts-mode.el
 create mode 100644 lisp/progmodes/json-ts-mode.el

diff --git a/etc/NEWS b/etc/NEWS
index 9ed78bc6b3..5e655a953d 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -2784,10 +2784,34 @@ when visiting JSON files.
 
 \f
 ** New mode ts-mode'.
-Support is added for TypeScript, based on the new integration with
-Tree-Sitter. There's support for font-locking, indentation and
-navigation.  Tree-Sitter is required for this mode to function, but if
-it is not available, we will default to use 'js-mode'.
+A major mode based on the Tree-sitter library for editing programs
+in the TypeScript language.  It includes support for font-locking,
+indentation, and navigation.
+
+** New mode 'c-ts-mode'.
+A major mode based on the Tree-sitter library for editing programs
+in the C language.  It includes support for font-locking,
+indentation, Imenu, which-func, and navigation.
+
+** New mode 'c++-ts-mode'.
+A major mode based on the Tree-sitter library for editing programs
+in the C++ language.  It includes support for font-locking,
+indentation, Imenu, which-func, and navigation.
+
+** New mode 'java-ts-mode'.
+A major mode based on the Tree-sitter library for editing programs
+in the Java language.  It includes support for font-locking,
+indentation, Imenu, which-func, and navigation.
+
+** New mode 'css-ts-mode'.
+A major mode based on the Tree-sitter library for editing programs
+in the CSS language.  It includes support for font-locking,
+indentation, Imenu, which-func, and navigation.
+
+** New mode 'json-ts-mode'.
+A major mode based on the Tree-sitter library for editing programs
+in the JSON language.  It includes support for font-locking,
+indentation, Imenu, which-func, and navigation.
 
 \f
 * Incompatible Lisp Changes in Emacs 29.1
diff --git a/lisp/progmodes/c-ts-mode.el b/lisp/progmodes/c-ts-mode.el
new file mode 100644
index 0000000000..3e77a85b4d
--- /dev/null
+++ b/lisp/progmodes/c-ts-mode.el
@@ -0,0 +1,445 @@
+;;; c-ts-mode.el --- tree sitter support for C and C++  -*- lexical-binding: t; -*-
+
+;; Copyright (C) 2022 Free Software Foundation, Inc.
+
+;; Author     : Theodor Thornhill <theo@thornhill.no>
+;; Maintainer : Theodor Thornhill <theo@thornhill.no>
+;; Created    : November 2022
+;; Keywords   : c c++ cpp languages tree-sitter
+
+;; This file is part of GNU Emacs.
+
+;; This program is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; This program is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+
+;;; Commentary:
+;;
+
+;;; Code:
+
+(require 'treesit)
+(require 'rx)
+
+(defcustom c-ts-mode-indent-offset 2
+  "Number of spaces for each indentation step in `c-ts-mode'."
+  :type 'integer
+  :safe 'integerp
+  :group 'c)
+
+(defcustom c-ts-mode-indent-style 'gnu
+  "Style used for indentation.
+
+The selected style could be one of GNU, K&R, LINUX or BSD.  If
+one of the supplied styles doesn't suffice a function could be
+set instead.  This function is expected to return a list that
+follows the form of `treesit-simple-indent-rules'."
+  :type '(choice (const :tag "GNU" gnu)
+                 (const :tag "K&R" k&r)
+                 (const :tag "Linux" linux)
+                 (const :tag "BSD" bsd)
+                 (function :tag "A function for user customized style" ignore))
+  :group 'c)
+
+(defvar c-ts-mode--syntax-table
+  (let ((table (make-syntax-table)))
+    ;; Taken from the cc-langs version
+    (modify-syntax-entry ?_  "_"     table)
+    (modify-syntax-entry ?\\ "\\"    table)
+    (modify-syntax-entry ?+  "."     table)
+    (modify-syntax-entry ?-  "."     table)
+    (modify-syntax-entry ?=  "."     table)
+    (modify-syntax-entry ?%  "."     table)
+    (modify-syntax-entry ?<  "."     table)
+    (modify-syntax-entry ?>  "."     table)
+    (modify-syntax-entry ?&  "."     table)
+    (modify-syntax-entry ?|  "."     table)
+    (modify-syntax-entry ?\' "\""    table)
+    (modify-syntax-entry ?\240 "."   table)
+    (modify-syntax-entry ?/  ". 124b" table)
+    (modify-syntax-entry ?*  ". 23"   table)
+    table)
+  "Syntax table for `c-ts-mode'.")
+
+(defun c-ts-mode--indent-styles (mode)
+  "Indent rules supported by `c-ts-mode'.
+MODE is either `c' or `cpp'."
+  (let ((common
+         `(((parent-is "translation_unit") parent-bol 0)
+           ((node-is ")") parent 1)
+           ((node-is "]") parent-bol 0)
+           ((node-is "}") (and parent parent-bol) 0)
+           ((node-is "else") parent-bol 0)
+           ((node-is "case") parent-bol 0)
+           ((node-is "preproc_arg") no-indent)
+           ((node-is "comment") no-indent)
+           ((parent-is "comment") no-indent)
+           ((node-is "labeled_statement") parent-bol 0)
+           ((parent-is "labeled_statement") parent-bol c-ts-mode-indent-offset)
+           ((match "preproc_ifdef" "compound_statement") point-min 0)
+           ((match "#endif" "preproc_ifdef") point-min 0)
+           ((match "preproc_if" "compound_statement") point-min 0)
+           ((match "#endif" "preproc_if") point-min 0)
+           ((match "preproc_function_def" "compound_statement") point-min 0)
+           ((match "preproc_call" "compound_statement") point-min 0)
+           ((parent-is "compound_statement") (and parent parent-bol) c-ts-mode-indent-offset)
+           ((parent-is "function_definition") parent-bol 0)
+           ((parent-is "conditional_expression") first-sibling 0)
+           ((parent-is "assignment_expression") parent-bol c-ts-mode-indent-offset)
+           ((parent-is "comma_expression") first-sibling 0)
+           ((parent-is "init_declarator") parent-bol c-ts-mode-indent-offset)
+           ((parent-is "parenthesized_expression") first-sibling 1)
+           ((parent-is "argument_list") first-sibling 1)
+           ((parent-is "parameter_list") first-sibling 1)
+           ((parent-is "binary_expression") parent 0)
+           ((query "(for_statement initializer: (_) @indent)") parent-bol 5)
+           ((query "(for_statement condition: (_) @indent)") parent-bol 5)
+           ((query "(for_statement update: (_) @indent)") parent-bol 5)
+           ((query "(call_expression arguments: (_) @indent)") parent c-ts-mode-indent-offset)
+           ((parent-is "call_expression") parent 0)
+           ((parent-is "enumerator_list") parent-bol c-ts-mode-indent-offset)
+           ((parent-is "field_declaration_list") parent-bol c-ts-mode-indent-offset)
+           ((parent-is "initializer_list") parent-bol c-ts-mode-indent-offset)
+           ((parent-is "if_statement") parent-bol c-ts-mode-indent-offset)
+           ((parent-is "for_statement") parent-bol c-ts-mode-indent-offset)
+           ((parent-is "while_statement") parent-bol c-ts-mode-indent-offset)
+           ((parent-is "switch_statement") parent-bol c-ts-mode-indent-offset)
+           ((parent-is "case_statement") parent-bol c-ts-mode-indent-offset)
+           ((parent-is "do_statement") parent-bol c-ts-mode-indent-offset)
+           ,@(when (eq mode 'cpp)
+               `(((node-is "field_initializer_list") parent-bol ,(* c-ts-mode-indent-offset 2)))))))
+    `((gnu
+       ;; Prepend rules to set highest priority
+       ((match "while" "do_statement") parent 0)
+       ,@common)
+      (k&r ,@common)
+      (linux ,@common)
+      (bsd
+       ((parent-is "if_statement") parent-bol 0)
+       ((parent-is "for_statement") parent-bol 0)
+       ((parent-is "while_statement") parent-bol 0)
+       ((parent-is "switch_statement") parent-bol 0)
+       ((parent-is "case_statement") parent-bol 0)
+       ((parent-is "do_statement") parent-bol 0)
+       ,@common))))
+
+(defun c-ts-mode--set-indent-style (mode)
+  "Helper function to set indentation style.
+MODE is either `c' or `cpp'."
+  (let ((style
+         (if (functionp c-ts-mode-indent-style)
+             (funcall c-ts-mode-indent-style)
+           (pcase c-ts-mode-indent-style
+             ('gnu   (alist-get 'gnu (c-ts-mode--indent-styles mode)))
+             ('k&r   (alist-get 'k&r (c-ts-mode--indent-styles mode)))
+             ('bsd   (alist-get 'bsd (c-ts-mode--indent-styles mode)))
+             ('linux (alist-get 'linux (c-ts-mode--indent-styles mode)))))))
+    `((,mode ,@style))))
+
+(defvar c-ts-mode--preproc-keywords
+  '("#define" "#if" "#ifdef" "#ifndef"
+    "#else" "#elif" "#endif" "#include")
+  "C/C++ keywords for tree-sitter font-locking.")
+
+(defun c-ts-mode--keywords (mode)
+  "C/C++ keywords for tree-sitter font-locking.
+MODE is either `c' or `cpp'."
+  (let ((c-keywords
+         '("break" "case" "const" "continue"
+           "default" "do" "else" "enum"
+           "extern" "for" "goto" "if"
+           "long" "register" "return" "short"
+           "signed" "sizeof" "static" "struct"
+           "switch" "typedef" "union" "unsigned"
+           "volatile" "while")))
+    (if (eq mode 'cpp)
+        (append c-keywords
+                '("and" "and_eq" "bitand" "bitor"
+                  "catch" "class" "co_await" "co_return"
+                  "co_yield" "compl" "concept" "consteval"
+                  "constexpr" "constinit" "decltype" "delete"
+                  "explicit" "final" "friend"
+                  "mutable" "namespace" "new" "noexcept"
+                  "not" "not_eq" "operator" "or"
+                  "or_eq" "override" "private" "protected"
+                  "public" "requires" "template" "throw"
+                  "try" "typename" "using" "virtual"
+                  "xor" "xor_eq"))
+      (append '("auto") c-keywords))))
+
+(defvar c-ts-mode--operators
+  '("=" "-" "*" "/" "+" "%" "~" "|" "&" "^" "<<" ">>" "->"
+    "." "<" "<=" ">=" ">" "==" "!=" "!" "&&" "||" "-="
+    "+=" "*=" "/=" "%=" "|=" "&=" "^=" ">>=" "<<=" "--" "++")
+  "C/C++ operators for tree-sitter font-locking.")
+
+(defun c-ts-mode--font-lock-settings (mode)
+  "Tree-sitter font-lock settings.
+MODE is either `c' or `cpp'."
+  (treesit-font-lock-rules
+   :language mode
+   :override t
+   :feature 'comment
+   `((comment) @font-lock-comment-face
+     (comment) @contextual)
+   :language mode
+   :override t
+   :feature 'preprocessor
+   `((preproc_directive) @font-lock-preprocessor-face
+
+     (preproc_def
+      name: (identifier) @font-lock-variable-name-face)
+
+     (preproc_ifdef
+      name: (identifier) @font-lock-variable-name-face)
+
+     (preproc_function_def
+      name: (identifier) @font-lock-function-name-face)
+
+     (preproc_params
+      (identifier) @font-lock-variable-name-face)
+
+     (preproc_defined) @font-lock-preprocessor-face
+     (preproc_defined (identifier) @font-lock-variable-name-face)
+     [,@c-ts-mode--preproc-keywords] @font-lock-preprocessor-face)
+   :language mode
+   :override t
+   :feature 'constant
+   `((true) @font-lock-constant-face
+     (false) @font-lock-constant-face
+     (null) @font-lock-constant-face
+     ,@(when (eq mode 'cpp)
+         '((this) @font-lock-constant-face)))
+   :language mode
+   :override t
+   :feature 'keyword
+   `([,@(c-ts-mode--keywords mode)] @font-lock-keyword-face
+     ,@(when (eq mode 'cpp)
+         '((auto) @font-lock-keyword-face)))
+   :language mode
+   :override t
+   :feature 'operator
+   `([,@c-ts-mode--operators] @font-lock-builtin-face)
+   :language mode
+   :override t
+   :feature 'string
+   `((string_literal) @font-lock-string-face
+     ((string_literal)) @contextual
+     (system_lib_string) @font-lock-string-face
+     (escape_sequence) @font-lock-string-face)
+   :language mode
+   :override t
+   :feature 'literal
+   `((number_literal) @font-lock-constant-face
+     (char_literal) @font-lock-constant-face)
+   :language mode
+   :override t
+   :feature 'type
+   `((primitive_type) @font-lock-type-face
+     ,@(when (eq mode 'cpp)
+         '((type_qualifier) @font-lock-type-face
+
+           (qualified_identifier
+            scope: (namespace_identifier) @font-lock-type-face)
+
+           (operator_cast type: (type_identifier) @font-lock-type-face))))
+   :language mode
+   :override t
+   :feature 'definition
+   `((declaration
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (declaration
+      type: (type_identifier) @font-lock-type-face)
+
+     (field_declaration
+      declarator: (field_identifier) @font-lock-variable-name-face)
+
+     (field_declaration
+      type: (type_identifier) @font-lock-type-face)
+
+     (parameter_declaration
+      type: (type_identifier) @font-lock-type-face)
+
+     (function_definition
+      type: (type_identifier) @font-lock-type-face)
+
+     (function_declarator
+      declarator: (identifier) @font-lock-function-name-face)
+
+     (array_declarator
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (init_declarator
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (struct_specifier
+      name: (type_identifier) @font-lock-type-face)
+
+     (sized_type_specifier) @font-lock-type-face
+
+     (enum_specifier
+      name: (type_identifier) @font-lock-type-face)
+
+     (enumerator
+      name: (identifier) @font-lock-variable-name-face)
+
+     (parameter_declaration
+      type: (_) @font-lock-type-face
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (pointer_declarator
+      declarator: (identifier) @font-lock-variable-name-face)
+
+     (pointer_declarator
+      declarator: (field_identifier) @font-lock-variable-name-face))
+   :language mode
+   :override t
+   :feature 'expression
+   '((assignment_expression
+      left: (identifier) @font-lock-variable-name-face)
+
+     (call_expression
+      function: (identifier) @font-lock-function-name-face)
+
+     (field_expression
+      field: (field_identifier) @font-lock-variable-name-face)
+
+     (field_expression
+      argument: (identifier) @font-lock-variable-name-face
+      field: (field_identifier) @font-lock-variable-name-face)
+
+     (pointer_expression
+      argument: (identifier) @font-lock-variable-name-face))
+   :language mode
+   :override t
+   :feature 'statement
+   '((expression_statement (identifier) @font-lock-variable-name-face)
+     (labeled_statement
+      label: (statement_identifier) @font-lock-type-face))
+   :language mode
+   :override t
+   :feature 'error
+   '((ERROR) @font-lock-warning-face)))
+
+(defun c-ts-mode--imenu-1 (node)
+  "Helper for `c-ts-mode--imenu'.
+Find string representation for NODE and set marker, then recurse
+the subtrees."
+  (let* ((ts-node (car node))
+         (subtrees (mapcan #'c-ts-mode--imenu-1 (cdr node)))
+         (name (when ts-node
+                 (or (treesit-node-text
+                      (or (treesit-node-child-by-field-name
+                           ts-node "declarator")
+                          (treesit-node-child-by-field-name
+                           ts-node "name"))
+                      t)
+                     "Unnamed node")))
+         (marker (when ts-node
+                   (set-marker (make-marker)
+                               (treesit-node-start ts-node)))))
+    ;; A struct_specifier could be inside a parameter list or another
+    ;; struct definition.  In those cases we don't include it.
+    (cond
+     ((string-match-p
+       (rx (or "parameter" "field") "_declaration")
+       (or (treesit-node-type (treesit-node-parent ts-node))
+           ""))
+      nil)
+     ((null ts-node) subtrees)
+     (subtrees
+      `((,name ,(cons name marker) ,@subtrees)))
+     (t
+      `((,name . ,marker))))))
+
+(defun c-ts-mode--imenu ()
+  "Return Imenu alist for the current buffer."
+  (let* ((node (treesit-buffer-root-node))
+         (tree (treesit-induce-sparse-tree
+                node (rx (or "function_definition"
+                             "struct_specifier")))))
+    (c-ts-mode--imenu-1 tree)))
+
+;;;###autoload
+(define-derived-mode c-ts-mode--base-mode prog-mode "C"
+  "Major mode for editing C, powered by Tree Sitter."
+  :group 'c
+  :syntax-table c-ts-mode--syntax-table
+
+  ;; Comments.
+  (setq-local comment-start "// ")
+  (setq-local comment-start-skip "\\(?://+\\|/\\*+\\)\\s *")
+  (setq-local comment-end "")
+
+  ;; Navigation.
+  (setq-local treesit-defun-type-regexp
+              (rx (or "specifier"
+                      "definition")))
+
+  ;; Indent.
+  (when (eq c-ts-mode-indent-style 'linux)
+    (setq-local indent-tabs-mode t))
+
+  ;; Electric
+  (setq-local electric-indent-chars
+	      (append "{}():;," electric-indent-chars))
+
+  ;; Imenu.
+  (setq-local imenu-create-index-function #'c-ts-mode--imenu)
+  (setq-local which-func-functions nil)
+
+  (setq-local treesit-font-lock-feature-list
+              '((comment preprocessor operator constant string literal keyword)
+                (type definition expression statement)
+                (error))))
+
+;;;###autoload
+(define-derived-mode c-ts-mode c-ts-mode--base-mode "C"
+  "Major mode for editing C, powered by Tree Sitter."
+  :group 'c
+
+  (unless (treesit-ready-p nil 'c)
+    (error "Tree Sitter for C isn't available"))
+
+  (treesit-parser-create 'c)
+
+  (setq-local treesit-simple-indent-rules
+              (c-ts-mode--set-indent-style 'c))
+
+  ;; Font-lock.
+  (setq-local treesit-font-lock-settings (c-ts-mode--font-lock-settings 'c))
+
+  (treesit-major-mode-setup))
+
+;;;###autoload
+(define-derived-mode c++-ts-mode c-ts-mode--base-mode "C++"
+  "Major mode for editing C++, powered by Tree Sitter."
+  :group 'c++
+
+  (unless (treesit-ready-p nil 'cpp)
+    (error "Tree Sitter for C++ isn't available"))
+
+  (treesit-parser-create 'cpp)
+
+  (setq-local treesit-simple-indent-rules
+              (c-ts-mode--set-indent-style 'cpp))
+
+  ;; Font-lock.
+  (setq-local treesit-font-lock-settings (c-ts-mode--font-lock-settings 'cpp))
+
+  (treesit-major-mode-setup))
+
+(provide 'c-ts-mode)
+
+;;; c-ts-mode.el ends here
diff --git a/lisp/progmodes/css-ts-mode.el b/lisp/progmodes/css-ts-mode.el
new file mode 100644
index 0000000000..c1a8d4e94d
--- /dev/null
+++ b/lisp/progmodes/css-ts-mode.el
@@ -0,0 +1,131 @@
+;;; css-ts-mode.el --- tree sitter support for CSS  -*- lexical-binding: t; -*-
+
+;; Copyright (C) 2022 Free Software Foundation, Inc.
+
+;; Author     : Theodor Thornhill <theo@thornhill.no>
+;; Maintainer : Theodor Thornhill <theo@thornhill.no>
+;; Created    : November 2022
+;; Keywords   : css languages tree-sitter
+
+;; This file is part of GNU Emacs.
+
+;; This program is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; This program is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+
+;;; Commentary:
+;;
+
+;;; Code:
+
+(require 'treesit)
+(require 'rx)
+(require 'css-mode)
+
+(defcustom css-ts-mode-indent-offset 2
+  "Number of spaces for each indentation step in `css-ts-mode'."
+  :type 'integer
+  :safe 'integerp
+  :group 'css)
+
+(defvar css-ts-mode--indent-rules
+  `((css
+     ((node-is "}") parent-bol 0)
+     ((node-is ")") parent-bol 0)
+     ((node-is "]") parent-bol 0)
+
+     ((parent-is "block") parent-bol css-ts-mode-indent-offset)
+     ((parent-is "arguments") parent-bol css-ts-mode-indent-offset)
+     ((parent-is "declaration") parent-bol css-ts-mode-indent-offset))))
+
+(defvar css-ts-mode--settings
+  (treesit-font-lock-rules
+   :language 'css
+   :feature 'basic
+   :override t
+   `((unit) @font-lock-constant-face
+     (integer_value) @font-lock-builtin-face
+     (float_value) @font-lock-builtin-face
+     (plain_value) @font-lock-variable-name-face
+     (comment) @font-lock-comment-face
+     (class_selector) @css-selector
+     (child_selector) @css-selector
+     (id_selector) @css-selector
+     (tag_name) @css-selector
+     (property_name) @css-property
+     (class_name) @css-selector
+     (function_name) @font-lock-function-name-face)))
+
+(defun css-ts-mode--imenu-1 (node)
+  "Helper for `css-ts-mode--imenu'.
+Find string representation for NODE and set marker, then recurse
+the subtrees."
+  (let* ((ts-node (car node))
+         (subtrees (mapcan #'css-ts-mode--imenu-1 (cdr node)))
+         (name (when ts-node
+                 (if (equal (treesit-node-type ts-node) "tag_name")
+                     (treesit-node-text ts-node)
+                   (treesit-node-text (treesit-node-child ts-node 1) t))))
+         (marker (when ts-node
+                   (set-marker (make-marker)
+                               (treesit-node-start ts-node)))))
+    (cond
+     ((null ts-node) subtrees)
+     (subtrees
+      `((,name ,(cons name marker) ,@subtrees)))
+     (t
+      `((,name . ,marker))))))
+
+(defun css-ts-mode--imenu ()
+  "Return Imenu alist for the current buffer."
+  (let* ((node (treesit-buffer-root-node))
+         (tree (treesit-induce-sparse-tree
+                node (rx (or "class_selector"
+                             "id_selector"
+                             "tag_name")))))
+    (css-ts-mode--imenu-1 tree)))
+
+(define-derived-mode css-ts-mode prog-mode "CSS"
+  "Major mode for editing CSS."
+  :group 'css
+  :syntax-table css-mode-syntax-table
+
+  (unless (treesit-ready-p nil 'css)
+    (error "Tree Sitter for CSS isn't available"))
+
+  (treesit-parser-create 'css)
+
+  ;; Comments
+  (setq-local comment-start "/*")
+  (setq-local comment-start-skip "/\\*+[ \t]*")
+  (setq-local comment-end "*/")
+  (setq-local comment-end-skip "[ \t]*\\*+/")
+
+  ;; Indent.
+  (setq-local treesit-simple-indent-rules css-ts-mode--indent-rules)
+
+  ;; Navigation.
+  (setq-local treesit-defun-type-regexp "rule_set")
+  ;; Font-lock.
+  (setq-local treesit-font-lock-settings css-ts-mode--settings)
+  (setq-local treesit-font-lock-feature-list '((basic) () ()))
+
+  ;; Imenu.
+  (setq-local imenu-create-index-function #'css-ts-mode--imenu)
+  (setq-local which-func-functions nil) ;; Piggyback on imenu
+
+  (treesit-major-mode-setup))
+
+(provide 'css-ts-mode)
+
+;;; css-ts-mode.el ends here
diff --git a/lisp/progmodes/java-ts-mode.el b/lisp/progmodes/java-ts-mode.el
new file mode 100644
index 0000000000..734a8be471
--- /dev/null
+++ b/lisp/progmodes/java-ts-mode.el
@@ -0,0 +1,289 @@
+;;; java-ts-mode.el --- tree sitter support for Java  -*- lexical-binding: t; -*-
+
+;; Copyright (C) 2022 Free Software Foundation, Inc.
+
+;; Author     : Theodor Thornhill <theo@thornhill.no>
+;; Maintainer : Theodor Thornhill <theo@thornhill.no>
+;; Created    : November 2022
+;; Keywords   : java languages tree-sitter
+
+;; This file is part of GNU Emacs.
+
+;; This program is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; This program is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+
+;;; Commentary:
+;;
+
+;;; Code:
+
+(require 'treesit)
+(require 'rx)
+
+(defcustom java-ts-mode-indent-offset 4
+  "Number of spaces for each indentation step in `java-ts-mode'."
+  :type 'integer
+  :safe 'integerp
+  :group 'java)
+
+(defvar java-ts-mode--syntax-table
+  (let ((table (make-syntax-table)))
+    ;; Taken from the cc-langs version
+    (modify-syntax-entry ?_  "_"     table)
+    (modify-syntax-entry ?\\ "\\"    table)
+    (modify-syntax-entry ?+  "."     table)
+    (modify-syntax-entry ?-  "."     table)
+    (modify-syntax-entry ?=  "."     table)
+    (modify-syntax-entry ?%  "."     table)
+    (modify-syntax-entry ?<  "."     table)
+    (modify-syntax-entry ?>  "."     table)
+    (modify-syntax-entry ?&  "."     table)
+    (modify-syntax-entry ?|  "."     table)
+    (modify-syntax-entry ?\' "\""    table)
+    (modify-syntax-entry ?\240 "."   table)
+    table)
+  "Syntax table for `java-ts-mode'.")
+
+(defvar java-ts-mode--indent-rules
+  `((java
+     ((parent-is "program") parent-bol 0)
+     ((node-is "}") (and parent parent-bol) 0)
+     ((node-is ")") parent-bol 0)
+     ((node-is "]") parent-bol 0)
+     ((parent-is "class_body") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "interface_body") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "constructor_body") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "enum_body") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "switch_block") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "record_declaration_body") parent-bol java-ts-mode-indent-offset)
+     ((query "(method_declaration (block _ @indent))") parent-bol java-ts-mode-indent-offset)
+     ((query "(method_declaration (block (_) @indent))") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "variable_declarator") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "method_invocation") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "switch_rule") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "ternary_expression") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "element_value_array_initializer") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "function_definition") parent-bol 0)
+     ((parent-is "conditional_expression") first-sibling 0)
+     ((parent-is "assignment_expression") parent-bol 2)
+     ((parent-is "binary_expression") parent 0)
+     ((parent-is "parenthesized_expression") first-sibling 1)
+     ((parent-is "argument_list") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "annotation_argument_list") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "modifiers") parent-bol 0)
+     ((parent-is "formal_parameters") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "formal_parameter") parent-bol 0)
+     ((parent-is "init_declarator") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "if_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "for_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "while_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "switch_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "case_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "labeled_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "do_statement") parent-bol java-ts-mode-indent-offset)
+     ((parent-is "block") (and parent parent-bol) java-ts-mode-indent-offset)))
+  "Tree-sitter indent rules.")
+
+(defvar java-ts-mode--keywords
+  '("abstract" "assert" "break" "case" "catch"
+    "class" "continue" "default" "do" "else"
+    "enum" "exports" "extends" "final" "finally"
+    "for" "if" "implements" "import" "instanceof"
+    "interface" "module" "native" "new" "non-sealed"
+    "open" "opens" "package" "private" "protected"
+    "provides" "public" "requires" "return" "sealed"
+    "static" "strictfp" "switch" "synchronized"
+    "throw" "throws" "to" "transient" "transitive"
+    "try" "uses" "volatile" "while" "with" "record")
+  "Java keywords for tree-sitter font-locking.")
+
+(defvar java-ts-mode--operators
+  '("@" "+" ":" "++" "-" "--" "&" "&&" "|" "||"
+    "!=" "==" "*" "/" "%" "<" "<=" ">" ">=" "="
+    "-=" "+=" "*=" "/=" "%=" "->" "^" "^=" "&="
+    "|=" "~" ">>" ">>>" "<<" "::" "?")
+  "Java operators for tree-sitter font-locking.")
+
+(defvar java-ts-mode--font-lock-settings
+  (treesit-font-lock-rules
+   :language 'java
+   :override t
+   :feature 'basic
+   '((identifier) @font-lock-variable-name-face)
+   :language 'java
+   :override t
+   :feature 'comment
+   `((line_comment) @font-lock-comment-face
+     (block_comment) @font-lock-comment-face)
+   :language 'java
+   :override t
+   :feature 'constant
+   `(((identifier) @font-lock-constant-face
+      (:match "^[A-Z_][A-Z_0-9]*$" @font-lock-constant-face))
+     (true) @font-lock-constant-face
+     (false) @font-lock-constant-face)
+   :language 'java
+   :override t
+   :feature 'keyword
+   `([,@java-ts-mode--keywords] @font-lock-keyword-face
+     (labeled_statement
+      (identifier) @font-lock-keyword-face))
+   :language 'java
+   :override t
+   :feature 'operator
+   `([,@java-ts-mode--operators] @font-lock-builtin-face)
+   :language 'java
+   :override t
+   :feature 'annotation
+   `((annotation
+      name: (identifier) @font-lock-constant-face)
+
+     (marker_annotation
+      name: (identifier) @font-lock-constant-face))
+   :language 'java
+   :override t
+   :feature 'string
+   `((string_literal) @font-lock-string-face)
+   :language 'java
+   :override t
+   :feature 'literal
+   `((null_literal) @font-lock-constant-face
+     (decimal_floating_point_literal)  @font-lock-constant-face
+     (hex_floating_point_literal) @font-lock-constant-face)
+   :language 'java
+   :override t
+   :feature 'type
+   '((interface_declaration
+      name: (identifier) @font-lock-type-face)
+
+     (class_declaration
+      name: (identifier) @font-lock-type-face)
+
+     (record_declaration
+      name: (identifier) @font-lock-type-face)
+
+     (enum_declaration
+      name: (identifier) @font-lock-type-face)
+
+     (constructor_declaration
+      name: (identifier) @font-lock-type-face)
+
+     (field_access
+      object: (identifier) @font-lock-type-face)
+
+     (method_reference (identifier) @font-lock-type-face)
+
+     ((scoped_identifier name: (identifier) @font-lock-type-face)
+      (:match "^[A-Z]" @font-lock-type-face))
+
+     (type_identifier) @font-lock-type-face
+
+     [(boolean_type)
+      (integral_type)
+      (floating_point_type)
+      (void_type)] @font-lock-type-face)
+   :language 'java
+   :override t
+   :feature 'definition
+   `((method_declaration
+      name: (identifier) @font-lock-function-name-face)
+
+     (formal_parameter
+      name: (identifier) @font-lock-variable-name-face)
+
+     (catch_formal_parameter
+      name: (identifier) @font-lock-variable-name-face))
+   :language 'java
+   :override t
+   :feature 'expression
+   '((method_invocation
+      object: (identifier) @font-lock-variable-name-face)
+
+     (method_invocation
+      name: (identifier) @font-lock-function-name-face)
+
+     (argument_list (identifier) @font-lock-variable-name-face)))
+  "Tree-sitter font-lock settings.")
+
+(defun java-ts-mode--imenu-1 (node)
+  "Helper for `java-ts-mode--imenu'.
+Find string representation for NODE and set marker, then recurse
+the subtrees."
+  (let* ((ts-node (car node))
+         (subtrees (mapcan #'java-ts-mode--imenu-1 (cdr node)))
+         (name (when ts-node
+                 (or (treesit-node-text
+                      (or (treesit-node-child-by-field-name
+                           ts-node "name"))
+                      t)
+                     "Unnamed node")))
+         (marker (when ts-node
+                   (set-marker (make-marker)
+                               (treesit-node-start ts-node)))))
+    (cond
+     ((null ts-node) subtrees)
+     (subtrees
+      `((,name ,(cons name marker) ,@subtrees)))
+     (t
+      `((,name . ,marker))))))
+
+(defun java-ts-mode--imenu ()
+  "Return Imenu alist for the current buffer."
+  (let* ((node (treesit-buffer-root-node))
+         (tree (treesit-induce-sparse-tree
+                node (rx (or "class_declaration"
+                             "interface_declaration"
+                             "enum_declaration"
+                             "record_declaration"
+                             "method_declaration")))))
+    (java-ts-mode--imenu-1 tree)))
+
+;;;###autoload
+(define-derived-mode java-ts-mode prog-mode "Java"
+  "Major mode for editing Java, powered by Tree Sitter."
+  :group 'java
+  :syntax-table java-ts-mode--syntax-table
+
+  (unless (treesit-ready-p nil 'java)
+    (error "Tree-sitter for Java isn't available"))
+
+  (treesit-parser-create 'java)
+
+  ;; Comments.
+  (setq-local comment-start "// ")
+  (setq-local comment-start-skip "\\(?://+\\|/\\*+\\)\\s *")
+  (setq-local comment-end "")
+
+  ;; Indent.
+  (setq-local treesit-simple-indent-rules java-ts-mode--indent-rules)
+
+  ;; Navigation.
+  (setq-local treesit-defun-type-regexp
+              (rx (or "declaration")))
+
+  ;; Font-lock.
+  (setq-local treesit-font-lock-settings java-ts-mode--font-lock-settings)
+  (setq-local treesit-font-lock-feature-list
+              '((basic comment keyword constant string operator)
+                (type definition expression literal annotation)
+                ()))
+
+  ;; Imenu.
+  (setq-local imenu-create-index-function #'java-ts-mode--imenu)
+  (setq-local which-func-functions nil) ;; Piggyback on imenu
+  (treesit-major-mode-setup))
+
+(provide 'java-ts-mode)
+
+;;; java-ts-mode.el ends here
diff --git a/lisp/progmodes/json-ts-mode.el b/lisp/progmodes/json-ts-mode.el
new file mode 100644
index 0000000000..13eb5b78a9
--- /dev/null
+++ b/lisp/progmodes/json-ts-mode.el
@@ -0,0 +1,150 @@
+;;; json-ts-mode.el --- tree sitter support for JSON  -*- lexical-binding: t; -*-
+
+;; Copyright (C) 2022 Free Software Foundation, Inc.
+
+;; Author     : Theodor Thornhill <theo@thornhill.no>
+;; Maintainer : Theodor Thornhill <theo@thornhill.no>
+;; Created    : November 2022
+;; Keywords   : json languages tree-sitter
+
+;; This file is part of GNU Emacs.
+
+;; This program is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; This program is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+
+;;; Commentary:
+;;
+
+;;; Code:
+
+(require 'treesit)
+(require 'rx)
+
+(defcustom json-ts-mode-indent-offset 2
+  "Number of spaces for each indentation step in `json-ts-mode'."
+  :type 'integer
+  :safe 'integerp
+  :group 'json)
+
+(defvar json-ts-mode--syntax-table
+  (let ((table (make-syntax-table)))
+    ;; Taken from the cc-langs version
+    (modify-syntax-entry ?_  "_"     table)
+    (modify-syntax-entry ?$ "_"      table)
+    (modify-syntax-entry ?\\ "\\"    table)
+    (modify-syntax-entry ?+  "."     table)
+    (modify-syntax-entry ?-  "."     table)
+    (modify-syntax-entry ?=  "."     table)
+    (modify-syntax-entry ?%  "."     table)
+    (modify-syntax-entry ?<  "."     table)
+    (modify-syntax-entry ?>  "."     table)
+    (modify-syntax-entry ?&  "."     table)
+    (modify-syntax-entry ?|  "."     table)
+    (modify-syntax-entry ?` "\""     table)
+    (modify-syntax-entry ?\240 "."   table)
+    table)
+  "Syntax table for `json-ts-mode'.")
+
+
+(defvar json-ts--indent-rules
+  `((json
+     ((node-is "}") parent-bol 0)
+     ((node-is ")") parent-bol 0)
+     ((node-is "]") parent-bol 0)
+     ((parent-is "object") parent-bol json-ts-mode-indent-offset))))
+
+(defvar json-ts-mode--font-lock-settings
+  (treesit-font-lock-rules
+   :language 'json
+   :feature 'minimal
+   :override t
+   `((pair
+      key: (_) @font-lock-string-face)
+
+     (string) @font-lock-string-face
+
+     (number) @font-lock-constant-face
+
+     [(null) (true) (false)] @font-lock-constant-face
+
+     (escape_sequence) @font-lock-constant-face
+
+     (comment) @font-lock-comment-face))
+  "Font-lock settings for JSON.")
+
+(defun json-ts-mode--imenu-1 (node)
+  "Helper for `json-ts-mode--imenu'.
+Find string representation for NODE and set marker, then recurse
+the subtrees."
+  (let* ((ts-node (car node))
+         (subtrees (mapcan #'json-ts-mode--imenu-1 (cdr node)))
+         (name (when ts-node
+                 (treesit-node-text
+                  (treesit-node-child-by-field-name
+                   ts-node "key")
+                  t)))
+         (marker (when ts-node
+                   (set-marker (make-marker)
+                               (treesit-node-start ts-node)))))
+    (cond
+     ((null ts-node) subtrees)
+     (subtrees
+      `((,name ,(cons name marker) ,@subtrees)))
+     (t
+      `((,name . ,marker))))))
+
+(defun json-ts-mode--imenu ()
+  "Return Imenu alist for the current buffer."
+  (let* ((node (treesit-buffer-root-node))
+         (tree (treesit-induce-sparse-tree
+                node "pair")))
+    (json-ts-mode--imenu-1 tree)))
+
+;;;###autoload
+(define-derived-mode json-ts-mode prog-mode "JSON"
+  "Major mode for editing JSON, powered by Tree Sitter."
+  :group 'json
+  :syntax-table json-ts-mode--syntax-table
+
+  (unless (treesit-ready-p nil 'json)
+    (error "Tree Sitter for JSON isn't available"))
+
+  (treesit-parser-create 'json)
+
+  ;; Comments.
+  (setq-local comment-start "// ")
+  (setq-local comment-start-skip "\\(?://+\\|/\\*+\\)\\s *")
+  (setq-local comment-end "")
+
+  ;; Indent.
+  (setq-local treesit-simple-indent-rules json-ts--indent-rules)
+
+  ;; Navigation.
+  (setq-local treesit-defun-type-regexp
+              (rx (or "pair" "object")))
+
+  ;; Font-lock.
+  (setq-local treesit-font-lock-settings json-ts-mode--font-lock-settings)
+  (setq-local treesit-font-lock-feature-list
+              '((minimal) () ()))
+
+  ;; Imenu.
+  (setq-local imenu-create-index-function #'json-ts-mode--imenu)
+  (setq-local which-func-functions nil) ;; Piggyback on imenu
+
+  (treesit-major-mode-setup))
+
+(provide 'json-ts-mode)
+
+;;; json-ts-mode.el ends here
-- 
2.34.1



* Re: Tree sitter support for C-like languages
  2022-11-12 20:05           ` Theodor Thornhill via Emacs development discussions.
@ 2022-11-12 20:08             ` Yuan Fu
  2022-11-12 20:14               ` Theodor Thornhill
  0 siblings, 1 reply; 83+ messages in thread
From: Yuan Fu @ 2022-11-12 20:08 UTC (permalink / raw)
  To: Theodor Thornhill; +Cc: Eli Zaretskii, emacs-devel, monnier



> On Nov 12, 2022, at 12:05 PM, Theodor Thornhill <theo@thornhill.no> wrote:
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
>>> From: Theodor Thornhill <theo@thornhill.no>
>>> Cc: casouri@gmail.com, emacs-devel@gnu.org, Stefan Monnier
>>> <monnier@iro.umontreal.ca>
>>> Date: Sat, 12 Nov 2022 20:38:58 +0100
>>> 
>>> Unless any of you have strong objections, I think now is a good time to
>>> apply them to feature/tree-sitter, so that other people can more easily
>>> give them a test run.
>>> 
>>> What do you think?
>> 
>> Fine with me, thanks.
> 
> Great!
> 
> See new patch here - following Stefan's keen eye ;-)

Applied and pushed, thanks ;-)

Yuan




* Re: Tree sitter support for C-like languages
  2022-11-12 20:08             ` Yuan Fu
@ 2022-11-12 20:14               ` Theodor Thornhill
  2022-11-13  9:13                 ` Eli Zaretskii
  0 siblings, 1 reply; 83+ messages in thread
From: Theodor Thornhill @ 2022-11-12 20:14 UTC (permalink / raw)
  To: Yuan Fu; +Cc: Eli Zaretskii, emacs-devel, monnier

Yuan Fu <casouri@gmail.com> writes:

>> On Nov 12, 2022, at 12:05 PM, Theodor Thornhill <theo@thornhill.no> wrote:
>> 
>> Eli Zaretskii <eliz@gnu.org> writes:
>> 
>>>> From: Theodor Thornhill <theo@thornhill.no>
>>>> Cc: casouri@gmail.com, emacs-devel@gnu.org, Stefan Monnier
>>>> <monnier@iro.umontreal.ca>
>>>> Date: Sat, 12 Nov 2022 20:38:58 +0100
>>>> 
>>>> Unless any of you have strong objections, I think now is a good time to
>>>> apply them to feature/tree-sitter, so that other people can more easily
>>>> give them a test run.
>>>> 
>>>> What do you think?
>>> 
>>> Fine with me, thanks.
>> 
>> Great!
>> 
>> See new patch here - following Stefan's keen eye ;-)
>
> Applied and pushed, thanks ;-)

Great news!  Thanks, all!

-- 
Theo




* Re: Tree sitter support for C-like languages
  2022-11-11 15:54     ` Randy Taylor
@ 2022-11-13  8:37       ` Theodor Thornhill
  2022-11-13 13:03         ` Randy Taylor
  0 siblings, 1 reply; 83+ messages in thread
From: Theodor Thornhill @ 2022-11-13  8:37 UTC (permalink / raw)
  To: Randy Taylor; +Cc: emacs-devel, casouri



On 11 November 2022 16:54:53 CET, Randy Taylor <dev@rjt.dev> wrote:
>On Friday, November 11th, 2022 at 00:50, Theodor Thornhill <theo@thornhill.no> wrote:
>>
>> Yeah, I'm interested in reducing duplication, but not for that reason only. But I'm thinking of ways to make one inherit the other.
>> 
>> How about we first get this merged, then new faces on top of that?
>
>Sounds good.

Bring the faces :-)




* Re: Tree sitter support for C-like languages
  2022-11-12 20:14               ` Theodor Thornhill
@ 2022-11-13  9:13                 ` Eli Zaretskii
  2022-11-13  9:40                   ` Theodor Thornhill
  0 siblings, 1 reply; 83+ messages in thread
From: Eli Zaretskii @ 2022-11-13  9:13 UTC (permalink / raw)
  To: Theodor Thornhill; +Cc: casouri, emacs-devel, monnier

> From: Theodor Thornhill <theo@thornhill.no>
> Cc: Eli Zaretskii <eliz@gnu.org>, emacs-devel@gnu.org, monnier@iro.umontreal.ca
> Date: Sat, 12 Nov 2022 21:14:21 +0100
> 
> Yuan Fu <casouri@gmail.com> writes:
> 
> >> See new patch here - following Stefan's keen eye ;-)
> >
> > Applied and pushed, thanks ;-)
> 
> Great news!  Thanks, all!

Thanks.  The new C mode looks good, but I have a couple of issues with
it.

First, something strange is going on when I type new code.  Here's a
recipe:

   emacs -Q
   C-x C-f newfile.c RET
   M-x c-ts-mode RET
   Type:

int
foo (void)
{

At this point, "int" is in font-lock-warning-face -- why?

Next, with point after the brace, type RET -- this doesn't indent 2
spaces, as I'd expect -- why?  Typing TAB to indent doesn't help,
either.

I then type "int bar = 0;".  Typing RET after that doesn't indent,
either.

But if I add an empty line at BOB, the fontification becomes as
expected, and doesn't go back to font-lock-warning-face even if I then
remove that empty line.

Type } to close the function.  I now have this:

int
foo (void)
{
  int bar = 0;
}

But "int" is still in font-lock-warning-face -- why?

Next, I type this:

struct foo {
  int bar;
};

The result is that all of the struct, except the closing brace, is in
font-lock-warning-face -- why?  Again, adding an empty line before
that fixes fontifications, and the fontification stays correct even
after removing that empty line.

If I type

struct bar
  {
    int foo;
  };

then the opening brace and "int foo;" are in font-lock-warning-face.

Next, if I type M-;, I get a C++-style comment delimiter "//".  It
sounds like this is the only style of comments supported?  More
generally, if I compare c-basic-common-init and c-common-init from CC
Mode with c-ts-mode, I see that the former has much more
initializations than the latter.  So I think we should audit what CC
Mode does here and see what else is relevant.  Alternatively, we could
consider c-ts-mode be a minor mode of CC Mode, which only changes the
fontification, the indentation, and the navigation parts.

Thanks.

P.S. If these problems are non-trivial, it might be best to file a bug
report for each one.  But the last issue, the one about doing more
stuff like CC Mode does, is something we should discuss here, I think,
since this is basic design, and similar issues could exist for other
modes whose *-ts-mode variants were installed on the branch.




* Re: Tree sitter support for C-like languages
  2022-11-13  9:13                 ` Eli Zaretskii
@ 2022-11-13  9:40                   ` Theodor Thornhill
  2022-11-13  9:56                     ` Eli Zaretskii
  0 siblings, 1 reply; 83+ messages in thread
From: Theodor Thornhill @ 2022-11-13  9:40 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: casouri, emacs-devel, monnier

[-- Attachment #1: Type: text/plain, Size: 4255 bytes --]

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Theodor Thornhill <theo@thornhill.no>
>> Cc: Eli Zaretskii <eliz@gnu.org>, emacs-devel@gnu.org, monnier@iro.umontreal.ca
>> Date: Sat, 12 Nov 2022 21:14:21 +0100
>> 
>> Yuan Fu <casouri@gmail.com> writes:
>> 
>> >> See new patch here - following Stefan's keen eye ;-)
>> >
>> > Applied and pushed, thanks ;-)
>> 
>> Great news!  Thanks, all!
>
> Thanks.  The new C mode looks good, but I have a couple of issues with
> it.
>

Great - thanks for looking.  I actually have answers too!

> First, something strange is going on when I type new code.  Here's a
> recipe:
>
>    emacs -Q
>    C-x C-f newfile.c RET
>    M-x c-ts-mode RET
>    Type:
>
> int
> foo (void)
> {
>
> At this point, "int" is in font-lock-warning-face -- why?
>

If you enable 'treesit-inspect-mode' and put point on 'int', you will
see it report the 'ERROR' node.  This node is font locked like that
because of the font lock rule I added for that case.  I think we can
remove it, but it does serve some useful purpose.
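
For anyone who wants to check this from Lisp rather than through the
mode line, here is a minimal sketch.  The helper name
`my-treesit-error-node-at-point' is hypothetical and not part of the
patch; it only assumes that a parser already exists in the buffer.

```elisp
(require 'treesit)

;; Hypothetical helper for illustration; not part of the patch.
(defun my-treesit-error-node-at-point ()
  "Return the innermost ERROR node containing point, or nil if none."
  (let ((node (treesit-node-at (point))))
    ;; Walk up the tree until we hit an ERROR node or run out of parents.
    (while (and node (not (equal (treesit-node-type node) "ERROR")))
      (setq node (treesit-node-parent node)))
    node))
```

Evaluating `M-: (my-treesit-error-node-at-point)' with point on "int"
should return a node in the situation described above, and nil once
the parser sees a complete function.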


> Next, with point after the brace, type RET -- this doesn't indent 2
> spaces, as I'd expect -- why?  Typing TAB to indent doesn't help,
> either.
>

This is because tree-sitter doesn't know what to do with it.  If you
instead type:

```
int
foo (void)
{}
```

It will know that it has a complete node and indent accordingly if you
press RET while inside the braces.

       (no-node parent-bol c-ts-mode-indent-offset)

Now this indentation should happen as you want, even though we are in an
error state syntax-wise.  At least after you do what you state just below


> I then type "int bar = 0;".  Typing RET after that doesn't indent,
> either.
>

This is for the same reason.  Adding the closing brace would fix that,
or the rule I mentioned.  Now my code is indented like this:

```
int
foo ()
{
  int bar = 0;
```

> But if I add an empty line at BOB, the fontification becomes as
> expected, and doesn't go back to font-lock-warning-face even if I then
> remove that empty line.
>

This is likely due to either treesit or tree-sitter or tree-sitter-c not
dealing properly with the root node.  Maybe Yuan has some insight here?

> Type } to close the function.  I now have this:
>
> int
> foo (void)
> {
>   int bar = 0;
> }
>
> But "int" is still in font-lock-warning-face -- why?
>

I think the best solution is just to remove this rule:

```
   :language mode
   :override t
   :feature 'error
   '((ERROR) @font-lock-warning-face)
```
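
If someone would rather keep the rule in the mode and switch it off
locally, something along these lines might work.  This is only a
sketch: the function name is hypothetical, and it assumes your build
has `treesit-font-lock-recompute-features' accepting a list of
features to remove as its second argument.

```elisp
(require 'treesit)

;; Hypothetical hook function; not part of the patch.
(defun my-c-ts-disable-error-face ()
  "Stop fontifying ERROR nodes in the current buffer."
  ;; First argument: features to add; second: features to remove.
  (treesit-font-lock-recompute-features nil '(error)))

(add-hook 'c-ts-mode-hook #'my-c-ts-disable-error-face)
```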

> Next, I type this:
>
> struct foo {
>   int bar;
> };
>
> The result is that all of the struct, except the closing brace, is in
> font-lock-warning-face -- why?  Again, adding an empty line before
> that fixes fontifications, and the fontification stays correct even
> after removing that empty line.
>
> If I type
>
> struct bar
>   {
>     int foo;
>   };
>

Same thing.  Let's just remove it.  I'll add a patch below, feel free to
install it.

> then the opening brace and "int foo;" are in font-lock-warning-face.
>
> Next, if I type M-;, I get a C++-style comment delimiter "//".  It
> sounds like this is the only style of comments supported?  More
> generally, if I compare c-basic-common-init and c-common-init from CC
> Mode with c-ts-mode, I see that the former has much more
> initializations than the latter.  So I think we should audit what CC
> Mode does here and see what else is relevant.  Alternatively, we could
> consider c-ts-mode be a minor mode of CC Mode, which only changes the
> fontification, the indentation, and the navigation parts.
>

I can take a look at that this evening and see what else I can come up
with.  I agree about the comment style.
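
Until the mode grows proper support, a user-side workaround could look
roughly like this.  The function name is hypothetical; the sketch only
sets the two standard comment variables buffer-locally and relies on
the existing `comment-start-skip' already matching "/*".

```elisp
;; Hypothetical hook function; not part of the patch.
(defun my-c-ts-use-block-comments ()
  "Make \\[comment-dwim] insert /* ... */ comments in this buffer."
  (setq-local comment-start "/* ")
  (setq-local comment-end " */"))

(add-hook 'c-ts-mode-hook #'my-c-ts-use-block-comments)
```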

> Thanks.
>
> P.S. If these problems are non-trivial, it might be best to file a bug
> report for each one.  But the last issue, the one about doing more
> stuff like CC Mode does, is something we should discuss here, I think,
> since this is basic design, and similar issues could exist for other
> modes whose *-ts-mode variants were installed on the branch.

Your issues are two-fold.  The warning face is super easy, but the
indenting of error nodes may need a change of perspective.  Tree-sitter
works best when syntax is correct, even though it handles errors pretty
well.

See the attached patch.


Theo



[-- Attachment #2: 0001-Remove-error-node-font-locking.patch --]
[-- Type: text/x-diff, Size: 1429 bytes --]

From 8a21833d36239ed61d808064faa78d19d6fc5517 Mon Sep 17 00:00:00 2001
From: Theodor Thornhill <theo@thornhill.no>
Date: Sun, 13 Nov 2022 10:39:56 +0100
Subject: [PATCH] Remove error node font locking

* lisp/progmodes/c-ts-mode.el (c-ts-mode--font-lock-settings)
(c-ts-mode--base-mode): Error node font locking causes too much noise.
---
 lisp/progmodes/c-ts-mode.el | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/lisp/progmodes/c-ts-mode.el b/lisp/progmodes/c-ts-mode.el
index 5617ea7d7c..7e7b554943 100644
--- a/lisp/progmodes/c-ts-mode.el
+++ b/lisp/progmodes/c-ts-mode.el
@@ -326,11 +326,7 @@ c-ts-mode--font-lock-settings
    :feature 'statement
    '((expression_statement (identifier) @font-lock-variable-name-face)
      (labeled_statement
-      label: (statement_identifier) @font-lock-type-face))
-   :language mode
-   :override t
-   :feature 'error
-   '((ERROR) @font-lock-warning-face)))
+      label: (statement_identifier) @font-lock-type-face))))
 
 (defun c-ts-mode--imenu-1 (node)
   "Helper for `c-ts-mode--imenu'.
@@ -424,7 +420,7 @@ c-ts-mode--base-mode
   (setq-local treesit-font-lock-feature-list
               '((comment preprocessor operator constant string literal keyword)
                 (type definition expression statement)
-                (error))))
+                ())))
 
 ;;;###autoload
 (define-derived-mode c-ts-mode c-ts-mode--base-mode "C"
-- 
2.34.1



* Re: Tree sitter support for C-like languages
  2022-11-13  9:40                   ` Theodor Thornhill
@ 2022-11-13  9:56                     ` Eli Zaretskii
  2022-11-13 10:13                       ` Theodor Thornhill
  2022-11-14  0:22                       ` Yuan Fu
  0 siblings, 2 replies; 83+ messages in thread
From: Eli Zaretskii @ 2022-11-13  9:56 UTC (permalink / raw)
  To: Theodor Thornhill; +Cc: casouri, emacs-devel, monnier

> From: Theodor Thornhill <theo@thornhill.no>
> Cc: casouri@gmail.com, emacs-devel@gnu.org, monnier@iro.umontreal.ca
> Date: Sun, 13 Nov 2022 10:40:26 +0100
> 
> > But if I add an empty line at BOB, the fontification becomes as
> > expected, and doesn't go back to font-lock-warning-face even if I then
> > remove that empty line.
> >
> 
> This is likely due to either treesit or tree-sitter or tree-sitter-c not
> dealing properly with the root node.  Maybe Yuan has some insight here?

This sounds like we don't update tree-sitter under some conditions,
IOW a bug of sorts.

> I think the best solution is just to remove the
> 
> ```
>    :language mode
>    :override t
>    :feature 'error
>    '((ERROR) @font-lock-warning-face)
> ```

What are the downsides of removing this? what will we lose?

> > Next, if I type M-;, I get a C++-style comment delimiter "//".  It
> > sounds like this is the only style of comments supported?  More
> > generally, if I compare c-basic-common-init and c-common-init from CC
> > Mode with c-ts-mode, I see that the former has much more
> > initializations than the latter.  So I think we should audit what CC
> > Mode does here and see what else is relevant.  Alternatively, we could
> > consider c-ts-mode be a minor mode of CC Mode, which only changes the
> > fontification, the indentation, and the navigation parts.
> >
> 
> I can take a look at that this evening - and see what else I can come up
> with.  I agree with the comment style

Thanks.

> Your issues are two-fold.  The warning face is super easy, but the
> indenting of error nodes may need a change of perspective.  Tree-sitter
> works best when syntax is correct, even though it handles errors pretty
> well.

The mode must do something sensible when code is incomplete, and thus
"incorrect".  At the very least the fontification and indentation
should become fixed once the code becomes complete/correct, and that
is not what happens as things are now.




* Re: Tree sitter support for C-like languages
  2022-11-13  9:56                     ` Eli Zaretskii
@ 2022-11-13 10:13                       ` Theodor Thornhill
  2022-11-13 12:55                         ` Eli Zaretskii
  2022-11-14  0:22                       ` Yuan Fu
  1 sibling, 1 reply; 83+ messages in thread
From: Theodor Thornhill @ 2022-11-13 10:13 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: casouri, emacs-devel, monnier



On 13 November 2022 10:56:13 CET, Eli Zaretskii <eliz@gnu.org> wrote:
>> From: Theodor Thornhill <theo@thornhill.no>
>> Cc: casouri@gmail.com, emacs-devel@gnu.org, monnier@iro.umontreal.ca
>> Date: Sun, 13 Nov 2022 10:40:26 +0100
>> 
>> > But if I add an empty line at BOB, the fontification becomes as
>> > expected, and doesn't go back to font-lock-warning-face even if I then
>> > remove that empty line.
>> >
>> 
>> This is likely due to either treesit or tree-sitter or tree-sitter-c not
>> dealing properly with the root node.  Maybe Yuan has some insight here?
>
>This sounds like we don't update tree-sitter under some conditions,
>IOW a bug of sorts.
>
>> I think the best solution is just to remove the
>> 
>> ```
>>    :language mode
>>    :override t
>>    :feature 'error
>>    '((ERROR) @font-lock-warning-face)
>> ```
>
>What are the downsides of removing this? what will we lose?
>

Absolutely nothing. Only the yellow color.

>> > Next, if I type M-;, I get a C++-style comment delimiter "//".  It
>> > sounds like this is the only style of comments supported?  More
>> > generally, if I compare c-basic-common-init and c-common-init from CC
>> > Mode with c-ts-mode, I see that the former has much more
>> > initializations than the latter.  So I think we should audit what CC
>> > Mode does here and see what else is relevant.  Alternatively, we could
>> > consider c-ts-mode be a minor mode of CC Mode, which only changes the
>> > fontification, the indentation, and the navigation parts.
>> >
>> 
>> I can take a look at that this evening - and see what else I can come up
>> with.  I agree with the comment style
>
>Thanks.
>
>> Your issues are two-fold.  The warning face is super easy, but the
>> indenting of error nodes may need a change of perspective.  Tree-sitter
>> works best when syntax is correct, even though it handles errors pretty
>> well.
>
>The mode must do something sensible when code is incomplete, and thus
>"incorrect".  At the very least the fontification and indentation
>should become fixed once the code becomes complete/correct, and that
>is not what happens as things are now.

Yes. By applying that patch we are 98% there. I'll tweak it tonight with auto-closing of braces disabled. I normally have it enabled, so my choices so far are based on that.

Just apply the patch in the meantime, and test some more if your time permits :-)

Thanks,
Theo




* Re: Tree sitter support for C-like languages
  2022-11-13 10:13                       ` Theodor Thornhill
@ 2022-11-13 12:55                         ` Eli Zaretskii
  2022-11-13 13:02                           ` Theodor Thornhill
  0 siblings, 1 reply; 83+ messages in thread
From: Eli Zaretskii @ 2022-11-13 12:55 UTC (permalink / raw)
  To: Theodor Thornhill; +Cc: casouri, emacs-devel, monnier

> Date: Sun, 13 Nov 2022 11:13:24 +0100
> From: Theodor Thornhill <theo@thornhill.no>
> CC: casouri@gmail.com, emacs-devel@gnu.org, monnier@iro.umontreal.ca
> 
> >> I think the best solution is just to remove the
> >> 
> >> ```
> >>    :language mode
> >>    :override t
> >>    :feature 'error
> >>    '((ERROR) @font-lock-warning-face)
> >> ```
> >
> >What are the downsides of removing this? what will we lose?
> >
> 
> Absolutely nothing. Only the yellow color.

Which yellow color?

And if we lose nothing, why was this in the code to begin with? what
was the thought behind using it?




* Re: Tree sitter support for C-like languages
  2022-11-13 12:55                         ` Eli Zaretskii
@ 2022-11-13 13:02                           ` Theodor Thornhill
  2022-11-13 13:08                             ` Eli Zaretskii
  2022-11-14  1:23                             ` Dmitry Gutov
  0 siblings, 2 replies; 83+ messages in thread
From: Theodor Thornhill @ 2022-11-13 13:02 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: casouri, emacs-devel, monnier

Eli Zaretskii <eliz@gnu.org> writes:

>> Date: Sun, 13 Nov 2022 11:13:24 +0100
>> From: Theodor Thornhill <theo@thornhill.no>
>> CC: casouri@gmail.com, emacs-devel@gnu.org, monnier@iro.umontreal.ca
>> 
>> >> I think the best solution is just to remove the
>> >> 
>> >> ```
>> >>    :language mode
>> >>    :override t
>> >>    :feature 'error
>> >>    '((ERROR) @font-lock-warning-face)
>> >> ```
>> >
>> >What are the downsides of removing this? what will we lose?
>> >
>> 
>> Absolutely nothing. Only the yellow color.
>
> Which yellow color?
>
> And if we lose nothing, why was this in the code to begin with? what
> was the thought behind using it?

The warning face.  My thinking was merely that identifying syntax errors
could be useful, but I now believe this causes more noise and confusion
rather than being a helpful tool.

So there's no need to use it.  These error-nodes can be seen with
'treesit-inspect-mode' for those interested anyways.




* Re: Tree sitter support for C-like languages
  2022-11-13  8:37       ` Theodor Thornhill
@ 2022-11-13 13:03         ` Randy Taylor
  0 siblings, 0 replies; 83+ messages in thread
From: Randy Taylor @ 2022-11-13 13:03 UTC (permalink / raw)
  To: Theodor Thornhill; +Cc: emacs-devel, casouri

On Sunday, November 13th, 2022 at 03:37, Theodor Thornhill <theo@thornhill.no> wrote:
> 
> Bring the faces :-)
>

Working on 'em! Hoping to send a patch out later today.




* Re: Tree sitter support for C-like languages
  2022-11-13 13:02                           ` Theodor Thornhill
@ 2022-11-13 13:08                             ` Eli Zaretskii
  2022-11-13 13:37                               ` Theodor Thornhill
  2022-11-14  1:23                             ` Dmitry Gutov
  1 sibling, 1 reply; 83+ messages in thread
From: Eli Zaretskii @ 2022-11-13 13:08 UTC (permalink / raw)
  To: Theodor Thornhill; +Cc: casouri, emacs-devel, monnier

> From: Theodor Thornhill <theo@thornhill.no>
> Cc: casouri@gmail.com, emacs-devel@gnu.org, monnier@iro.umontreal.ca
> Date: Sun, 13 Nov 2022 14:02:30 +0100
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> Date: Sun, 13 Nov 2022 11:13:24 +0100
> >> From: Theodor Thornhill <theo@thornhill.no>
> >> CC: casouri@gmail.com, emacs-devel@gnu.org, monnier@iro.umontreal.ca
> >> 
> >> >> I think the best solution is just to remove the
> >> >> 
> >> >> ```
> >> >>    :language mode
> >> >>    :override t
> >> >>    :feature 'error
> >> >>    '((ERROR) @font-lock-warning-face)
> >> >> ```
> >> >
> >> >What are the downsides of removing this? what will we lose?
> >> >
> >> 
> >> Absolutely nothing. Only the yellow color.
> >
> > Which yellow color?
> >
> > And if we lose nothing, why was this in the code to begin with? what
> > was the thought behind using it?
> 
> The warning face.  My thinking was merely that identifying syntax errors
> could be useful, but I now believe this causes more noise and confusion
> rather than being a helpful tool.

I'm not sure I agree.  Having incomplete code show in a distinct color
could be a nice feature, at least as an option.  The problem is that
the warning face doesn't go away when the code is completed, but I
think this is a separate and different problem.

> So there's no need to use it.  These error-nodes can be seen with
> 'treesit-inspect-mode' for those interested anyways.

The inspect mode is for debugging TS-backed modes, not for editing
program source.  So it is not relevant to what bothers me.

I think we should fix the problem with the warning face staying put,
and then revisit the usefulness of the warning face.




* Re: Tree sitter support for C-like languages
  2022-11-13 13:08                             ` Eli Zaretskii
@ 2022-11-13 13:37                               ` Theodor Thornhill
  0 siblings, 0 replies; 83+ messages in thread
From: Theodor Thornhill @ 2022-11-13 13:37 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: casouri, emacs-devel, monnier



On 13 November 2022 14:08:14 CET, Eli Zaretskii <eliz@gnu.org> wrote:
>> From: Theodor Thornhill <theo@thornhill.no>
>> Cc: casouri@gmail.com, emacs-devel@gnu.org, monnier@iro.umontreal.ca
>> Date: Sun, 13 Nov 2022 14:02:30 +0100
>> 
>> Eli Zaretskii <eliz@gnu.org> writes:
>> 
>> >> Date: Sun, 13 Nov 2022 11:13:24 +0100
>> >> From: Theodor Thornhill <theo@thornhill.no>
>> >> CC: casouri@gmail.com, emacs-devel@gnu.org, monnier@iro.umontreal.ca
>> >> 
>> >> >> I think the best solution is just to remove the
>> >> >> 
>> >> >> ```
>> >> >>    :language mode
>> >> >>    :override t
>> >> >>    :feature 'error
>> >> >>    '((ERROR) @font-lock-warning-face)
>> >> >> ```
>> >> >
>> >> >What are the downsides of removing this? what will we lose?
>> >> >
>> >> 
>> >> Absolutely nothing. Only the yellow color.
>> >
>> > Which yellow color?
>> >
>> > And if we lose nothing, why was this in the code to begin with? what
>> > was the thought behind using it?
>> 
>> The warning face.  My thinking was merely that identifying syntax errors
>> could be useful, but I now believe this causes more noise and confusion
>> rather than being a helpful tool.
>
>I'm not sure I agree.  Having incomplete code show in a distinct color
>could be a nice feature, at least as an option.  The problem is that
>the warning face doesn't go away when the code is completed, but I
>think this is a separate and different problem.
>
>> So there's no need to use it.  These error-nodes can be seen with
>> 'treesit-inspect-mode' for those interested anyways.
>
>The inspect mode is for debugging TS-backed modes, not for editing
>program source.  So it is not relevant to what bothers me.
>
>I think we should fix the problem with the warning face staying put,
>and then revisit the usefulness of the warning face.

Sure!

Yuan - do you have any pointers as to where I should look? Otherwise I'll just start reading somewhere and see if any clues appear.

Anyway - I'll also follow-up with another patch addressing some inconsistencies between the modes later today.

Theo




* Re: Tree sitter support for C-like languages
  2022-11-13  9:56                     ` Eli Zaretskii
  2022-11-13 10:13                       ` Theodor Thornhill
@ 2022-11-14  0:22                       ` Yuan Fu
  2022-11-14  1:26                         ` Dmitry Gutov
                                           ` (2 more replies)
  1 sibling, 3 replies; 83+ messages in thread
From: Yuan Fu @ 2022-11-14  0:22 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Theodor Thornhill, emacs-devel, monnier



> On Nov 13, 2022, at 1:56 AM, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: Theodor Thornhill <theo@thornhill.no>
>> Cc: casouri@gmail.com, emacs-devel@gnu.org, monnier@iro.umontreal.ca
>> Date: Sun, 13 Nov 2022 10:40:26 +0100
>> 
>>> But if I add an empty line at BOB, the fontification becomes as
>>> expected, and doesn't go back to font-lock-warning-face even if I then
>>> remove that empty line.
>>> 
>> 
>> This is likely due to either treesit or tree-sitter or tree-sitter-c not
>> dealing properly with the root node.  Maybe Yuan has some insight here?
> 
> This sounds like we don't update tree-sitter under some conditions,
> IOW a bug of sorts.

That’s just due to jit-lock. When jit-lock first fontifies

int
foo (void)
{

The parse tree has errors and it fontifies int in warning face.

Then when you insert the closing bracket, the parse tree is complete

int
foo (void)
{
 int bar = 0;
}

“int” is still in warning face because jit-lock doesn’t know it needs to be refontified. When you insert a newline at BOB, jit-lock refontifies everything after the changed region, so “int” is refontified.

So if we want the warning face to automatically disappear, we need to record these warning faces and remember to come back to refontify them later. We need to know when to refontify them, and know when to stop trying to refontify them (maybe the error isn’t transient). For now I think it’s best to just not fontify the error nodes. 
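
To make that bookkeeping a bit more concrete, here is a rough, self-contained C sketch. It is a toy model only, not treesit or jit-lock code, and every name in it (`Range`, `stale_region`, and the helpers) is made up for illustration. It assumes we keep the ranges that were last painted in warning face, obtain the current ERROR ranges after each reparse, and want one region covering every range that appears in only one of the two lists. That region is what would have to be refontified.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical half-open buffer range [beg, end). */
typedef struct { size_t beg, end; } Range;

/* Return true if R appears verbatim among the N ranges in SET. */
static bool range_in(Range r, const Range *set, size_t n) {
  for (size_t i = 0; i < n; i++)
    if (set[i].beg == r.beg && set[i].end == r.end) return true;
  return false;
}

/* Widen *OUT so that it also covers R.  *FOUND records whether
   *OUT already holds a range.  */
static void cover(Range r, Range *out, bool *found) {
  if (!*found) { *out = r; *found = true; return; }
  if (r.beg < out->beg) out->beg = r.beg;
  if (r.end > out->end) out->end = r.end;
}

/* PAINTED: ranges currently shown in warning face.
   ERRORS:  ERROR ranges in the latest parse tree.
   Return true and set *OUT to a region to refontify when the two
   lists differ.  This covers both stale warnings (painted but no
   longer an error) and new errors (not painted yet).  */
bool stale_region(const Range *painted, size_t n_painted,
                  const Range *errors, size_t n_errors, Range *out) {
  assert(out != NULL);
  bool found = false;
  for (size_t i = 0; i < n_painted; i++)
    if (!range_in(painted[i], errors, n_errors))
      cover(painted[i], out, &found);
  for (size_t i = 0; i < n_errors; i++)
    if (!range_in(errors[i], painted, n_painted))
      cover(errors[i], out, &found);
  return found;
}
```

In the example above, `painted` would hold the range of “int”, `errors` would be empty once the closing brace is typed, and the function would report the range of “int” as needing refontification. The sketch says nothing about when to stop retrying for errors that aren’t transient.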

Yuan



* Re: Tree sitter support for C-like languages
  2022-11-13 13:02                           ` Theodor Thornhill
  2022-11-13 13:08                             ` Eli Zaretskii
@ 2022-11-14  1:23                             ` Dmitry Gutov
  1 sibling, 0 replies; 83+ messages in thread
From: Dmitry Gutov @ 2022-11-14  1:23 UTC (permalink / raw)
  To: Theodor Thornhill, Eli Zaretskii; +Cc: casouri, emacs-devel, monnier

On 13.11.2022 15:02, Theodor Thornhill wrote:
> The warning face.  My thinking was merely that identifying syntax errors
> could be useful, but I now believe this causes more noise and confusion
> rather than being a helpful tool.

Indeed, my experience with js2-mode is that such an indication is useful, 
to tell about the (potential) syntax error in the code.

Especially if otherwise the user will need to launch a slow-ish 
compilation process to be notified of such error. As long as the odds of 
false positives are low.




* Re: Tree sitter support for C-like languages
  2022-11-14  0:22                       ` Yuan Fu
@ 2022-11-14  1:26                         ` Dmitry Gutov
  2022-11-14  8:35                           ` Yuan Fu
  2022-11-14  3:48                         ` Stefan Monnier
  2022-11-14 12:55                         ` Eli Zaretskii
  2 siblings, 1 reply; 83+ messages in thread
From: Dmitry Gutov @ 2022-11-14  1:26 UTC (permalink / raw)
  To: Yuan Fu, Eli Zaretskii; +Cc: Theodor Thornhill, emacs-devel, monnier

On 14.11.2022 02:22, Yuan Fu wrote:
> So if we want the warning face to automatically disappear, we need to record these warning faces and remember to come back to refontify them later. We need to know when to refontify them, and know when to stop trying to refontify them (maybe the error isn’t transient). For now I think it’s best to just not fontify the error nodes.

I'm guessing the situation could be the reverse as well: after the user 
types some chars, the warning would need to be *added* rather than 
removed, in some cases.

Any chance tree-sitter gives you some info/callbacks to convey the 
earliest node (closest to bob) which has changed after the most recent 
buffer modification? So we'd refontify starting with its beginning position.




* Re: Tree sitter support for C-like languages
  2022-11-14  0:22                       ` Yuan Fu
  2022-11-14  1:26                         ` Dmitry Gutov
@ 2022-11-14  3:48                         ` Stefan Monnier
  2022-11-14  8:23                           ` Yuan Fu
  2022-11-14 12:55                         ` Eli Zaretskii
  2 siblings, 1 reply; 83+ messages in thread
From: Stefan Monnier @ 2022-11-14  3:48 UTC (permalink / raw)
  To: Yuan Fu; +Cc: Eli Zaretskii, Theodor Thornhill, emacs-devel

> The parse tree has errors and it fontifies int in warning face.
>
> Then when you insert the closing bracket, the parse tree is complete
>
> int
> foo (void)
> {
>  int bar = 0;
> }
>
> Int is still in warning face because jit-lock doesn’t know it needs to be
> refontified.

Doesn't tree-sitter tell us that the node for `int` has changed?


        Stefan





* Re: Tree sitter support for C-like languages
  2022-11-14  3:48                         ` Stefan Monnier
@ 2022-11-14  8:23                           ` Yuan Fu
  2022-11-14 12:46                             ` Stefan Monnier
  2022-11-14 13:20                             ` Eli Zaretskii
  0 siblings, 2 replies; 83+ messages in thread
From: Yuan Fu @ 2022-11-14  8:23 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, Theodor Thornhill, emacs-devel



> On Nov 13, 2022, at 7:48 PM, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> 
>> The parse tree has errors and it fontifies int in warning face.
>> 
>> Then when you insert the closing bracket, the parse tree is complete
>> 
>> int
>> foo (void)
>> {
>> int bar = 0;
>> }
>> 
>> Int is still in warning face because jit-lock doesn’t know it needs to be
>> refontified.
> 
> Doesn't tree-sitter tell us that the node for `int` has changed?

Yes and no, but mostly no. Tree-sitter can tell if a node “has changes”. But you need to keep the node updated as the buffer changes, which we currently don’t do. Even if we add this feature, I don’t know if “has changes” includes “previously inside an ERROR node but not anymore”. IIUC “has changes” means “corresponding text edited”. I need to add this feature and experiment with it to figure out what “has changes” means exactly.
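
As a rough illustration of that reading (an assumption on my part, not something checked against tree-sitter’s implementation), here is a small self-contained C model. `ToyNode`, `ToyEdit` and `toy_node_edit` are hypothetical names, not tree-sitter types; the edit fields are only loosely modelled on the byte offsets of `TSInputEdit`, and lookahead is ignored. A node is flagged only when the edit overlaps its own text, and is merely shifted when the edit lies entirely before it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for a node's byte range and an edit. */
typedef struct { uint32_t start, end; bool has_changes; } ToyNode;
typedef struct { uint32_t start, old_end, new_end; } ToyEdit;

/* Apply EDIT to NODE under the assumed "text edited" semantics.  */
void toy_node_edit(ToyNode *node, ToyEdit edit) {
  assert(edit.start <= edit.old_end && edit.start <= edit.new_end);
  if (edit.old_end <= node->start) {
    /* Edit lies entirely before the node: shift it, do not flag it.  */
    node->start = node->start + edit.new_end - edit.old_end;
    node->end = node->end + edit.new_end - edit.old_end;
  } else if (edit.start >= node->end) {
    /* Edit lies entirely after the node: nothing to do.  */
  } else {
    /* Edit overlaps the node's text: flag it and adjust its range.  */
    node->has_changes = true;
    if (edit.start < node->start)
      node->start = edit.start;
    if (edit.old_end <= node->end)
      node->end = node->end + edit.new_end - edit.old_end;
    else
      node->end = edit.new_end;
  }
}
```

Under this model, inserting the closing brace far after “int” leaves the “int” node unflagged, and inserting a newline at BOB only shifts it. So if the reading is right, “has changes” alone would not tell us that “int” left an ERROR node.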

Keeping some nodes updated (ie, “watch” those nodes) isn’t too hard to implement, but it wouldn’t be a trivial change. I don’t know if we want to introduce non-trivial changes now.

Yuan



* Re: Tree sitter support for C-like languages
  2022-11-14  1:26                         ` Dmitry Gutov
@ 2022-11-14  8:35                           ` Yuan Fu
  2022-11-14 13:24                             ` Eli Zaretskii
  2022-11-14 19:54                             ` Dmitry Gutov
  0 siblings, 2 replies; 83+ messages in thread
From: Yuan Fu @ 2022-11-14  8:35 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Eli Zaretskii, Theodor Thornhill, emacs-devel, monnier

[-- Attachment #1: Type: text/plain, Size: 1406 bytes --]



> On Nov 13, 2022, at 5:26 PM, Dmitry Gutov <dgutov@yandex.ru> wrote:
> 
> On 14.11.2022 02:22, Yuan Fu wrote:
>> So if we want the warning face to automatically disappear, we need to record these warning faces and remember to come back to refontify them later. We need to know when to refontify them, and know when to stop trying to refontify them (maybe the error isn’t transient). For now I think it’s best to just not fontify the error nodes.
> 
> I'm guessing the situation could be the reverse as well: after the user types some chars, the warning would need to be *added* rather than removed, in some cases.

That’s a good point. But from what I see, I think it’s best not to fontify these “errors”, at least for C and C++, because a lot of things could be marked “error” in a C file, like stuff around macros. And in extreme cases the whole file is marked “error”, even though if we ignore the error everything is parsed fine. I guess tree-sitter isn’t happy about some tiny thing in that file but nevertheless can parse everything correctly. I attached that file below.

> Any chance tree-sitter gives you some info/callbacks to convey the earliest node (closest to bob) which has changed after the most recent buffer modification? So we'd refontify starting with its beginning position.

Yes and no, I explained in more detail in another message.



[-- Attachment #2: subtree.h --]
[-- Type: application/octet-stream, Size: 11753 bytes --]

#ifndef TREE_SITTER_SUBTREE_H_
#define TREE_SITTER_SUBTREE_H_

#ifdef __cplusplus
extern "C" {
#endif

#include <limits.h>
#include <stdbool.h>
#include <stdio.h>
#include "./length.h"
#include "./array.h"
#include "./error_costs.h"
#include "./host.h"
#include "tree_sitter/api.h"
#include "tree_sitter/parser.h"

#define TS_TREE_STATE_NONE USHRT_MAX
#define NULL_SUBTREE ((Subtree) {.ptr = NULL})

// The serialized state of an external scanner.
//
// Every time an external token subtree is created after a call to an
// external scanner, the scanner's `serialize` function is called to
// retrieve a serialized copy of its state. The bytes are then copied
// onto the subtree itself so that the scanner's state can later be
// restored using its `deserialize` function.
//
// Small byte arrays are stored inline, and long ones are allocated
// separately on the heap.
typedef struct {
  union {
    char *long_data;
    char short_data[24];
  };
  uint32_t length;
} ExternalScannerState;

// A compact representation of a subtree.
//
// This representation is used for small leaf nodes that are not
// errors, and were not created by an external scanner.
//
// The idea behind the layout of this struct is that the `is_inline`
// bit will fall exactly into the same location as the least significant
// bit of the pointer in `Subtree` or `MutableSubtree`, respectively.
// Because of alignment, for any valid pointer this will be 0, giving
// us the opportunity to make use of this bit to signify whether to use
// the pointer or the inline struct.
typedef struct SubtreeInlineData SubtreeInlineData;

#define SUBTREE_BITS    \
  bool visible : 1;     \
  bool named : 1;       \
  bool extra : 1;       \
  bool has_changes : 1; \
  bool is_missing : 1;  \
  bool is_keyword : 1;

#define SUBTREE_SIZE           \
  uint8_t padding_columns;     \
  uint8_t padding_rows : 4;    \
  uint8_t lookahead_bytes : 4; \
  uint8_t padding_bytes;       \
  uint8_t size_bytes;

#if TS_BIG_ENDIAN
#if TS_PTR_SIZE == 32

struct SubtreeInlineData {
  uint16_t parse_state;
  uint8_t symbol;
  SUBTREE_BITS
  bool unused : 1;
  bool is_inline : 1;
  SUBTREE_SIZE
};

#else

struct SubtreeInlineData {
  SUBTREE_SIZE
  uint16_t parse_state;
  uint8_t symbol;
  SUBTREE_BITS
  bool unused : 1;
  bool is_inline : 1;
};

#endif
#else

struct SubtreeInlineData {
  bool is_inline : 1;
  SUBTREE_BITS
  uint8_t symbol;
  uint16_t parse_state;
  SUBTREE_SIZE
};

#endif

#undef SUBTREE_BITS
#undef SUBTREE_SIZE

// A heap-allocated representation of a subtree.
//
// This representation is used for parent nodes, external tokens,
// errors, and other leaf nodes whose data is too large to fit into
// the inline representation.
typedef struct {
  volatile uint32_t ref_count;
  Length padding;
  Length size;
  uint32_t lookahead_bytes;
  uint32_t error_cost;
  uint32_t child_count;
  TSSymbol symbol;
  TSStateId parse_state;

  bool visible : 1;
  bool named : 1;
  bool extra : 1;
  bool fragile_left : 1;
  bool fragile_right : 1;
  bool has_changes : 1;
  bool has_external_tokens : 1;
  bool has_external_scanner_state_change : 1;
  bool depends_on_column: 1;
  bool is_missing : 1;
  bool is_keyword : 1;

  union {
    // Non-terminal subtrees (`child_count > 0`)
    struct {
      uint32_t visible_child_count;
      uint32_t named_child_count;
      uint32_t node_count;
      int32_t dynamic_precedence;
      uint16_t repeat_depth;
      uint16_t production_id;
      struct {
        TSSymbol symbol;
        TSStateId parse_state;
      } first_leaf;
    };

    // External terminal subtrees (`child_count == 0 && has_external_tokens`)
    ExternalScannerState external_scanner_state;

    // Error terminal subtrees (`child_count == 0 && symbol == ts_builtin_sym_error`)
    int32_t lookahead_char;
  };
} SubtreeHeapData;

// The fundamental building block of a syntax tree.
typedef union {
  SubtreeInlineData data;
  const SubtreeHeapData *ptr;
} Subtree;

// Like Subtree, but mutable.
typedef union {
  SubtreeInlineData data;
  SubtreeHeapData *ptr;
} MutableSubtree;

typedef Array(Subtree) SubtreeArray;
typedef Array(MutableSubtree) MutableSubtreeArray;

typedef struct {
  MutableSubtreeArray free_trees;
  MutableSubtreeArray tree_stack;
} SubtreePool;

void ts_external_scanner_state_init(ExternalScannerState *, const char *, unsigned);
const char *ts_external_scanner_state_data(const ExternalScannerState *);
bool ts_external_scanner_state_eq(const ExternalScannerState *a, const char *, unsigned);
void ts_external_scanner_state_delete(ExternalScannerState *self);

void ts_subtree_array_copy(SubtreeArray, SubtreeArray *);
void ts_subtree_array_clear(SubtreePool *, SubtreeArray *);
void ts_subtree_array_delete(SubtreePool *, SubtreeArray *);
void ts_subtree_array_remove_trailing_extras(SubtreeArray *, SubtreeArray *);
void ts_subtree_array_reverse(SubtreeArray *);

SubtreePool ts_subtree_pool_new(uint32_t capacity);
void ts_subtree_pool_delete(SubtreePool *);

Subtree ts_subtree_new_leaf(
  SubtreePool *, TSSymbol, Length, Length, uint32_t,
  TSStateId, bool, bool, bool, const TSLanguage *
);
Subtree ts_subtree_new_error(
  SubtreePool *, int32_t, Length, Length, uint32_t, TSStateId, const TSLanguage *
);
MutableSubtree ts_subtree_new_node(TSSymbol, SubtreeArray *, unsigned, const TSLanguage *);
Subtree ts_subtree_new_error_node(SubtreeArray *, bool, const TSLanguage *);
Subtree ts_subtree_new_missing_leaf(SubtreePool *, TSSymbol, Length, uint32_t, const TSLanguage *);
MutableSubtree ts_subtree_make_mut(SubtreePool *, Subtree);
void ts_subtree_retain(Subtree);
void ts_subtree_release(SubtreePool *, Subtree);
int ts_subtree_compare(Subtree, Subtree);
void ts_subtree_set_symbol(MutableSubtree *, TSSymbol, const TSLanguage *);
void ts_subtree_summarize(MutableSubtree, const Subtree *, uint32_t, const TSLanguage *);
void ts_subtree_summarize_children(MutableSubtree, const TSLanguage *);
void ts_subtree_balance(Subtree, SubtreePool *, const TSLanguage *);
Subtree ts_subtree_edit(Subtree, const TSInputEdit *edit, SubtreePool *);
char *ts_subtree_string(Subtree, const TSLanguage *, bool include_all);
void ts_subtree_print_dot_graph(Subtree, const TSLanguage *, FILE *);
Subtree ts_subtree_last_external_token(Subtree);
const ExternalScannerState *ts_subtree_external_scanner_state(Subtree self);
bool ts_subtree_external_scanner_state_eq(Subtree, Subtree);

#define SUBTREE_GET(self, name) (self.data.is_inline ? self.data.name : self.ptr->name)

static inline TSSymbol ts_subtree_symbol(Subtree self) { return SUBTREE_GET(self, symbol); }
static inline bool ts_subtree_visible(Subtree self) { return SUBTREE_GET(self, visible); }
static inline bool ts_subtree_named(Subtree self) { return SUBTREE_GET(self, named); }
static inline bool ts_subtree_extra(Subtree self) { return SUBTREE_GET(self, extra); }
static inline bool ts_subtree_has_changes(Subtree self) { return SUBTREE_GET(self, has_changes); }
static inline bool ts_subtree_missing(Subtree self) { return SUBTREE_GET(self, is_missing); }
static inline bool ts_subtree_is_keyword(Subtree self) { return SUBTREE_GET(self, is_keyword); }
static inline TSStateId ts_subtree_parse_state(Subtree self) { return SUBTREE_GET(self, parse_state); }
static inline uint32_t ts_subtree_lookahead_bytes(Subtree self) { return SUBTREE_GET(self, lookahead_bytes); }

#undef SUBTREE_GET

// Get the size needed to store a heap-allocated subtree with the given
// number of children.
static inline size_t ts_subtree_alloc_size(uint32_t child_count) {
  return child_count * sizeof(Subtree) + sizeof(SubtreeHeapData);
}

// Get a subtree's children, which are allocated immediately before the
// tree's own heap data.
#define ts_subtree_children(self) \
  ((self).data.is_inline ? NULL : (Subtree *)((self).ptr) - (self).ptr->child_count)

static inline void ts_subtree_set_extra(MutableSubtree *self, bool is_extra) {
  if (self->data.is_inline) {
    self->data.extra = is_extra;
  } else {
    self->ptr->extra = is_extra;
  }
}

static inline TSSymbol ts_subtree_leaf_symbol(Subtree self) {
  if (self.data.is_inline) return self.data.symbol;
  if (self.ptr->child_count == 0) return self.ptr->symbol;
  return self.ptr->first_leaf.symbol;
}

static inline TSStateId ts_subtree_leaf_parse_state(Subtree self) {
  if (self.data.is_inline) return self.data.parse_state;
  if (self.ptr->child_count == 0) return self.ptr->parse_state;
  return self.ptr->first_leaf.parse_state;
}

static inline Length ts_subtree_padding(Subtree self) {
  if (self.data.is_inline) {
    Length result = {self.data.padding_bytes, {self.data.padding_rows, self.data.padding_columns}};
    return result;
  } else {
    return self.ptr->padding;
  }
}

static inline Length ts_subtree_size(Subtree self) {
  if (self.data.is_inline) {
    Length result = {self.data.size_bytes, {0, self.data.size_bytes}};
    return result;
  } else {
    return self.ptr->size;
  }
}

static inline Length ts_subtree_total_size(Subtree self) {
  return length_add(ts_subtree_padding(self), ts_subtree_size(self));
}

static inline uint32_t ts_subtree_total_bytes(Subtree self) {
  return ts_subtree_total_size(self).bytes;
}

static inline uint32_t ts_subtree_child_count(Subtree self) {
  return self.data.is_inline ? 0 : self.ptr->child_count;
}

static inline uint32_t ts_subtree_repeat_depth(Subtree self) {
  return self.data.is_inline ? 0 : self.ptr->repeat_depth;
}

static inline uint32_t ts_subtree_node_count(Subtree self) {
  return (self.data.is_inline || self.ptr->child_count == 0) ? 1 : self.ptr->node_count;
}

static inline uint32_t ts_subtree_visible_child_count(Subtree self) {
  if (ts_subtree_child_count(self) > 0) {
    return self.ptr->visible_child_count;
  } else {
    return 0;
  }
}

static inline uint32_t ts_subtree_error_cost(Subtree self) {
  if (ts_subtree_missing(self)) {
    return ERROR_COST_PER_MISSING_TREE + ERROR_COST_PER_RECOVERY;
  } else {
    return self.data.is_inline ? 0 : self.ptr->error_cost;
  }
}

static inline int32_t ts_subtree_dynamic_precedence(Subtree self) {
  return (self.data.is_inline || self.ptr->child_count == 0) ? 0 : self.ptr->dynamic_precedence;
}

static inline uint16_t ts_subtree_production_id(Subtree self) {
  if (ts_subtree_child_count(self) > 0) {
    return self.ptr->production_id;
  } else {
    return 0;
  }
}

static inline bool ts_subtree_fragile_left(Subtree self) {
  return self.data.is_inline ? false : self.ptr->fragile_left;
}

static inline bool ts_subtree_fragile_right(Subtree self) {
  return self.data.is_inline ? false : self.ptr->fragile_right;
}

static inline bool ts_subtree_has_external_tokens(Subtree self) {
  return self.data.is_inline ? false : self.ptr->has_external_tokens;
}

static inline bool ts_subtree_has_external_scanner_state_change(Subtree self) {
  return self.data.is_inline ? false : self.ptr->has_external_scanner_state_change;
}

static inline bool ts_subtree_depends_on_column(Subtree self) {
  return self.data.is_inline ? false : self.ptr->depends_on_column;
}

static inline bool ts_subtree_is_fragile(Subtree self) {
  return self.data.is_inline ? false : (self.ptr->fragile_left || self.ptr->fragile_right);
}

static inline bool ts_subtree_is_error(Subtree self) {
  return ts_subtree_symbol(self) == ts_builtin_sym_error;
}

static inline bool ts_subtree_is_eof(Subtree self) {
  return ts_subtree_symbol(self) == ts_builtin_sym_end;
}

static inline Subtree ts_subtree_from_mut(MutableSubtree self) {
  Subtree result;
  result.data = self.data;
  return result;
}

static inline MutableSubtree ts_subtree_to_mut_unsafe(Subtree self) {
  MutableSubtree result;
  result.data = self.data;
  return result;
}

#ifdef __cplusplus
}
#endif

#endif  // TREE_SITTER_SUBTREE_H_

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Tree sitter support for C-like languages
  2022-11-14  8:23                           ` Yuan Fu
@ 2022-11-14 12:46                             ` Stefan Monnier
  2022-11-14 13:20                             ` Eli Zaretskii
  1 sibling, 0 replies; 83+ messages in thread
From: Stefan Monnier @ 2022-11-14 12:46 UTC (permalink / raw)
  To: Yuan Fu; +Cc: Eli Zaretskii, Theodor Thornhill, emacs-devel

>> On Nov 13, 2022, at 7:48 PM, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
>> 
>>> The parse tree has errors and it fontifies int in warning face.
>>> 
>>> Then when you insert the closing bracket, the parse tree is complete
>>> 
>>> int
>>> foo (void)
>>> {
>>> int bar = 0;
>>> }
>>> 
>>> Int is still in warning face because jit-lock doesn’t know it needs to be
>>> refontified.
>> 
>> Doesn't tree-sitter tell us that the node for `int` has changed?
>
> Yes and no, but mostly no. Tree-sitter can tell if a node “has changes”.

I mean: when we send to tree-sitter a new version of the buffer text
(i.e. we ask it to perform an incremental reparse), it tells us which
parts of the tree have changed, right?  If so, does this include the
part containing the "int" node in the above case?


        Stefan




^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Tree sitter support for C-like languages
  2022-11-14  0:22                       ` Yuan Fu
  2022-11-14  1:26                         ` Dmitry Gutov
  2022-11-14  3:48                         ` Stefan Monnier
@ 2022-11-14 12:55                         ` Eli Zaretskii
  2 siblings, 0 replies; 83+ messages in thread
From: Eli Zaretskii @ 2022-11-14 12:55 UTC (permalink / raw)
  To: Yuan Fu; +Cc: theo, emacs-devel, monnier

> From: Yuan Fu <casouri@gmail.com>
> Date: Sun, 13 Nov 2022 16:22:36 -0800
> Cc: Theodor Thornhill <theo@thornhill.no>,
>  emacs-devel <emacs-devel@gnu.org>,
>  monnier@iro.umontreal.ca
> 
> That’s just due to jit-lock. When jit-lock first fontifies
> 
> int
> foo (void)
> {
> 
> The parse tree has errors and it fontifies int in warning face.
> 
> Then when you insert the closing bracket, the parse tree is complete
> 
> int
> foo (void)
> {
>  int bar = 0;
> }
> 
> Int is still in warning face because jit-lock doesn’t know it needs to be refontified. When you insert a newline at BOB, jit-lock refontifies everything after the changed region, so int is refontified.
> 
> So if we want the warning face to automatically disappear, we need to record these warning faces and remember to come back to refontify them later. We need to know when to refontify them, and know when to stop trying to refontify them (maybe the error isn’t transient). For now I think it’s best to just not fontify the error nodes. 

I don't think this will fly with the users.  Leaving valid code marked
with the warning face is a misfeature, and the fact that it disappears
after a random change before the offending text makes that apparent to
everyone.  We should fix this one way or the other, or we will get
many justified bug reports.

Please research the possible ways of knowing which nodes that were in
error are no longer in error (or vice versa), and let's discuss how
best to solve this problem.  We cannot release Emacs 29 with modes
which exhibit this problematic behavior.

Thanks.



^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Tree sitter support for C-like languages
  2022-11-14  8:23                           ` Yuan Fu
  2022-11-14 12:46                             ` Stefan Monnier
@ 2022-11-14 13:20                             ` Eli Zaretskii
  2022-11-14 18:29                               ` Yuan Fu
  1 sibling, 1 reply; 83+ messages in thread
From: Eli Zaretskii @ 2022-11-14 13:20 UTC (permalink / raw)
  To: Yuan Fu; +Cc: monnier, theo, emacs-devel

> From: Yuan Fu <casouri@gmail.com>
> Date: Mon, 14 Nov 2022 00:23:20 -0800
> Cc: Eli Zaretskii <eliz@gnu.org>,
>  Theodor Thornhill <theo@thornhill.no>,
>  emacs-devel <emacs-devel@gnu.org>
> 
> >> Then when you insert the closing bracket, the parse tree is complete
> >> 
> >> int
> >> foo (void)
> >> {
> >> int bar = 0;
> >> }
> >> 
> >> Int is still in warning face because jit-lock doesn’t know it needs to be
> >> refontified.
> > 
> > Doesn't tree-sitter tell us that the node for `int` has changed?
> 
> Yes and no, but mostly no. Tree-sitter can tell if a node “has changes”. But you need to keep the node updated as the buffer changes, which we currently don’t do.

Sorry, I don't understand: if the node's text did not change, and some
other node (which did change) caused the first node to become
"not-in-error", then why do we need to update the first node?  And if
the text of the node with the error did change, then we do update the
node, don't we?  So what is the problem here, exactly?  Or maybe I
misunderstand what you mean by "update the node"?

> Even if we add this feature, I don’t know if “has changes” includes “previously inside an ERROR node but not anymore”. IIUC “has changes” means “corresponding text edited”. I need to add this feature and experiment with it to figure out what “has changes” means exactly.

Please do.  We must solve this problem.

Btw, do other IDEs that use tree-sitter have the same problem?  I
doubt that, and if I'm right, we cannot afford having this problem in
Emacs.

> Keeping some nodes updated (ie, “watch” those nodes) isn’t too hard to implement, but it wouldn’t be a trivial change. I don’t know if we want to introduce non-trivial changes now.

If there are less invasive changes which could solve this, I agree.
But if this is the only way, we have no choice, I think.  Again, it
would be good to find out how other IDEs solve this.

And don't worry too much about non-trivial changes, we have ample time
before the release of Emacs 29 to find and fix any fallout.

Thanks.



^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Tree sitter support for C-like languages
  2022-11-14  8:35                           ` Yuan Fu
@ 2022-11-14 13:24                             ` Eli Zaretskii
  2022-11-14 18:31                               ` Yuan Fu
  2022-11-14 19:54                             ` Dmitry Gutov
  1 sibling, 1 reply; 83+ messages in thread
From: Eli Zaretskii @ 2022-11-14 13:24 UTC (permalink / raw)
  To: Yuan Fu; +Cc: dgutov, theo, emacs-devel, monnier

> From: Yuan Fu <casouri@gmail.com>
> Date: Mon, 14 Nov 2022 00:35:47 -0800
> Cc: Eli Zaretskii <eliz@gnu.org>,
>  Theodor Thornhill <theo@thornhill.no>,
>  emacs-devel <emacs-devel@gnu.org>,
>  monnier@iro.umontreal.ca
> 
> That’s a good perspective. But from what I see, I think it’s best not to fontify these “errors”, at least for C and C++, because a lot of things could be marked “error” in a C file, like stuff around macros. In extreme cases the whole file is marked “error”, even though everything parses fine if we ignore the error. I guess tree-sitter isn’t happy about some tiny thing in that file but can nevertheless parse everything correctly. I attached that file below.

If these false positives happen frequently, it could be a user option;
some users will prefer false positives to false negatives.  But I
definitely think that the warning face can be a valuable feature in
some use cases, so flatly dismissing it is probably not the best
alternative.



^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Tree sitter support for C-like languages
  2022-11-14 13:20                             ` Eli Zaretskii
@ 2022-11-14 18:29                               ` Yuan Fu
  2022-11-14 18:45                                 ` Eli Zaretskii
  0 siblings, 1 reply; 83+ messages in thread
From: Yuan Fu @ 2022-11-14 18:29 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, theo, emacs-devel



> On Nov 14, 2022, at 5:20 AM, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: Yuan Fu <casouri@gmail.com>
>> Date: Mon, 14 Nov 2022 00:23:20 -0800
>> Cc: Eli Zaretskii <eliz@gnu.org>,
>> Theodor Thornhill <theo@thornhill.no>,
>> emacs-devel <emacs-devel@gnu.org>
>> 
>>>> Then when you insert the closing bracket, the parse tree is complete
>>>> 
>>>> int
>>>> foo (void)
>>>> {
>>>> int bar = 0;
>>>> }
>>>> 
>>>> Int is still in warning face because jit-lock doesn’t know it needs to be
>>>> refontified.
>>> 
>>> Doesn't tree-sitter tell us that the node for `int` has changed?
>> 
>> Yes and no, but mostly no. Tree-sitter can tell if a node “has changes”. But you need to keep the node updated as the buffer changes, which we currently don’t do.
> 
> Sorry, I don't understand: if the node's text did not change, and some
> other node (which did change) caused the first node to become
> "not-in-error", then why do we need to update the first node?  

Not specific to this node. I was saying that for any node to keep up with changes made to the buffer text, it needs to be updated with “insertion in X, deletion from X to Y”. This is required by tree-sitter’s API. For this particular node, not updating the node might be ok, depending on how tree-sitter implements things. But of course we shouldn’t rely on that.

> And if
> the text of the node with the error did change, then we do update the
> node, don't we?

Well we update the parse tree and re-parse, but we currently don’t update the nodes created from the old tree. Keeping all nodes updated requires us to track all live nodes and update them whenever the buffer is edited.

>  So what is the problem here, exactly?  Or maybe I
> misunderstand what you mean by "update the node"?



> 
>> Even if we add this feature, I don’t know if “has changes” includes “previously inside an ERROR node but not anymore”. IIUC “has changes” means “corresponding text edited”. I need to add this feature and experiment with it to figure out what “has changes” means exactly.
> 
> Please do.  We must solve this problem.
> 
> Btw, do other IDEs that use tree-sitter have the same problem?  I
> doubt that, and if I'm right, we cannot afford having this problem in
> Emacs.

I wouldn’t call this a problem. The “error” in tree-sitter is not like a complete parse failure. Let’s not highlight syntax errors for now, and see how it looks. In the meantime I’ll add the feature to track certain nodes for changes. Then if we decide this is an important feature to have, we can look at how to implement it.

I don’t think Atom highlights parse errors, and Neovim disables that by default.

> 
>> Keeping some nodes updated (ie, “watch” those nodes) isn’t too hard to implement, but it wouldn’t be a trivial change. I don’t know if we want to introduce non-trivial changes now.
> 
> If there are less invasive changes which could solve this, I agree.
> But if this is the only way, we have no choice, I think.  Again, it
> would be good to find out how other IDEs solve this.

Neovim used to highlight errors, but then disabled that by default[1]. I don’t know how Neovim fontifies text; I will ask them whether they have this problem and how they solved it.

> 
> And don't worry too much about non-trivial changes, we have ample time
> before the release of Emacs 29 to find and fix any fallout.

Cool! Will do.

[1] https://github.com/nvim-treesitter/nvim-treesitter/commit/1a42056e092bc34ba081cb924bf0b3e3cd8cdc01

Yuan


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Tree sitter support for C-like languages
  2022-11-14 13:24                             ` Eli Zaretskii
@ 2022-11-14 18:31                               ` Yuan Fu
  0 siblings, 0 replies; 83+ messages in thread
From: Yuan Fu @ 2022-11-14 18:31 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dgutov, theo, emacs-devel, monnier



> On Nov 14, 2022, at 5:24 AM, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: Yuan Fu <casouri@gmail.com>
>> Date: Mon, 14 Nov 2022 00:35:47 -0800
>> Cc: Eli Zaretskii <eliz@gnu.org>,
>> Theodor Thornhill <theo@thornhill.no>,
>> emacs-devel <emacs-devel@gnu.org>,
>> monnier@iro.umontreal.ca
>> 
>> That’s a good perspective. But from what I see, I think it’s best not to fontify these “errors”, at least for C and C++, because a lot of things could be marked “error” in a C file, like stuff around macros. In extreme cases the whole file is marked “error”, even though everything parses fine if we ignore the error. I guess tree-sitter isn’t happy about some tiny thing in that file but can nevertheless parse everything correctly. I attached that file below.
> 
> If these false positives happen frequently, it could be a user option;
> some users will prefer false positives to false negatives.  But I
> definitely think that the warning face can be a valuable feature in
> some use cases, so flatly dismissing it is probably not the best
> alternative.

Right. And I expect false positives to be rare in any language that’s not C/C++ (because of macros). Let’s see what we can do.

Yuan


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Tree sitter support for C-like languages
  2022-11-14 18:29                               ` Yuan Fu
@ 2022-11-14 18:45                                 ` Eli Zaretskii
  2022-11-14 19:51                                   ` Yuan Fu
  0 siblings, 1 reply; 83+ messages in thread
From: Eli Zaretskii @ 2022-11-14 18:45 UTC (permalink / raw)
  To: Yuan Fu; +Cc: monnier, theo, emacs-devel

> From: Yuan Fu <casouri@gmail.com>
> Date: Mon, 14 Nov 2022 10:29:56 -0800
> Cc: monnier@iro.umontreal.ca,
>  theo@thornhill.no,
>  emacs-devel@gnu.org
> 
> > Sorry, I don't understand: if the node's text did not change, and some
> > other node (which did change) caused the first node to become
> > "not-in-error", then why do we need to update the first node?  
> 
> Not specific to this node. I was saying that for any node to keep up with changes made to the buffer text, it needs to be updated with “insertion in X, deletion from X to Y”. This is required by tree-sitter’s API. For this particular node, not updating the node might be ok, depending on how tree-sitter implements things. But of course we shouldn’t rely on that.

You mean, we must report the same "insertion in X, deletion from X to
Y" change more than one time, because more than one node may depend on
that change?

> > And if
> > the text of the node with the error did change, then we do update the
> > node, don't we?
> 
> Well we update the parse tree and re-parse, but we currently don’t update the nodes created from the old tree. Keeping all nodes updated requires us to track all live nodes and update them whenever the buffer is edited.

I guess I still don't understand what exactly do you mean by "update
the node".  Can you explain that in more detail?

> > Please do.  We must solve this problem.
> > 
> > Btw, do other IDEs that use tree-sitter have the same problem?  I
> > doubt that, and if I'm right, we cannot afford having this problem in
> > Emacs.
> 
> I wouldn’t call this a problem.

It's a problem because highlighting in warning face is left on
display, although the program source is correct and should have been
highlighted differently.

> Let’s not highlight syntax errors for now, and see how it looks. In the meantime I’ll add the feature to track certain nodes for changes. Then if we decide this is an important feature to have, we can look at how to implement it.

If you are working on this issue, we can leave things as they are in
the meantime.  Having the warning face show will provide motivation
for solving this ;-)

> I don’t think Atom highlights parse errors, and Neovim disables that by default.

We could decide to disable it by default, but let's first see how to
solve the problem, or at least minimize it.

Thanks.



^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Tree sitter support for C-like languages
  2022-11-14 18:45                                 ` Eli Zaretskii
@ 2022-11-14 19:51                                   ` Yuan Fu
  2022-11-14 20:10                                     ` Eli Zaretskii
  0 siblings, 1 reply; 83+ messages in thread
From: Yuan Fu @ 2022-11-14 19:51 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, theo, emacs-devel



> On Nov 14, 2022, at 10:45 AM, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: Yuan Fu <casouri@gmail.com>
>> Date: Mon, 14 Nov 2022 10:29:56 -0800
>> Cc: monnier@iro.umontreal.ca,
>> theo@thornhill.no,
>> emacs-devel@gnu.org
>> 
>>> Sorry, I don't understand: if the node's text did not change, and some
>>> other node (which did change) caused the first node to become
>>> "not-in-error", then why do we need to update the first node?  
>> 
>> Not specific to this node. I was saying that for any node to keep up with changes made to the buffer text, it needs to be updated with “insertion in X, deletion from X to Y”. This is required by tree-sitter’s API. For this particular node, not updating the node might be ok, depending on how tree-sitter implements things. But of course we shouldn’t rely on that.
> 
> You mean, we must report the same "insertion in X, deletion from X to
> Y" change more than one time, because more than one node may depend on
> that change?
> 
>>> And if
>>> the text of the node with the error did change, then we do update the
>>> node, don't we?
>> 
>> Well we update the parse tree and re-parse, but we currently don’t update the nodes created from the old tree. Keeping all nodes updated requires us to track all live nodes and update them whenever the buffer is edited.
> 
> I guess I still don't understand what exactly do you mean by "update
> the node".  Can you explain that in more detail?

My bad. When the buffer changes (insert in X, delete from X to Y), we inform tree-sitter of this change by “updating” the tree:

  const TSInputEdit edit =
    treesit_prepare_input_edit (start_byte, old_end_byte, new_end_byte);

  ts_tree_edit (tree, &edit);

Then when we re-parse, tree-sitter knows which part of the buffer has changed and needs to be re-parsed, and only parses those, hence “incremental parsing”. 

Tree-sitter nodes need similar updates, so that they stay in sync with the buffer text.
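
As a rough illustration of the "insert in X, delete from X to Y" bookkeeping, here is a minimal sketch of how such a change maps onto the three byte offsets passed above. ByteEdit and both helper functions are hypothetical names invented for this example; the real TSInputEdit also carries row/column points, which are left out here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for tree-sitter's TSInputEdit:
   only the three byte offsets are modeled.  */
typedef struct {
  uint32_t start_byte;    /* where the change begins */
  uint32_t old_end_byte;  /* end of the replaced text, before the change */
  uint32_t new_end_byte;  /* end of the replacement text, after the change */
} ByteEdit;

/* Inserting LEN bytes at POS replaces an empty range with LEN bytes.  */
ByteEdit
byte_edit_for_insertion (uint32_t pos, uint32_t len)
{
  ByteEdit edit = { pos, pos, pos + len };
  return edit;
}

/* Deleting the bytes in [BEG, END) replaces them with nothing.  */
ByteEdit
byte_edit_for_deletion (uint32_t beg, uint32_t end)
{
  assert (beg <= end);
  ByteEdit edit = { beg, end, beg };
  return edit;
}
```

A replacement (delete, then insert at the same position) would combine the two: start_byte and old_end_byte come from the deleted range, new_end_byte from the inserted length.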

> 
>>> Please do.  We must solve this problem.
>>> 
>>> Btw, do other IDEs that use tree-sitter have the same problem?  I
>>> doubt that, and if I'm right, we cannot afford having this problem in
>>> Emacs.
>> 
>> I wouldn’t call this a problem.
> 
> It's a problem because highlighting in warning face is left on
> display, although the program source is correct and should have been
> highlighted differently.

I agree; I meant that not highlighting syntax errors isn’t a problem per se.

> 
>> Let’s not highlight syntax errors for now, and see how it looks. In the meantime I’ll add the feature to track certain nodes for changes. Then if we decide this is an important feature to have, we can look at how to implement it.
> 
> If you are working on this issue, we can leave things as they are in
> the meantime.  Having the warning face show will provide motivation
> for solving this ;-)
> 
>> I don’t think Atom highlights parse errors, and Neovim disables that by default.
> 
> We could decide to disable it by default, but let's first see how to
> solve the problem, or at least minimize it.

Yep.

Yuan


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Tree sitter support for C-like languages
  2022-11-14  8:35                           ` Yuan Fu
  2022-11-14 13:24                             ` Eli Zaretskii
@ 2022-11-14 19:54                             ` Dmitry Gutov
  2022-11-15 10:56                               ` Yuan Fu
  1 sibling, 1 reply; 83+ messages in thread
From: Dmitry Gutov @ 2022-11-14 19:54 UTC (permalink / raw)
  To: Yuan Fu; +Cc: Eli Zaretskii, Theodor Thornhill, emacs-devel, monnier

On 14.11.2022 10:35, Yuan Fu wrote:
> 
> 
>> On Nov 13, 2022, at 5:26 PM, Dmitry Gutov <dgutov@yandex.ru> wrote:
>>
>> On 14.11.2022 02:22, Yuan Fu wrote:
>>> So if we want the warning face to automatically disappear, we need to record these warning faces and remember to come back to refontify them later. We need to know when to refontify them, and know when to stop trying to refontify them (maybe the error isn’t transient). For now I think it’s best to just not fontify the error nodes.
>>
>> I'm guessing the situation could be the reverse as well: after the user typing some chars, the warning would need to be *added* rather than removed, in some cases.
> 
> That’s a good perspective. But from what I see, I think it’s best not to fontify these “errors”, at least for C and C++, because a lot of things could be marked “error” in a C file, like stuff around macros. In extreme cases the whole file is marked “error”, even though everything parses fine if we ignore the error. I guess tree-sitter isn’t happy about some tiny thing in that file but can nevertheless parse everything correctly. I attached that file below.

Perhaps not in C/C++, but other langs could use them.

Also (and here I'm really guessing, not sure what the 
limitations/benefits of TS grammars are) there might be other nodes 
which could change due to the user writing or deleting code on 
subsequent lines.

>> Any chance tree-sitter gives you some info/callbacks to convey the earliest node (closest to bob) which has changed after the most recent buffer modification? So we'd refontify starting with its beginning position.
> 
> Yes and no, I explained in more detail in another message.

If you're referring to this grandparent message:

 > jit-lock doesn’t know it needs to be refontified

...then I suppose it's a matter of letting it know somehow. I haven't 
read the TS integration code yet, so I'm not sure at which level it 
integrates with jit-lock.

But jit-lock-functions are allowed to fontify more than the passed in 
boundaries, for example.



^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Tree sitter support for C-like languages
  2022-11-14 19:51                                   ` Yuan Fu
@ 2022-11-14 20:10                                     ` Eli Zaretskii
  2022-11-14 21:57                                       ` Yuan Fu
  0 siblings, 1 reply; 83+ messages in thread
From: Eli Zaretskii @ 2022-11-14 20:10 UTC (permalink / raw)
  To: Yuan Fu; +Cc: monnier, theo, emacs-devel

> From: Yuan Fu <casouri@gmail.com>
> Date: Mon, 14 Nov 2022 11:51:59 -0800
> Cc: monnier@iro.umontreal.ca,
>  theo@thornhill.no,
>  emacs-devel@gnu.org
> 
> >> Well we update the parse tree and re-parse, but we currently don’t update the nodes created from the old tree. Keeping all nodes updated requires us to track all live nodes and update them whenever the buffer is edited.
> > 
> > I guess I still don't understand what exactly do you mean by "update
> > the node".  Can you explain that in more detail?
> 
> My bad. When the buffer changes (insert in X, delete from X to Y), we inform tree-sitter of this change by “updating” the tree:
> 
>   const TSInputEdit edit =
>     treesit_prepare_input_edit (start_byte, old_end_byte, new_end_byte);
> 
>   ts_tree_edit (tree, &edit);
> 
> Then when we re-parse, tree-sitter knows which part of the buffer has changed and needs to be re-parsed, and only parses those, hence “incremental parsing”. 
> 
> Tree-sitter nodes need similar updates, so that they stay in sync with the buffer text.

Doesn't the call to ts_tree_edit update those nodes?  That is, aren't
those nodes a part of the tree that gets updated by the ts_tree_edit
call?



^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Tree sitter support for C-like languages
  2022-11-14 20:10                                     ` Eli Zaretskii
@ 2022-11-14 21:57                                       ` Yuan Fu
  2022-11-15  3:27                                         ` Eli Zaretskii
  0 siblings, 1 reply; 83+ messages in thread
From: Yuan Fu @ 2022-11-14 21:57 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, theo, emacs-devel



> On Nov 14, 2022, at 12:10 PM, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: Yuan Fu <casouri@gmail.com>
>> Date: Mon, 14 Nov 2022 11:51:59 -0800
>> Cc: monnier@iro.umontreal.ca,
>> theo@thornhill.no,
>> emacs-devel@gnu.org
>> 
>>>> Well we update the parse tree and re-parse, but we currently don’t update the nodes created from the old tree. Keeping all nodes updated requires us to track all live nodes and update them whenever the buffer is edited.
>>> 
>>> I guess I still don't understand what exactly do you mean by "update
>>> the node".  Can you explain that in more detail?
>> 
>> My bad. When the buffer changes (insert in X, delete from X to Y), we inform tree-sitter of this change by “updating” the tree:
>> 
>>  const TSInputEdit edit =
>>    treesit_prepare_input_edit (start_byte, old_end_byte, new_end_byte);
>> 
>>  ts_tree_edit (tree, &edit);
>> 
>> Then when we re-parse, tree-sitter knows which part of the buffer has changed and needs to be re-parsed, and only parses those, hence “incremental parsing”. 
>> 
>> Tree-sitter nodes need similar updates, so that they stay in sync with the buffer text.
> 
> Doesn't the call to ts_tree_edit update those nodes?  

No.

> That is, aren't
> those nodes a part of the tree that gets updated by the ts_tree_edit
> call?

The node stores some information in itself (start_byte, end_byte, inlined data, etc), and references the tree for the rest. The information it stores needs to be updated separately.
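
To make "updated separately" concrete, here is a simplified model of what keeping a node's cached byte range in sync with an edit could look like. ByteEdit, CachedRange, shift_offset and shift_range are hypothetical names for this sketch, and applying the same rule to both ends of the range is an assumption of the example, not a description of tree-sitter's internals.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins: an edit described by three byte offsets,
   and the byte range a node has cached in itself.  */
typedef struct { uint32_t start_byte, old_end_byte, new_end_byte; } ByteEdit;
typedef struct { uint32_t start_byte, end_byte; } CachedRange;

/* Map one pre-edit byte offset to its position after EDIT.  */
uint32_t
shift_offset (uint32_t offset, ByteEdit edit)
{
  assert (edit.start_byte <= edit.old_end_byte);
  assert (edit.start_byte <= edit.new_end_byte);
  if (offset >= edit.old_end_byte)
    /* At or after the end of the replaced text: move by the size change.  */
    return edit.new_end_byte + (offset - edit.old_end_byte);
  if (offset > edit.start_byte)
    /* Inside the replaced text: clamp to the end of the replacement.  */
    return edit.new_end_byte;
  /* Before the replaced text: unchanged.  */
  return offset;
}

/* Apply the same rule to both ends of a cached range.  */
CachedRange
shift_range (CachedRange range, ByteEdit edit)
{
  CachedRange result = { shift_offset (range.start_byte, edit),
                         shift_offset (range.end_byte, edit) };
  return result;
}
```

In this model, a node for "int" cached at bytes 0 to 3 is untouched by an insertion later in the buffer, but an insertion at the beginning of the buffer moves it, which is why every live node would have to see every edit.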

Yuan




^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Tree sitter support for C-like languages
  2022-11-14 21:57                                       ` Yuan Fu
@ 2022-11-15  3:27                                         ` Eli Zaretskii
  2022-11-15 10:51                                           ` Yuan Fu
  0 siblings, 1 reply; 83+ messages in thread
From: Eli Zaretskii @ 2022-11-15  3:27 UTC (permalink / raw)
  To: Yuan Fu; +Cc: monnier, theo, emacs-devel

> From: Yuan Fu <casouri@gmail.com>
> Date: Mon, 14 Nov 2022 13:57:46 -0800
> Cc: monnier@iro.umontreal.ca,
>  theo@thornhill.no,
>  emacs-devel@gnu.org
> 
> > Doesn't the call to ts_tree_edit update those nodes?  
> 
> No.
> 
> > That is, aren't
> > those nodes a part of the tree that gets updated by the ts_tree_edit
> > call?
> 
> The node stores some information in itself (start_byte, end_byte, inlined data, etc), and references the tree for the rest. The information it stores needs to be updated separately.

Can we do that?  What are the difficulties?



^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Tree sitter support for C-like languages
  2022-11-15  3:27                                         ` Eli Zaretskii
@ 2022-11-15 10:51                                           ` Yuan Fu
  2022-11-15 11:37                                             ` Theodor Thornhill
  2022-11-15 15:03                                             ` Eli Zaretskii
  0 siblings, 2 replies; 83+ messages in thread
From: Yuan Fu @ 2022-11-15 10:51 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, theo, emacs-devel



> On Nov 14, 2022, at 7:27 PM, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: Yuan Fu <casouri@gmail.com>
>> Date: Mon, 14 Nov 2022 13:57:46 -0800
>> Cc: monnier@iro.umontreal.ca,
>> theo@thornhill.no,
>> emacs-devel@gnu.org
>> 
>>> Doesn't the call to ts_tree_edit update those nodes?  
>> 
>> No.
>> 
>>> That is, aren't
>>> those nodes a part of the tree that gets updated by the ts_tree_edit
>>> call?
>> 
>> The node stores some information in itself (start_byte, end_byte, inlined data, etc), and references the tree for the rest. The information it stores needs to be updated separately.
> 
> Can we do that?  What are the difficulties?

We can update the nodes. (With ts_node_edit, similar to ts_tree_edit.)

But I looked further, and the facility for updating a node is not really what we need/want. I won’t go into details here, because… there is a feature perfect for our use case! Tree-sitter can tell you what has changed when you re-parse a tree; that’s exactly what we need, and it is very easy to use. It was foolish of me to overlook this feature.

Specifically, when we re-parse a buffer, we can compare the before/after parse tree for differences. Tree-sitter can tell us the ranges in which nodes have changed during that re-parse. The “int” in the original example would be included in the ranges reported.
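
A minimal sketch of how such reported ranges could drive refontification: a previously fontified region is refontified if it overlaps any changed range. Tree-sitter's C API exposes the tree comparison as ts_tree_get_changed_ranges; this sketch does not call it, and ByteRange, ranges_overlap and needs_refontify are hypothetical names that model only the byte offsets of the reported ranges.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the byte fields of a reported changed range.  */
typedef struct { uint32_t start_byte, end_byte; } ByteRange;

/* Return true if the half-open ranges A and B share at least one byte.  */
bool
ranges_overlap (ByteRange a, ByteRange b)
{
  return a.start_byte < b.end_byte && b.start_byte < a.end_byte;
}

/* Return true if REGION overlaps any of the COUNT ranges in CHANGED,
   i.e. its fontification may be stale after the re-parse.  */
bool
needs_refontify (ByteRange region, const ByteRange *changed, size_t count)
{
  for (size_t i = 0; i < count; i++)
    {
      assert (changed[i].start_byte <= changed[i].end_byte);
      if (ranges_overlap (region, changed[i]))
        return true;
    }
  return false;
}
```

In the original example, closing the brace would yield a changed range that covers the whole function, so the region holding "int" (bytes 0 to 3) would be flagged even though its own text was never edited.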

I’ve pushed a change that utilizes this feature. If you pull the latest commit and open c-ts-mode, error faces should appear and disappear as you type. There is no documentation for now, but basically we now allow users to register “after-change-function”s to tree-sitter parsers. The parser will call these functions with the changed ranges when it re-parses.

The new functions are treesit-parser-add-notifier, treesit-parser-notifiers, treesit-parser-remove-notifier. I didn’t use treesit-parser-add-after-change-function because that is hideously long.
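
For illustration, here is a minimal sketch of how a notifier might be registered. It assumes the notifier is called with two arguments, the list of changed ranges as (START . END) pairs and the parser, and that it has to be a named function rather than a lambda. The names my/log-changed-ranges and my/watch-c-parser are made up for this example.

```elisp
;; Hypothetical notifier: just report what the re-parse changed.
(defun my/log-changed-ranges (ranges _parser)
  "Report RANGES, a list of (START . END) pairs changed by a re-parse."
  (dolist (range ranges)
    (message "Re-parse changed %d..%d" (car range) (cdr range))))

;; Hypothetical hook function: attach the notifier to the buffer's C parser.
(defun my/watch-c-parser ()
  "Attach `my/log-changed-ranges' to the C parser of the current buffer."
  (treesit-parser-add-notifier
   (treesit-parser-create 'c) #'my/log-changed-ranges))

(add-hook 'c-ts-mode-hook #'my/watch-c-parser)
```

The notifier can be detached again with treesit-parser-remove-notifier, passing the same parser and function symbol.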

Yuan


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Tree sitter support for C-like languages
  2022-11-14 19:54                             ` Dmitry Gutov
@ 2022-11-15 10:56                               ` Yuan Fu
  2022-11-15 12:30                                 ` Dmitry Gutov
  0 siblings, 1 reply; 83+ messages in thread
From: Yuan Fu @ 2022-11-15 10:56 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Eli Zaretskii, Theodor Thornhill, emacs-devel, monnier



> On Nov 14, 2022, at 11:54 AM, Dmitry Gutov <dgutov@yandex.ru> wrote:
> 
> On 14.11.2022 10:35, Yuan Fu wrote:
>>> On Nov 13, 2022, at 5:26 PM, Dmitry Gutov <dgutov@yandex.ru> wrote:
>>> 
>>> On 14.11.2022 02:22, Yuan Fu wrote:
>>>> So if we want the warning face to automatically disappear, we need to record these warning faces and remember to come back to refontify them later. We need to know when to refontify them, and know when to stop trying to refontify them (maybe the error isn’t transient). For now I think it’s best to just not fontify the error nodes.
>>> 
>>> I'm guessing the situation could be the reverse as well: after the user typing some chars, the warning would need to be *added* rather than removed, in some cases.
>> That’s a good perspective. But from what I see I think it’s best not to fontify these “errors”, at least for C and C++. Because a lot of things could be marked “error” in a C file, like stuff around macros. And in extreme cases the whole file is marked “error”, even though if we ignore the error everything is parsed fine. I guess tree-sitter isn’t happy about some tiny thing in that file but nevertheless can parse everything correctly. I attached that file below.
> 
> Perhaps not in C/C++, but other langs could use them.
> 
> Also (and here I'm really guessing, not sure what the limitations/benefits of TS grammars are) there might be other nodes which could change due to the user writing or deleting code on subsequent lines.

Yes, this is common for comments and multi-line strings. And we can correctly handle them now, yay (see my other message.)

> 
>>> Any chance tree-sitter gives you some info/callbacks to convey the earliest node (closest to bob) which has changed after the most recent buffer modification? So we'd refontify starting with its beginning position.
>> Yes and no, I explained in more detail in another message.
> 
> If you're referring to this grandparent message:
> 
> > jit-lock doesn’t know it needs to be refontified
> 
> ...then I suppose it's a matter of letting it know somehow. I haven't read the TS integration code yet, so I'm not sure at which level it integrates with jit-lock.

It plugs into font-lock, but it also secretly knows about jit-lock… We now can tell what parts of the buffer are affected/changed, and we set fontified to nil in those areas so redisplay will refontify them.
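
Roughly, the idea can be sketched like this. This is not the actual code in treesit.el, only an illustration under the assumption that a notifier receives the changed ranges as (START . END) pairs plus the parser; my/mark-unfontified and my/refontify-notifier are hypothetical names.

```elisp
;; Hypothetical helper: works on the current buffer.
(defun my/mark-unfontified (ranges)
  "Clear the `fontified' property over each (START . END) pair in RANGES."
  (with-silent-modifications
    (dolist (range ranges)
      (put-text-property (car range) (cdr range) 'fontified nil))))

;; Hypothetical notifier: do the above in the buffer the parser belongs to.
(defun my/refontify-notifier (ranges parser)
  "Mark RANGES in PARSER's buffer so that redisplay refontifies them."
  (with-current-buffer (treesit-parser-buffer parser)
    (my/mark-unfontified ranges)))
```

Setting fontified to nil is what makes jit-lock consider the text again the next time it is displayed; with-silent-modifications keeps the buffer from being marked as modified by the property change.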

> 
> But jit-lock-functions are allowed to fontify more than the passed in boundaries, for example.


Yuan




^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Tree sitter support for C-like languages
  2022-11-15 10:51                                           ` Yuan Fu
@ 2022-11-15 11:37                                             ` Theodor Thornhill
  2022-11-15 15:03                                             ` Eli Zaretskii
  1 sibling, 0 replies; 83+ messages in thread
From: Theodor Thornhill @ 2022-11-15 11:37 UTC (permalink / raw)
  To: Yuan Fu, Eli Zaretskii; +Cc: monnier, emacs-devel

>
> We can update the nodes. (With ts_node_edit, similar to ts_tree_edit.)
>
> But I looked further, and the facility for updating a node is not
> really what we need/want. I won’t go into details here, because… there
> is a feature perfect for our use case! Tree-sitter can tell you what
> has changed when you re-parse a tree; that’s exactly what we need, and
> it is very easy to use. It was foolish of me to overlook this feature.
>
> Specifically, when we re-parse a buffer, we can compare the
> before/after parse tree for differences. Tree-sitter can tell us the
> ranges in which nodes have changed during that re-parse. The “int” in
> the original example would be included in the ranges reported.
>
> I’ve pushed a change that utilizes this feature. If you pull the
> latest commit and open c-ts-mode, error faces should appear and
> disappear as you type. There is no documentation for now, but
> basically we now allow users to register “after-change-function”s to
> tree-sitter parsers. The parser will call these functions with
> the changed ranges when it re-parses.
>
> The new functions are treesit-parser-add-notifier,
> treesit-parser-notifiers, treesit-parser-remove-notifier. I didn’t use
> treesit-parser-add-after-change-function because that is hideously
> long.

It works like a charm!  Great job :-)

-- 
Theo



^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Tree sitter support for C-like languages
  2022-11-15 10:56                               ` Yuan Fu
@ 2022-11-15 12:30                                 ` Dmitry Gutov
  0 siblings, 0 replies; 83+ messages in thread
From: Dmitry Gutov @ 2022-11-15 12:30 UTC (permalink / raw)
  To: Yuan Fu; +Cc: Eli Zaretskii, Theodor Thornhill, emacs-devel, monnier

On 15.11.2022 12:56, Yuan Fu wrote:
>> Perhaps not in C/C++, but other langs could use them.
>>
>> Also (and here I'm really guessing, not sure what the limitations/benefits of TS grammars are) there might be other nodes which could change due to the user writing or deleting code on subsequent lines.
> Yes, this is common for comments and multi-line strings. And we can correctly handle them now, yay (see my other message.)
> 

Yay, thanks!



^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Tree sitter support for C-like languages
  2022-11-15 10:51                                           ` Yuan Fu
  2022-11-15 11:37                                             ` Theodor Thornhill
@ 2022-11-15 15:03                                             ` Eli Zaretskii
  2022-11-15 16:01                                               ` Stefan Monnier
  1 sibling, 1 reply; 83+ messages in thread
From: Eli Zaretskii @ 2022-11-15 15:03 UTC (permalink / raw)
  To: Yuan Fu; +Cc: monnier, theo, emacs-devel

> From: Yuan Fu <casouri@gmail.com>
> Date: Tue, 15 Nov 2022 02:51:31 -0800
> Cc: monnier@iro.umontreal.ca,
>  theo@thornhill.no,
>  emacs-devel@gnu.org
> 
> But I looked further, and the facility for updating a node is not really what we need/want. I won’t go into details here, because… there is a feature perfect for our use case! Tree-sitter can tell you what has changed when you re-parse a tree; that’s exactly what we need, and it is very easy to use. It was foolish of me to overlook this feature.
> 
> Specifically, when we re-parse a buffer, we can compare the before/after parse tree for differences. Tree-sitter can tell us the ranges in which nodes have changed during that re-parse. The “int” in the original example would be included in the ranges reported.
> 
> I’ve pushed a change that utilizes this feature. If you pull the latest commit and open c-ts-mode, error faces should appear and disappear as you type. There is no documentation for now, but basically we now allow users to register “after-change-function”s to tree-sitter parsers. The parser will call these functions with the changed ranges when it re-parses.

Thanks, this works very well.  I think we can consider this issue
resolved.

Btw, I now see that some parts of our sources are displayed in warning
face, probably because Tree-sitter cannot cope with our macro usage.
For example:

  DEFUN ("set-buffer-redisplay", Fset_buffer_redisplay,
	 Sset_buffer_redisplay, 4, 4, 0,
	 doc: /* Mark the current buffer for redisplay.
  This function may be passed to `add-variable-watcher'.  */)
    (Lisp_Object symbol, Lisp_Object newval, Lisp_Object op, Lisp_Object where)
  {

This shows all the arguments in warning face.

  static void ATTRIBUTE_FORMAT_PRINTF (1, 2)
  redisplay_trace (char const *fmt, ...)
  {

This shows "1, 2" and "redisplay_trace" in warning face.

  extern bool trace_move EXTERNALLY_VISIBLE;

This shows "trace_move" in the warning face.

  #ifdef HAVE_WINDOW_SYSTEM
	if (part == ON_LEFT_MARGIN || part == ON_RIGHT_MARGIN)
	  {
	    cursor = FRAME_OUTPUT_DATA (f)->nontext_cursor;
	    /* Show non-text cursor (Bug#16647).  */
	    goto set_cursor;
	  }
	else
  #endif
	  return;

This shows "else" in the warning face.

I guess we need to report these to the developers of the Tree-sitter's
C parser?  Is there anything else we could do until they fix the
parser?



^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Tree sitter support for C-like languages
  2022-11-15 15:03                                             ` Eli Zaretskii
@ 2022-11-15 16:01                                               ` Stefan Monnier
  2022-11-15 16:59                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 83+ messages in thread
From: Stefan Monnier @ 2022-11-15 16:01 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Yuan Fu, theo, emacs-devel

> I guess we need to report these to the developers of the Tree-sitter's
> C parser?  Is there anything else we could do until they fix
> the parser?

AFAIK the tree-sitter parser parses basically already-preprocessed C.
It's wickedly hard to parse meaningfully not-yet-preprocessed C with
something based on a BNF grammar.
So my guess is that this is going to be a "wont fix".

There might be a way to handle it (basically by using the CPP
definitions to extend the BNF grammar on the fly), but it seems
difficult to include it into a parser based on compiling the BNF to
a state machine :-(


        Stefan




^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Tree sitter support for C-like languages
  2022-11-15 16:01                                               ` Stefan Monnier
@ 2022-11-15 16:59                                                 ` Eli Zaretskii
  2022-11-15 18:18                                                   ` Yuan Fu
  2022-11-15 18:27                                                   ` Visuwesh
  0 siblings, 2 replies; 83+ messages in thread
From: Eli Zaretskii @ 2022-11-15 16:59 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: casouri, theo, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Yuan Fu <casouri@gmail.com>,  theo@thornhill.no,  emacs-devel@gnu.org
> Date: Tue, 15 Nov 2022 11:01:24 -0500
> 
> > I guess we need to report these to the developers of the Tree-sitter's
> > C parser?  Is there anything else we could do until they fix
> > the parser?
> 
> AFAIK the tree-sitter parser parses basically already-preprocessed C.
> It's wickedly hard to parse meaningfully not-yet-preprocessed C with
> something based on a BNF grammar.

There are a lot of macros in our code that tree-sitter based C mode
gets right, so I'm not sure this is accurate.

> So my guess is that this is going to be a "wont fix".

Maybe we should grow some augmentations for tree-sitter, at least
given enough time.  Or maybe it's possible to identify the parts where
this happens by some tree-sitter indications, and tweak the faces in
those regions in some way.
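
As a rough sketch of the second idea, and assuming that the parser marks the regions it could not make sense of with ERROR nodes that a query can match, something like the following could collect those regions. The function name my/c-error-ranges is hypothetical, and it presumes the current buffer has a C parser.

```elisp
;; Hypothetical helper: find the regions the C parser flagged as errors.
(defun my/c-error-ranges ()
  "Return a list of (START . END) pairs for ERROR nodes in the current buffer."
  (mapcar (lambda (capture)
            ;; Each capture is a (NAME . NODE) pair.
            (let ((node (cdr capture)))
              (cons (treesit-node-start node) (treesit-node-end node))))
          (treesit-query-capture (treesit-buffer-root-node 'c)
                                 '((ERROR) @err))))
```

What to do with the resulting ranges (a different face, or no face at all) is a separate decision.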



^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Tree sitter support for C-like languages
  2022-11-15 16:59                                                 ` Eli Zaretskii
@ 2022-11-15 18:18                                                   ` Yuan Fu
  2022-11-15 18:38                                                     ` Eli Zaretskii
  2022-11-15 18:27                                                   ` Visuwesh
  1 sibling, 1 reply; 83+ messages in thread
From: Yuan Fu @ 2022-11-15 18:18 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Stefan Monnier, theo, emacs-devel



> On Nov 15, 2022, at 8:59 AM, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: Stefan Monnier <monnier@iro.umontreal.ca>
>> Cc: Yuan Fu <casouri@gmail.com>,  theo@thornhill.no,  emacs-devel@gnu.org
>> Date: Tue, 15 Nov 2022 11:01:24 -0500
>> 
>>> I guess we need to report these to the developers of the Tree-sitter's
>>> C parser?  Is there anything else we could do until they fix
>>> the parser?
>> 
>> AFAIK the tree-sitter parser parses basically already-preprocessed C.
>> It's wickedly hard to parse meaningfully not-yet-preprocessed C with
>> something based on a BNF grammar.
> 
> There are a lot of macros in our code that tree-sitter based C mode
> gets right, so I'm not sure this is accurate.

My guess is that it is some heuristics plus error recovery.

> 
>> So my guess is that this is going to be a "wont fix".
> 
> Maybe we should grow some augmentations for tree-sitter, at least
> given enough time.  Or maybe it's possible to identify the parts where
> this happens by some tree-sitter indications, and tweak the faces in
> those regions in some way.

I don’t know how you could improve this in tree-sitter, since macros are literally “define your own syntax”. We can reasonably fix the highlighting for some of our macros like DEFUN. As a demonstration I added some emacs-devel-specific rules (that are disabled by default). Run this:

(add-hook 'c-ts-mode-hook
          (lambda ()
            (treesit-font-lock-recompute-features '(emacs-devel))))

And restart c-ts-mode, and DEFUN’s should look normal now.

Yuan




^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Tree sitter support for C-like languages
  2022-11-15 16:59                                                 ` Eli Zaretskii
  2022-11-15 18:18                                                   ` Yuan Fu
@ 2022-11-15 18:27                                                   ` Visuwesh
  2022-11-15 18:36                                                     ` Yuan Fu
  1 sibling, 1 reply; 83+ messages in thread
From: Visuwesh @ 2022-11-15 18:27 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Stefan Monnier, casouri, theo, emacs-devel

[செவ்வாய் நவம்பர் 15, 2022] Eli Zaretskii wrote:

>> From: Stefan Monnier <monnier@iro.umontreal.ca>
>> Cc: Yuan Fu <casouri@gmail.com>,  theo@thornhill.no,  emacs-devel@gnu.org
>> Date: Tue, 15 Nov 2022 11:01:24 -0500
>> 
>> > I guess we need to report these to the developers of the Tree-sitter's
>> > C parser?  Is there anything else we could do until they fix
>> > the parser?
>> 
>> AFAIK the tree-sitter parser parses basically already-preprocessed C.
>> It's wickedly hard to parse meaningfully not-yet-preprocessed C with
>> something based on a BNF grammar.
>
> There are a lot of macros in our code that tree-sitter based C mode
> gets right, so I'm not sure this is accurate.
>
>> So my guess is that this is going to be a "wont fix".
>
> Maybe we should grow some augmentations for tree-sitter, at least
> given enough time.  Or maybe it's possible to identify the parts where
> this happens by some tree-sitter indications, and tweak the faces in
> those regions in some way.

I'm not sure how similar the emacs-tree-sitter
(https://github.com/ubolonton/emacs-tree-sitter) and Yuan's code are but
in his EmacsConf 2020 talk, Tuấn-Anh Nguyễn wrote some custom
tree-sitter query (?) to correctly parse our macros and highlighted the
type, the function name, etc. with the appropriate faces.
You can find his talk here: https://emacsconf.org/2020/talks/23 and his
demonstration of the Emacs source code is around the 20 minute mark
(after a quick search in the subtitles).



^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Tree sitter support for C-like languages
  2022-11-15 18:27                                                   ` Visuwesh
@ 2022-11-15 18:36                                                     ` Yuan Fu
  0 siblings, 0 replies; 83+ messages in thread
From: Yuan Fu @ 2022-11-15 18:36 UTC (permalink / raw)
  To: Visuwesh
  Cc: Eli Zaretskii, Stefan Monnier, Theodor Thornhill, emacs-devel,
	Tuấn-Anh Nguyễn



> On Nov 15, 2022, at 10:27 AM, Visuwesh <visuweshm@gmail.com> wrote:
> 
> [செவ்வாய் நவம்பர் 15, 2022] Eli Zaretskii wrote:
> 
>>> From: Stefan Monnier <monnier@iro.umontreal.ca>
>>> Cc: Yuan Fu <casouri@gmail.com>,  theo@thornhill.no,  emacs-devel@gnu.org
>>> Date: Tue, 15 Nov 2022 11:01:24 -0500
>>> 
>>>> I guess we need to report these to the developers of the Tree-sitter's
>>>> C parser?  Is there anything else we could do until they fix
>>>> the parser?
>>> 
>>> AFAIK the tree-sitter parser parses basically already-preprocessed C.
>>> It's wickedly hard to parse meaningfully not-yet-preprocessed C with
>>> something based on a BNF grammar.
>> 
>> There are a lot of macros in our code that tree-sitter based C mode
>> gets right, so I'm not sure this is accurate.
>> 
>>> So my guess is that this is going to be a "wont fix".
>> 
>> Maybe we should grow some augmentations for tree-sitter, at least
>> given enough time.  Or maybe it's possible to identify the parts where
>> this happens by some tree-sitter indications, and tweak the faces in
>> those regions in some way.
> 
> I'm not sure how similar the emacs-tree-sitter
> (https://github.com/ubolonton/emacs-tree-sitter) and Yuan's code are but
> in his EmacsConf 2020 talk, Tuấn-Anh Nguyễn wrote some custom
> tree-sitter query (?) to correctly parse our macros and highlighted the
> type, the function name, etc. with the appropriate faces.
> You can find his talk here: https://emacsconf.org/2020/talks/23 and his
> demonstration of the Emacs source code is around the 20 minute mark
> (after a quick search in the subtitles).

Oh nice, I’ll check that out. CCing him for some insights :-)

Yuan




^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Tree sitter support for C-like languages
  2022-11-15 18:18                                                   ` Yuan Fu
@ 2022-11-15 18:38                                                     ` Eli Zaretskii
  2022-11-16  7:58                                                       ` Yuan Fu
  0 siblings, 1 reply; 83+ messages in thread
From: Eli Zaretskii @ 2022-11-15 18:38 UTC (permalink / raw)
  To: Yuan Fu; +Cc: monnier, theo, emacs-devel

> From: Yuan Fu <casouri@gmail.com>
> Date: Tue, 15 Nov 2022 10:18:21 -0800
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>,
>  theo@thornhill.no,
>  emacs-devel@gnu.org
> 
> > Maybe we should grow some augmentations for tree-sitter, at least
> > given enough time.  Or maybe it's possible to identify the parts where
> > this happens by some tree-sitter indications, and tweak the faces in
> > those regions in some way.
> 
> I don’t know how you could improve this in tree-sitter, since macros are literally “define your own syntax”. We can reasonably fix the highlighting for some of our macros like DEFUN. As a demonstration I added some emacs-devel-specific rules (that are disabled by default). Run this:
> 
> (add-hook 'c-ts-mode-hook
>           (lambda ()
>             (treesit-font-lock-recompute-features '(emacs-devel))))
> 
> And restart c-ts-mode, and DEFUN’s should look normal now.

I will try that when I have time, thanks.  But if this indeed works
well, why not do something similar to fix more warnings?  And why not
make this the default, instead of asking users to write mode hooks?



^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Tree sitter support for C-like languages
  2022-11-15 18:38                                                     ` Eli Zaretskii
@ 2022-11-16  7:58                                                       ` Yuan Fu
  2022-11-16 13:16                                                         ` Eli Zaretskii
  0 siblings, 1 reply; 83+ messages in thread
From: Yuan Fu @ 2022-11-16  7:58 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, theo, emacs-devel



> On Nov 15, 2022, at 10:38 AM, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: Yuan Fu <casouri@gmail.com>
>> Date: Tue, 15 Nov 2022 10:18:21 -0800
>> Cc: Stefan Monnier <monnier@iro.umontreal.ca>,
>> theo@thornhill.no,
>> emacs-devel@gnu.org
>> 
>>> Maybe we should grow some augmentations for tree-sitter, at least
>>> given enough time.  Or maybe it's possible to identify the parts where
>>> this happens by some tree-sitter indications, and tweak the faces in
>>> those regions in some way.
>> 
>> I don’t know how you could improve this in tree-sitter, since macros are literally “define your own syntax”. We can reasonably fix the highlighting for some of our macros like DEFUN. As a demonstration I added some emacs-devel-specific rules (that are disabled by default). Run this:
>> 
>> (add-hook 'c-ts-mode-hook
>>          (lambda ()
>>            (treesit-font-lock-recompute-features '(emacs-devel))))
>> 
>> And restart c-ts-mode, and DEFUN’s should look normal now.
> 
> I will try that when I have time, thanks.  But if this indeed works
> well, why not do something similar to fix more warnings?  

Truth be told, I don’t know how to fix errors around preprocessor directives (#ifdef’s). Our macros shouldn’t be hard to fix; we can fix them as we encounter them.

> And why not
> make this the default, instead of asking users to write mode hooks?

Because this is Emacs-specific? It specifically fixes highlighting for the DEFUN macro, nothing else. We can add some dir-local config that automatically enables this.
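
Something along these lines might do, as an untested sketch of a .dir-locals.el at the top of the source tree. It assumes the feature keeps the name emacs-devel and that the eval form runs after c-ts-mode has set up its font-lock settings; Emacs will also ask for confirmation before running an eval form it does not consider safe.

```elisp
;; Hypothetical .dir-locals.el entry: enable the Emacs-specific
;; font-lock rules only for C files under this directory.
((c-ts-mode
  . ((eval . (treesit-font-lock-recompute-features '(emacs-devel))))))
```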

Yuan


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Tree sitter support for C-like languages
  2022-11-16  7:58                                                       ` Yuan Fu
@ 2022-11-16 13:16                                                         ` Eli Zaretskii
  2022-11-16 13:29                                                           ` Po Lu
  0 siblings, 1 reply; 83+ messages in thread
From: Eli Zaretskii @ 2022-11-16 13:16 UTC (permalink / raw)
  To: Yuan Fu; +Cc: monnier, theo, emacs-devel

> From: Yuan Fu <casouri@gmail.com>
> Date: Tue, 15 Nov 2022 23:58:57 -0800
> Cc: monnier@iro.umontreal.ca,
>  theo@thornhill.no,
>  emacs-devel@gnu.org
> 
> > And why not
> > make this the default, instead of asking users to write mode hooks?
> 
> Because this is Emacs-specific?

DEFUN is Emacs-specific, but the other examples I've shown aren't, I
see them in many other programs.



^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Tree sitter support for C-like languages
  2022-11-16 13:16                                                         ` Eli Zaretskii
@ 2022-11-16 13:29                                                           ` Po Lu
  2022-11-16 17:29                                                             ` Yuan Fu
  0 siblings, 1 reply; 83+ messages in thread
From: Po Lu @ 2022-11-16 13:29 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Yuan Fu, monnier, theo, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> DEFUN is Emacs-specific, but the other examples I've shown aren't, I
> see them in many other programs.

Right.  What about "common" macros, such as ones that involve type
names?

For example, here is some sample GLib code, and GLib is a library used
widely throughout GNOME and freedesktop.org:

#ifdef HAVE_GTK3
static void emacs_menu_bar_get_preferred_width (GtkWidget *, gint *, gint *);
static GType emacs_menu_bar_get_type (void);

typedef struct _EmacsMenuBar
{
  GtkMenuBar parent;
} EmacsMenuBar;

typedef struct _EmacsMenuBarClass
{
  GtkMenuBarClass parent;
} EmacsMenuBarClass;

G_DEFINE_TYPE (EmacsMenuBar, emacs_menu_bar, GTK_TYPE_MENU_BAR)
#endif

Note how the macro has no trailing semicolon.  That is required for
it to work correctly (I don't know why.)



^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Tree sitter support for C-like languages
  2022-11-16 13:29                                                           ` Po Lu
@ 2022-11-16 17:29                                                             ` Yuan Fu
  0 siblings, 0 replies; 83+ messages in thread
From: Yuan Fu @ 2022-11-16 17:29 UTC (permalink / raw)
  To: Po Lu; +Cc: Eli Zaretskii, Stefan Monnier, Theodor Thornhill, emacs-devel



> On Nov 16, 2022, at 5:29 AM, Po Lu <luangruo@yahoo.com> wrote:
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
>> DEFUN is Emacs-specific, but the other examples I've shown aren't, I
>> see them in many other programs.
> 
> Right.  What about "common" macros, such as ones that involve type
> names?
> 
> For example, here is some sample GLib code, and GLib is a library used
> widely throughout GNOME and freedesktop.org:
> 
> #ifdef HAVE_GTK3
> static void emacs_menu_bar_get_preferred_width (GtkWidget *, gint *, gint *);
> static GType emacs_menu_bar_get_type (void);
> 
> typedef struct _EmacsMenuBar
> {
>  GtkMenuBar parent;
> } EmacsMenuBar;
> 
> typedef struct _EmacsMenuBarClass
> {
>  GtkMenuBarClass parent;
> } EmacsMenuBarClass;
> 
> G_DEFINE_TYPE (EmacsMenuBar, emacs_menu_bar, GTK_TYPE_MENU_BAR)
> #endif
> 
> Note how the macro has no trailing semicolon.  That is required for
> it to work correctly (I don't know why.)

They are fine; a missing semicolon is no big deal. DEFUN, on the other hand, is quite far from normal C syntax.

Yuan




^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Tree sitter support for C-like languages
  2022-11-10 17:45 Tree sitter support for C-like languages Theodor Thornhill via Emacs development discussions.
                   ` (2 preceding siblings ...)
  2022-11-11  0:43 ` Randy Taylor
@ 2022-11-16 17:51 ` Yuan Fu
  2022-11-16 20:02   ` Theodor Thornhill
  3 siblings, 1 reply; 83+ messages in thread
From: Yuan Fu @ 2022-11-16 17:51 UTC (permalink / raw)
  To: Theodor Thornhill; +Cc: emacs-devel

I noticed that the new fontification is much busier than cc-mode, so here are some of my thoughts:


   :language mode
   :override t
   :feature 'expression
   '((assignment_expression
      left: (identifier) @font-lock-variable-name-face)

I think assignment should be split out into an “assignment” feature, where we highlight the LHS target of the assignment: the identifier, the field, etc. See, for example, the assignment feature for Python [1].
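
For C, a rough and untested sketch of such a feature could look like the following. The node and field names are the ones already used in the rules quoted here; the function name my/c-assignment-rules is hypothetical, and the second pattern assumes that the left side of an assignment_expression can be a field_expression.

```elisp
;; Hypothetical: rules for a separate `assignment' feature in C.
(defun my/c-assignment-rules ()
  "Return font-lock rules highlighting the target of a C assignment."
  (treesit-font-lock-rules
   :language 'c
   :override t
   :feature 'assignment
   '(;; x = ...
     (assignment_expression
      left: (identifier) @font-lock-variable-name-face)
     ;; s.x = ... and p->x = ...: highlight only the field name.
     (assignment_expression
      left: (field_expression
             field: (field_identifier) @font-lock-variable-name-face)))))
```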


     (call_expression
      function: (identifier) @font-lock-function-name-face)

     (field_expression
      field: (field_identifier) @font-lock-variable-name-face)

     (field_expression
      argument: (identifier) @font-lock-variable-name-face
      field: (field_identifier) @font-lock-variable-name-face)

     (pointer_expression
      argument: (identifier) @font-lock-variable-name-face))

They highlight every single use of functions and fields, so they should be level 3. (And I’ll disable them personally :-) Highlighting the field and the functions should be two different features IMO.
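
To make the split concrete, here is a sketch with the two kinds of use sites under separate features. The feature names function and property are placeholders, and my/c-use-site-rules is a hypothetical name.

```elisp
;; Hypothetical: call sites and field accesses as two separate features.
(defun my/c-use-site-rules ()
  "Return font-lock rules for function calls and field accesses in C."
  (treesit-font-lock-rules
   :language 'c
   :override t
   :feature 'function
   '((call_expression
      function: (identifier) @font-lock-function-name-face))

   :language 'c
   :override t
   :feature 'property
   '((field_expression
      field: (field_identifier) @font-lock-variable-name-face))))
```

With the features separated like this, anyone who finds them too busy could turn off just these two from a mode hook with (treesit-font-lock-recompute-features nil '(function property)), assuming the second argument is the list of features to remove.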

   :language mode
   :override t
   :feature 'statement
   '((expression_statement (identifier) @font-lock-variable-name-face)
     (labeled_statement
      label: (statement_identifier) @font-lock-type-face))

What does this rule highlight?

[1]
   :feature 'assignment
   :language 'python
   `(;; Variable names and LHS.
     (assignment left: (identifier)
                 @font-lock-variable-name-face)
     (assignment left: (attribute
                        attribute: (identifier)
                        @font-lock-variable-name-face))
     (pattern_list (identifier)
                   @font-lock-variable-name-face)
     (tuple_pattern (identifier)
                    @font-lock-variable-name-face)
     (list_pattern (identifier)
                   @font-lock-variable-name-face)
     (list_splat_pattern (identifier)
                         @font-lock-variable-name-face))

Yuan


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Tree sitter support for C-like languages
  2022-11-16 17:51 ` Yuan Fu
@ 2022-11-16 20:02   ` Theodor Thornhill
  2022-11-16 20:10     ` Yuan Fu
  2022-11-16 20:58     ` Yuan Fu
  0 siblings, 2 replies; 83+ messages in thread
From: Theodor Thornhill @ 2022-11-16 20:02 UTC (permalink / raw)
  To: Yuan Fu; +Cc: emacs-devel

Yuan Fu <casouri@gmail.com> writes:

> I noticed that the new fontification is much busier than cc-mode, so here are some of my thoughts:
>

Yeah, I just made it very colorful mostly to prove the point that we
have very granular control.  I think Randy's patch was very good, because
imo _most_ of the noise is the bracket/pointer/operators etc.

>
>    :language mode
>    :override t
>    :feature 'expression
>    '((assignment_expression
>       left: (identifier) @font-lock-variable-name-face)
>
> I think assignment should be isolated out to an “assignment” feature,
> where we highlight the lhs target of the assignment: the identifier,
> the field, etc. For example, the assignment group in Python [1]
>

Sure!

>
>      (call_expression
>       function: (identifier) @font-lock-function-name-face)
>
>      (field_expression
>       field: (field_identifier) @font-lock-variable-name-face)
>
>      (field_expression
>       argument: (identifier) @font-lock-variable-name-face
>       field: (field_identifier) @font-lock-variable-name-face)
>
>      (pointer_expression
>       argument: (identifier) @font-lock-variable-name-face))
>
> They highlight every single use of functions and fields, so they should be level 3. (And I’ll disable them personally :-) Highlighting the field and the functions should be two different features IMO.

Yep, I agree!  Go ahead :-)

>
>    :language mode
>    :override t
>    :feature 'statement
>    '((expression_statement (identifier) @font-lock-variable-name-face)
>      (labeled_statement
>       label: (statement_identifier) @font-lock-type-face))
>

stuff like:
```
 add_edge:  // <- this thing
  gx += WINDOW_LEFT_EDGE_X (w);
  gy += WINDOW_TOP_EDGE_Y (w);

 store_rect: // <- and this thing
  STORE_NATIVE_RECT (*rect, gx, gy, width, height);

```

I think you should just tweak it like you want.  I think it's very
time-consuming creating separate patches and bug-reports for small
tweaks and maintenance issues.  I trust your judgment here, though I
also think that Randy had some nice ideas :)

Just hack away!



^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Tree sitter support for C-like languages
  2022-11-16 20:02   ` Theodor Thornhill
@ 2022-11-16 20:10     ` Yuan Fu
  2022-11-16 20:25       ` Theodor Thornhill
  2022-11-16 20:58     ` Yuan Fu
  1 sibling, 1 reply; 83+ messages in thread
From: Yuan Fu @ 2022-11-16 20:10 UTC (permalink / raw)
  To: Theodor Thornhill; +Cc: emacs-devel



> On Nov 16, 2022, at 12:02 PM, Theodor Thornhill <theo@thornhill.no> wrote:
> 
> Yuan Fu <casouri@gmail.com> writes:
> 
>> I noticed that the new fontification is much busier than cc-mode, so here are some of my thoughts:
>> 
> 
> Yeah, I just made it very colorful mostly to prove the point that we
> have very granular control.  I think Randy's patch was very good, because
> imo _most_ of the noise is the bracket/pointer/operators etc.

That seems to be a trend for tree-sitter fontification ;-) Neovim does similar things.

>> 
>>   :language mode
>>   :override t
>>   :feature 'expression
>>   '((assignment_expression
>>      left: (identifier) @font-lock-variable-name-face)
>> 
>> I think assignment should be isolated out to an “assignment” feature,
>> where we highlight the lhs target of the assignment: the identifier,
>> the field, etc. For example, the assignment group in Python [1]
>> 
> 
> Sure!
> 
>> 
>>     (call_expression
>>      function: (identifier) @font-lock-function-name-face)
>> 
>>     (field_expression
>>      field: (field_identifier) @font-lock-variable-name-face)
>> 
>>     (field_expression
>>      argument: (identifier) @font-lock-variable-name-face
>>      field: (field_identifier) @font-lock-variable-name-face)
>> 
>>     (pointer_expression
>>      argument: (identifier) @font-lock-variable-name-face))
>> 
>> They highlight every single use of functions and fields, so they should be level 3. (And I’ll disable them personally :-) Highlighting the field and the functions should be two different features IMO.
> 
> Yep, I agree!  Go ahead :-)

I should clarify that by “And I’ll disable them personally” I meant “I’ll disable them in my personal config”, but anyway.
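
To make the proposed split concrete, here is a rough sketch, not the code that was eventually committed. It assumes an Emacs built with tree-sitter support and an installed C grammar; the names `my/c-ts-font-lock-sketch` and `my/c-ts-feature-list-sketch` and the feature names `function` and `property` are hypothetical, chosen only for illustration.

```elisp
;; Sketch only: all `my/...' names and the `function'/`property'
;; feature names are assumptions, not part of c-ts-mode.
(require 'treesit)

(defun my/c-ts-font-lock-sketch (language)
  "Return font-lock rules for LANGUAGE (e.g. `c'), one feature per concern."
  (treesit-font-lock-rules
   ;; Highlight only the target of an assignment.
   :language language
   :feature 'assignment
   '((assignment_expression
      left: (identifier) @font-lock-variable-name-face))

   ;; Every function call: busy, so kept in its own feature.
   :language language
   :feature 'function
   '((call_expression
      function: (identifier) @font-lock-function-name-face))

   ;; Every field access: also busy, and separate from `function'.
   :language language
   :feature 'property
   '((field_expression
      field: (field_identifier) @font-lock-variable-name-face))))

;; Each sublist is one decoration level.  `comment', `keyword' and
;; `string' stand in for the mode's other features, whose rules are
;; omitted from this sketch.
(defvar my/c-ts-feature-list-sketch
  '((comment)
    (keyword string assignment)
    (function property)))
```

A mode would assign these to `treesit-font-lock-settings` and `treesit-font-lock-feature-list`. With the features separated like this, someone who finds the call and field highlighting too noisy could drop just those two from a mode hook with `(treesit-font-lock-recompute-features nil '(function property))`, assuming the ADD-LIST/REMOVE-LIST arguments that function has in released Emacs 29.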

> 
>> 
>>   :language mode
>>   :override t
>>   :feature 'statement
>>   '((expression_statement (identifier) @font-lock-variable-name-face)
>>     (labeled_statement
>>      label: (statement_identifier) @font-lock-type-face))
>> 
> 
> stuff like:
> ```
> add_edge:  // <- this thing
>  gx += WINDOW_LEFT_EDGE_X (w);
>  gy += WINDOW_TOP_EDGE_Y (w);
> 
> store_rect: // <- and this thing
>  STORE_NATIVE_RECT (*rect, gx, gy, width, height);
> 
> ```
> 
> I think you should just tweak it like you want.  I think it's very
> time-consuming creating separate patches and bug-reports for small
> tweaks and maintenance issues.  I trust your judgment here, though I
> also think that Randy had some nice ideas :)
> 
> Just hack away!

Thanks! Will do.

Yuan





* Re: Tree sitter support for C-like languages
  2022-11-16 20:10     ` Yuan Fu
@ 2022-11-16 20:25       ` Theodor Thornhill
  0 siblings, 0 replies; 83+ messages in thread
From: Theodor Thornhill @ 2022-11-16 20:25 UTC (permalink / raw)
  To: Yuan Fu; +Cc: emacs-devel

Yuan Fu <casouri@gmail.com> writes:

>> On Nov 16, 2022, at 12:02 PM, Theodor Thornhill <theo@thornhill.no> wrote:
>> 
>> Yuan Fu <casouri@gmail.com> writes:
>> 
>>> I noticed that the new fontification is much busier than cc-mode, so here’s some of my thoughts:
>>> 
>> 
>> Yeah, I just made it very colorful mostly to prove the point that we
>> have very granular control.  I think Randys patch was very good, because
>> imo _most_ of the noise is the bracket/pointer/operators etc.
>
> That seems to be a trend for tree-sitter fontification ;-) Neovim does similar things.
>

Hehe, yep!

>> 
>> Just hack away!
>
> Thanks! Will do.
>

If you find something you feel is a general, good baseline, just apply
it to the other modes as well if your time permits :)

Theo




* Re: Tree sitter support for C-like languages
  2022-11-16 20:02   ` Theodor Thornhill
  2022-11-16 20:10     ` Yuan Fu
@ 2022-11-16 20:58     ` Yuan Fu
  2022-11-21  9:28       ` Yuan Fu
  1 sibling, 1 reply; 83+ messages in thread
From: Yuan Fu @ 2022-11-16 20:58 UTC (permalink / raw)
  To: Theodor Thornhill; +Cc: emacs-devel

>> 
>>   :language mode
>>   :override t
>>   :feature 'statement
>>   '((expression_statement (identifier) @font-lock-variable-name-face)
>>     (labeled_statement
>>      label: (statement_identifier) @font-lock-type-face))
>> 
> 
> stuff like:
> ```
> add_edge:  // <- this thing
>  gx += WINDOW_LEFT_EDGE_X (w);
>  gy += WINDOW_TOP_EDGE_Y (w);
> 
> store_rect: // <- and this thing
>  STORE_NATIVE_RECT (*rect, gx, gy, width, height);
> 
> ```

What’s the intention of this query?

(expression_statement (identifier) @font-lock-variable-name-face)

It seems to match statements like

var;

?
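
One way to check what that pattern captures is to run it against a small snippet. This is a minimal sketch, assuming an Emacs with tree-sitter support and the C grammar installed; `my/expression-statement-captures` is a hypothetical helper name, and the capture name `@id` is arbitrary.

```elisp
;; Sketch only: `my/expression-statement-captures' is a hypothetical helper.
(require 'treesit)

(defun my/expression-statement-captures (code)
  "Return the text of each identifier the pattern captures in C source CODE."
  (with-temp-buffer
    (insert code)
    (let* ((parser (treesit-parser-create 'c))
           (root (treesit-parser-root-node parser)))
      ;; `treesit-query-capture' returns an alist of (CAPTURE-NAME . NODE).
      (mapcar (lambda (capture)
                (treesit-node-text (cdr capture) t))
              (treesit-query-capture
               root
               '((expression_statement (identifier) @id)))))))
```

Evaluating something like `(my/expression-statement-captures "void f (void) { var; x = 1; }")` lists the captured identifiers, which makes it easy to see whether the pattern hits anything other than bare `var;` statements.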

Yuan



* Re: Tree sitter support for C-like languages
  2022-11-16 20:58     ` Yuan Fu
@ 2022-11-21  9:28       ` Yuan Fu
  2022-11-21 11:15         ` Theodor Thornhill
  0 siblings, 1 reply; 83+ messages in thread
From: Yuan Fu @ 2022-11-21  9:28 UTC (permalink / raw)
  To: Theodor Thornhill; +Cc: emacs-devel



> On Nov 16, 2022, at 12:58 PM, Yuan Fu <casouri@gmail.com> wrote:
> 
>>> 
>>>  :language mode
>>>  :override t
>>>  :feature 'statement
>>>  '((expression_statement (identifier) @font-lock-variable-name-face)
>>>    (labeled_statement
>>>     label: (statement_identifier) @font-lock-type-face))
>>> 
>> 
>> stuff like:
>> ```
>> add_edge:  // <- this thing
>> gx += WINDOW_LEFT_EDGE_X (w);
>> gy += WINDOW_TOP_EDGE_Y (w);
>> 
>> store_rect: // <- and this thing
>> STORE_NATIVE_RECT (*rect, gx, gy, width, height);
>> 
>> ```
> 
> What’s the intention of this query?
> 
> (expression_statement (identifier) @font-lock-variable-name-face)
> 
> It seems to match statements like
> 
> var;
> 
> ?
> 
> Yuan

You probably missed this message, ping :-)

Yuan





* Re: Tree sitter support for C-like languages
  2022-11-21  9:28       ` Yuan Fu
@ 2022-11-21 11:15         ` Theodor Thornhill
  2022-11-23  1:55           ` Yuan Fu
  0 siblings, 1 reply; 83+ messages in thread
From: Theodor Thornhill @ 2022-11-21 11:15 UTC (permalink / raw)
  To: Yuan Fu; +Cc: emacs-devel



>> On Nov 16, 2022, at 12:58 PM, Yuan Fu <casouri@gmail.com> wrote:
>> 
>>>> 
>>>>  :language mode
>>>>  :override t
>>>>  :feature 'statement
>>>>  '((expression_statement (identifier) @font-lock-variable-name-face)
>>>>    (labeled_statement
>>>>     label: (statement_identifier) @font-lock-type-face))
>>>> 
>>> 
>>> stuff like:
>>> ```
>>> add_edge:  // <- this thing
>>> gx += WINDOW_LEFT_EDGE_X (w);
>>> gy += WINDOW_TOP_EDGE_Y (w);
>>> 
>>> store_rect: // <- and this thing
>>> STORE_NATIVE_RECT (*rect, gx, gy, width, height);
>>> 
>>> ```
>> 
>> What’s the intention of this query?
>> 
>> (expression_statement (identifier) @font-lock-variable-name-face)
>> 
>> It seems to match statements like
>> 
>> var;
>> 
>> ?
>> 
>> Yuan
>
> You probably missed this message, ping :-)
>

Yeah, sorry!  I think it can be removed.  I really don't remember why I
made that...

I added the fix in a patch, and also fixed a typo (I think that messed
up some font-locking in c-ts-mode) as an apology :-)

Theo




[-- Attachment #2: 0001-Fix-some-font-lock-settings.patch --]
[-- Type: text/x-diff, Size: 1909 bytes --]

From 4a81b73bc97b01ea9619e517809997397213f0dc Mon Sep 17 00:00:00 2001
From: Theodor Thornhill <theo@thornhill.no>
Date: Mon, 21 Nov 2022 12:14:07 +0100
Subject: [PATCH] Fix some font-lock-settings

* lisp/progmodes/c-ts-mode.el (c-ts-mode--font-lock-settings): Remove
expression_statement and fix typo for the
c-ts-mode--fontify-declarator font lock helper function.
---
 lisp/progmodes/c-ts-mode.el | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/lisp/progmodes/c-ts-mode.el b/lisp/progmodes/c-ts-mode.el
index 6ba209c0fb..17f27611a5 100644
--- a/lisp/progmodes/c-ts-mode.el
+++ b/lisp/progmodes/c-ts-mode.el
@@ -263,19 +263,19 @@ c-ts-mode--font-lock-settings
       declarator: (_) @font-lock-variable-name-face)
 
      (field_declaration
-      declarator: (_) @c-ts-mode--fontify-struct-declarator)
+      declarator: (_) @c-ts-mode--fontify-declarator)
 
      (function_definition
-      declarator: (_) @c-ts-mode--fontify-struct-declarator))
+      declarator: (_) @c-ts-mode--fontify-declarator))
 
    ;; Should we highlight identifiers in the parameter list?
    ;; (parameter_declaration
-   ;;  declarator: (_) @c-ts-mode--fontify-struct-declarator))
+   ;;  declarator: (_) @c-ts-mode--fontify-declarator))
 
    :language mode
    :feature 'assignment
    ;; TODO: Recursively highlight identifiers in parenthesized
-   ;; expressions, see `c-ts-mode--fontify-struct-declarator' for
+   ;; expressions, see `c-ts-mode--fontify-declarator' for
    ;; inspiration.
    '((assignment_expression
       left: (identifier) @font-lock-variable-name-face)
@@ -299,8 +299,7 @@ c-ts-mode--font-lock-settings
 
    :language mode
    :feature 'label
-   '((expression_statement (identifier) @font-lock-variable-name-face)
-     (labeled_statement
+   '((labeled_statement
       label: (statement_identifier) @font-lock-constant-face))
 
    :language mode
-- 
2.34.1



* Re: Tree sitter support for C-like languages
  2022-11-21 11:15         ` Theodor Thornhill
@ 2022-11-23  1:55           ` Yuan Fu
  0 siblings, 0 replies; 83+ messages in thread
From: Yuan Fu @ 2022-11-23  1:55 UTC (permalink / raw)
  To: Theodor Thornhill; +Cc: emacs-devel



> On Nov 21, 2022, at 3:15 AM, Theodor Thornhill <theo@thornhill.no> wrote:
> 
>>> 
>>> On Nov 16, 2022, at 12:58 PM, Yuan Fu <casouri@gmail.com> wrote:
>>> 
>>>>> 
>>>>> :language mode
>>>>> :override t
>>>>> :feature 'statement
>>>>> '((expression_statement (identifier) @font-lock-variable-name-face)
>>>>>   (labeled_statement
>>>>>    label: (statement_identifier) @font-lock-type-face))
>>>>> 
>>>> 
>>>> stuff like:
>>>> ```
>>>> add_edge:  // <- this thing
>>>> gx += WINDOW_LEFT_EDGE_X (w);
>>>> gy += WINDOW_TOP_EDGE_Y (w);
>>>> 
>>>> store_rect: // <- and this thing
>>>> STORE_NATIVE_RECT (*rect, gx, gy, width, height);
>>>> 
>>>> ```
>>> 
>>> What’s the intention of this query?
>>> 
>>> (expression_statement (identifier) @font-lock-variable-name-face)
>>> 
>>> It seems to match statements like
>>> 
>>> var;
>>> 
>>> ?
>>> 
>>> Yuan
>> 
>> You probably missed this message, ping :-)
>> 
> 
> Yeah, sorry!  I think it can be removed.  I really don't remember why I
> made that...
> 
> I added the fix in a patch, and also fixed a typo (I think that messed
> up some font-locking in c-ts-mode) as an apology :-)
> 
> Theo

Thanks! Though I made those changes before I got to the patch. Patch appreciated :-)

Yuan


