* bug#74610: 31.0.50; Submitting mhtml-ts-mode, treesitter alternative to mhtml-mode
@ 2024-11-29 21:57 Vincenzo Pupillo
2024-12-01 6:01 ` Yuan Fu
2024-12-04 1:27 ` Dmitry Gutov
0 siblings, 2 replies; 13+ messages in thread
From: Vincenzo Pupillo @ 2024-11-29 21:57 UTC
To: 74610
[-- Attachment #1: Type: text/plain, Size: 249 bytes --]
Hi,
Following the discussion at
https://lists.gnu.org/archive/html/emacs-devel/2024-11/msg00079.html, I would
like to ask whether this new mode for editing HTML files, an alternative to
mhtml-mode, could be added to Emacs.
Thank you.
Vincenzo.
[-- Attachment #2: 0001-Add-mhtml-ts-mode.patch --]
[-- Type: text/x-patch, Size: 21245 bytes --]
From 8a1c792aaddf4daef2808f5a74212a2fb8b0a01e Mon Sep 17 00:00:00 2001
From: Vincenzo Pupillo <v.pupillo@gmail.com>
Date: Fri, 29 Nov 2024 22:48:45 +0100
Subject: [PATCH] Add mhtml-ts-mode.
New major-mode alternative to mhtml-mode, based on treesitter, for
editing files containing html, javascript and css.
* etc/NEWS: Mention the new mode.
* lisp/textmodes/mhtml-ts-mode.el: New file.
---
etc/NEWS | 8 +
lisp/textmodes/mhtml-ts-mode.el | 462 ++++++++++++++++++++++++++++++++
2 files changed, 470 insertions(+)
create mode 100644 lisp/textmodes/mhtml-ts-mode.el
diff --git a/etc/NEWS b/etc/NEWS
index 4d2a2c893d0..8f9a04dcf01 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -797,6 +797,14 @@ destination window is chosen using 'display-buffer-alist'. Example:
\f
* New Modes and Packages in Emacs 31.1
+** New major modes based on the tree-sitter library
+
++++
+*** New major mode 'mhtml-ts-mode'.
+An optional major mode based on the tree-sitter library for editing html
+files. This mode handles indentation, fontification, and commenting for
+embedded JavaScript and CSS.
+
\f
* Incompatible Lisp Changes in Emacs 31.1
diff --git a/lisp/textmodes/mhtml-ts-mode.el b/lisp/textmodes/mhtml-ts-mode.el
new file mode 100644
index 00000000000..b6b220663e3
--- /dev/null
+++ b/lisp/textmodes/mhtml-ts-mode.el
@@ -0,0 +1,462 @@
+;;; mhtml-ts-mode.el --- Major mode for HTML using tree-sitter -*- lexical-binding: t; -*-
+
+;; Copyright (C) 2024 Free Software Foundation, Inc.
+
+;; Author: Vincenzo Pupillo <v.pupillo@gmail.com>
+;; Maintainer: Vincenzo Pupillo <v.pupillo@gmail.com>
+;; Created: Nov 2024
+;; Keywords: HTML language tree-sitter
+
+;; This file is part of GNU Emacs.
+
+;; GNU Emacs is free software: you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; GNU Emacs is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GNU Emacs. If not, see <https://www.gnu.org/licenses/>.
+
+;;; Commentary:
+;;
+;; This package provides `mhtml-ts-mode' which is a major mode
+;; for editing HTML files with embedded JavaScript and CSS.
+;; Tree Sitter is used to parse each of these languages.
+;;
+;; Please note that this package requires `html-ts-mode', which
+;; registers itself as the major mode for editing HTML.
+;;
+;; This package is compatible and has been tested with the following
+;; tree-sitter grammars:
+;; * https://github.com/tree-sitter/tree-sitter-html
+;; * https://github.com/tree-sitter/tree-sitter-javascript
+;; * https://github.com/tree-sitter/tree-sitter-jsdoc
+;; * https://github.com/tree-sitter/tree-sitter-css
+;;
+;; Features
+;;
+;; * Indent
+;; * IMenu
+;; * Navigation
+;; * Which-function
+;; * Tree-sitter parser installation helper
+
+;;; Code:
+
+(require 'treesit)
+(require 'html-ts-mode)
+(require 'css-mode) ;; for embed css into html
+(require 'js) ;; for embed javascript into html
+
+(eval-when-compile
+ (require 'rx))
+
+;; Declare all native functions used by the major mode.
+;; This tells the byte-compiler where the functions are defined.
+(declare-function treesit-node-end "treesit.c")
+(declare-function treesit-node-parent "treesit.c")
+(declare-function treesit-node-start "treesit.c")
+(declare-function treesit-node-type "treesit.c")
+(declare-function treesit-parser-create "treesit.c")
+
+;; In a multi-language major mode it can be useful to have an "installer" to
+;; simplify the installation of the grammars supported by the major-mode.
+(defvar mhtml-ts-mode--language-source-alist
+ '((html . ("https://github.com/tree-sitter/tree-sitter-html" "v0.23.0"))
+ (javascript . ("https://github.com/tree-sitter/tree-sitter-javascript" "v0.23.0"))
+ (jsdoc . ("https://github.com/tree-sitter/tree-sitter-jsdoc" "v0.23.0"))
+ (css . ("https://github.com/tree-sitter/tree-sitter-css" "v0.23.0")))
+ "Treesitter language parsers required by `mhtml-ts-mode'.
+You can customize this variable if you want to stick to a specific
+commit and/or use different parsers.")
+
+(defun mhtml-ts-mode-install-parsers ()
+ "Install all the required treesitter parsers.
+`mhtml-ts-mode--language-source-alist' defines which parsers to install."
+ (interactive)
+ (let ((treesit-language-source-alist mhtml-ts-mode--language-source-alist))
+ (dolist (item mhtml-ts-mode--language-source-alist)
+ (treesit-install-language-grammar (car item)))))
+
+;;; Custom variables
+
+(defgroup mhtml-ts-mode nil
+ "Major mode for editing HTML files, based on `html-ts-mode'.
+Works with JS and CSS and for that use `js-ts-mode' and `css-ts-mode'."
+ :prefix "mhtml-ts-mode-"
+ :group 'languages)
+
+(defcustom mhtml-ts-mode-js-css-indent-offset 2
+ "JavaScript and CSS indent spaces related to the <script> and <style> HTML tags.
+By default should have same value as `html-ts-mode-indent-offset'."
+ :tag "HTML javascript or css indent offset"
+ :version "31.1"
+ :type 'integer
+ :safe 'integerp)
+
+(defvar mhtml-ts-mode--js-css-indent-offset
+ mhtml-ts-mode-js-css-indent-offset
+ "Internal copy of `mhtml-ts-mode-js-css-indent-offset'.
+The value changes, by `mhtml-ts-mode--tag-relative-indent-offset' according to
+the value of `mhtml-ts-mode-tag-relative-indent'.")
+
+(defun mhtml-ts-mode--tag-relative-indent-offset (sym val)
+ "Custom setter for `mhtml-ts-mode-tag-relative-indent'.
+
+Apart from setting the default value of SYM to VAL, also change the
+value of SYM in `mhtml-ts-mode' buffers to VAL. SYM should be
+`mhtml-ts-mode-tag-relative-indent', and VAL should be t, nil or
+`ignore'. When SYM is `mhtml-ts-mode-tag-relative-indent' set the
+value of `mhtml-ts-mode--js-css-indent-offset' to
+`mhtml-ts-mode-js-css-indent-offset' if VAL is t, otherwise to 0."
+ (set-default sym val)
+ (when (eq sym 'mhtml-ts-mode-tag-relative-indent)
+ (setq-local
+ mhtml-ts-mode--js-css-indent-offset
+ (if (eq val t)
+ mhtml-ts-mode-js-css-indent-offset
+ 0))))
+
+(defcustom mhtml-ts-mode-tag-relative-indent t
+ "How <script> and <style> bodies are indented relative to the tag.
+
+When t, indentation looks like:
+
+ <script>
+ code();
+ </script>
+
+When nil, indentation of the script body starts just below the
+tag, like:
+
+ <script>
+ code();
+ </script>
+
+When `ignore', the script body starts in the first column, like:
+
+ <script>
+code();
+ </script>"
+ :type '(choice (const nil) (const t) (const ignore))
+ :safe 'symbolp
+ :set #'mhtml-ts-mode--tag-relative-indent-offset
+ :version "31.1")
+
+(defcustom mhtml-ts-mode-css-fontify-colors t
+ "Whether CSS colors should be fontified using the color as the background.
+If non-nil, text representing a CSS color will be fontified
+such that its background is the color itself.
+Works like `css--fontify-region'."
+ :tag "HTML colors the CSS properties values."
+ :version "31.1"
+ :type 'boolean
+ :safe 'booleanp)
+
+;; To enable some basic treesitter functionality, you should define
+;; a function that recognizes which grammar is used at-point.
+;; This function should be assigned to `treesit-language-at-point-function'
+(defun mhtml-ts-mode--language-at-point (point)
+ "Return the language at POINT assuming the point is within a HTML buffer."
+ (let* ((node (treesit-node-at point 'html))
+ (parent (treesit-node-parent node))
+ (node-query (format "(%s (%s))"
+ (treesit-node-type parent)
+ (treesit-node-type node))))
+ (cond
+ ((string-equal "(script_element (raw_text))" node-query) 'javascript)
+ ((string-equal "(style_element (raw_text))" node-query) 'css)
+ (t 'html))))
+
+;; Sometimes you need to override some property attached to a node.
+;; The signature of the function should be conforming to signature
+;; QUERY-SPEC required by `treesit-font-lock-rules'.
+(defun mhtml-ts-mode--colorize-css-value (node override start end &rest _)
+ "Colorize CSS property value like `css--fontify-region'.
+For NODE, OVERRIDE, START, and END, see `treesit-font-lock-rules'."
+ (if (and mhtml-ts-mode-css-fontify-colors
+ (string-equal "plain_value" (treesit-node-type node)))
+ (let ((color (css--compute-color start (treesit-node-text node t))))
+ (when color
+ (with-silent-modifications
+ (add-text-properties
+ (treesit-node-start node) (treesit-node-end node)
+ (list 'face (list :background color
+ :foreground (readable-foreground-color
+ color)
+ :box '(:line-width -1)))))))
+ (treesit-fontify-with-override
+ (treesit-node-start node) (treesit-node-end node)
+ 'font-lock-variable-name-face
+ override start end)))
+
+;; Embedded languages should be indented according to the language
+;; that embeds them.
+;; This function signature complies with `treesit-simple-indent-rules'
+;; ANCHOR.
+(defun mhtml-ts-mode--js-css-tag-bol (_node _parent &rest _)
+ "Find the first non-space characters of html tags <script> or <style>.
+Return `line-beginning-position' when `treesit-node-at' is html, or
+`mhtml-ts-mode-tag-relative-indent' is equal to ignore.
+NODE and PARENT are ignored."
+ (if (or (eq (treesit-language-at (point)) 'html)
+ (eq mhtml-ts-mode-tag-relative-indent 'ignore))
+ (line-beginning-position)
+ ;; Ok, we are in js or css block.
+ (save-excursion
+ (re-search-backward "<script.*>\\|<style.*>" nil t))))
+
+;; Treesit supports 4 level of decoration, `treesit-font-lock-level'
+;; define which level use. Major-modes categorize their fontification
+;; features, these categories are defined by `treesit-font-lock-rules' of
+;; each major-mode using :feature keyword.
+;; In a multiple language major-mode it's a good idea to provvide, for each
+;; level, the union of the :feature of the same level.
+(defvar mhtml-ts-mode--feature-list
+ '(;; level 1
+ (;; common
+ comment definition
+ ;; JS specific
+ document
+ ;; CSS specific
+ query selector)
+ ;; level 2
+ (keyword name property string type)
+ ;; level 3
+ (;; common
+ attribute assignment constant escape-sequence
+ base-clause literal variable-name variable
+ ;; Javascript specific
+ jsx number pattern string-interpolation)
+ ;; level 4
+ (bracket delimiter error operator function)))
+
+;; In order to support wich-fuction-mode we should define
+;; a function that return the defun name.
+;; In a multilingual treesit mode, this can be implemented simply by
+;; calling language-specific functions.
+(defun mhtml-ts-mode--defun-name (node)
+ "Return the defun name of NODE.
+Return nil if there is no name or if NODE is not a defun node."
+ ;; (message "node type ""%s""" (treesit-node-type node))
+ (let ((lang (mhtml-ts-mode--language-at-point (point))))
+ (cond
+ ((eq lang 'html) (html-ts-mode--defun-name node))
+ ((eq lang 'javascript) (js--treesit-defun-name node))
+ ((eq lang 'css) (css--treesit-defun-name node)))))
+
+(define-derived-mode mhtml-ts-mode html-mode
+ '("HTML+" (:eval (let ((lang (mhtml-ts-mode--language-at-point (point))))
+ (cond ((eq lang 'html) "")
+ ((eq lang 'javascript) "JS")
+ ((eq lang 'css) "CSS")))))
+ "Major mode for editing HTML with embedded JavaScript and CSS.
+Powered by tree-sitter."
+ (if (not (and
+ (treesit-ready-p 'html)
+ (treesit-ready-p 'javascript)
+ (treesit-ready-p 'css)))
+ (error "Tree-sitter parsers for HTML, JavaScript and CSS
+ aren't available. You can install the parsers with M-x
+ `mhtml-ts-mode-install-parsers'")
+
+ ;; When a language is embedded, you should initialize some variables
+ ;; just like it's done in the original mode.
+
+ ;; Comment.
+ ;; indenting settings for js-ts-mode.
+ (c-ts-common-comment-setup)
+ (setq-local comment-multi-line t)
+
+ ;; Font-lock.
+
+ ;; There are two kind of treesitter parser:
+ ;; 1. global parser
+ ;; 2. local parser
+ ;; The global parser considers each piece of text,
+ ;; in a multilingual buffer, as if it were a single buffer in its
+ ;; own language. Local parsers, on the other hand, consider each
+ ;; piece of text, in a multilingual buffer, as if they were
+ ;; separate buffers.
+ ;; In a multilingual buffer you should create only global ones.
+ ;; Local ones are created automatically.
+ ;; Warning: do not create a local parser! It may cause side
+ ;; effects that are difficult to handle.
+
+ ;; There are two types of treesitter parsers:
+ ;; 1. global parsers
+ ;; 2. local parsers
+ ;; A global parser treats each piece of text,
+ ;; in a multilingual buffer, as if it were a single buffer in its
+ ;; language. Local parser, on the other hand, treat each
+ ;; piece of text, in a multilingual buffer, as if they were separate buffers.
+ ;; In a multilingual buffer you should only create global ones.
+ ;; The local ones are created automatically.
+ ;; Warning: do not create a local parser! It may cause side effects that are difficult to handle.
+
+ ;; Create the parsers, only the global one.
+ ;; jsdoc is a local parser, don't create a parser for it.
+ (treesit-parser-create 'css)
+ (treesit-parser-create 'javascript)
+
+ ;; jsdoc is not mandatory for js-ts-mode, so we respect this by
+ ;; adding jsdoc range rules only when jsdoc is available.
+ (if (treesit-ready-p 'jsdoc t)
+ (setq-local treesit-range-settings
+ (treesit-range-rules
+ :embed 'javascript
+ :host 'html
+ :offset '(1 . -1)
+ '((script_element
+ (start_tag (tag_name))
+ (raw_text) @cap))
+
+ :embed 'jsdoc
+ :host 'javascript
+ :local t
+ `(((comment) @cap
+ (:match ,js--treesit-jsdoc-beginning-regexp @cap)))
+
+ :embed 'css
+ :host 'html
+ :offset '(1 . -1)
+ '((style_element
+ (start_tag (tag_name))
+ (raw_text) @cap))))
+ (setq-local treesit-range-settings
+ (treesit-range-rules
+ :embed 'javascript
+ :host 'html
+ :offset '(1 . -1)
+ '((script_element
+ (start_tag (tag_name))
+ (raw_text) @cap))
+
+ :embed 'css
+ :host 'html
+ :offset '(1 . -1)
+ '((style_element
+ (start_tag (tag_name))
+ (raw_text) @cap)))))
+
+ ;; Many treesit functions need to know the language at-point.
+ ;; So you should define such a function.
+ (setq-local treesit-language-at-point-function #'mhtml-ts-mode--language-at-point)
+
+ ;; Indent.
+
+ ;; Since mhtml-ts-mode inherits indentation rules from html-ts-mode, js
+ ;; and css, if you want to change the offset you have to act on the
+ ;; *-offset variables defined for those languages.
+
+ ;; JavaScript and CSS must be indented relative to their code block.
+ ;; This is done by inserting a special rule before the normal
+ ;; indentation rules of these languages.
+ ;; The value of mhtml-ts-mode--js-css-indent-offset changes based on
+ ;; mhtml-ts-mode-tag-relative-indent and can be used to indent
+ ;; JavaScript and CSS code relative to the HTML that contains them,
+ ;; just like in mhtml-mode.
+ (setq-local treesit-simple-indent-rules
+ (append html-ts-mode--indent-rules
+ ;; Extended rules for js and css, to
+ ;; indent appropriately when injected
+ ;; into html
+ `((javascript ((parent-is "program")
+ mhtml-ts-mode--js-css-tag-bol
+ mhtml-ts-mode--js-css-indent-offset)
+ ,@(cdr (car js--treesit-indent-rules))))
+ `((css ((parent-is "stylesheet")
+ mhtml-ts-mode--js-css-tag-bol
+ mhtml-ts-mode--js-css-indent-offset)
+ ,@(cdr (car css--treesit-indent-rules))))))
+ ;; Navigation.
+
+ ;; This regular expression tells treesit how to match the node type
+ ;; of defun nodes.
+ ;; Used by `treesit-beginning-of-defun' and friends for
+ ;; navigations.
+ (setq-local treesit-defun-type-regexp
+ (rx (or
+ ;; Javascript
+ "class_declaration"
+ "method_definition"
+ "function_declaration"
+ "lexical_declaration"
+ ;; HTML
+ "element"
+ ;; CSS
+ "rule_set")))
+
+ ;; This is for finding the defun name; it's used by Imenu as the default
+ ;; function if no specific functions are defined.
+ (setq-local treesit-defun-name-function #'mhtml-ts-mode--defun-name)
+
+ ;; Define what are 'thing' for treesit.
+ ;; 'Thing' is a symbol representing the thing, like `defun', `sexp', or
+ ;; `sentence'.
+ (setq-local treesit-thing-settings
+ `((html
+ (sexp ,(regexp-opt '("element"
+ "text"
+ "attribute"
+ "value")))
+ (sentence "tag")
+ (text ,(regexp-opt '("comment" "text"))))
+ (javascript
+ (sexp ,(js--regexp-opt-symbol js--treesit-sexp-nodes))
+ (sentence ,(js--regexp-opt-symbol js--treesit-sentence-nodes))
+ (text ,(js--regexp-opt-symbol '("comment"
+ "string_fragment"))))))
+
+ ;; Font-lock.
+
+ ;; In a multi-language scenario, font lock settings are usually a
+ ;; concatenation of language rules. As you can see, it is possible
+ ;; to extend/modify the default rule or use a different set of
+ ;; rules. See `php-ts-mode--custom-html-font-lock-settings' for more
+ ;; advanced usage.
+ (setq-local treesit-font-lock-settings
+ (append html-ts-mode--font-lock-settings
+ js--treesit-font-lock-settings
+ (append
+ ;; Rule for coloring CSS property values.
+ ;; Placed before `css--treesit-settings'
+ ;; to win against the same rule contained therein.
+ (treesit-font-lock-rules
+ :language 'css
+ :override t
+ :feature 'variable
+ '((plain_value) @mhtml-ts-mode--colorize-css-value))
+ css--treesit-settings)))
+
+ ;; Tells treesit the list of features to fontify.
+ (setq-local treesit-font-lock-feature-list mhtml-ts-mode--feature-list)
+
+ ;; Imenu
+
+ ;; Setup Imenu: if no function is specified, try to find an object
+ ;; using `treesit-defun-name-function'.
+ ;; TODO: we need to see if it is possible to extend Imenu to
+ ;; embedded languages as well.
+ (setq-local treesit-simple-imenu-settings
+ `(("Element" "\\`tag_name\\'" nil nil)))
+
+ ;; This should be the last thing to do.
+ ;; Treesit tries to find out what the primary language is, but it is better
+ ;; to say it explicitly.
+ (setq-local treesit-primary-parser (treesit-parser-create 'html))
+
+ (treesit-font-lock-recompute-features)
+ (treesit-major-mode-setup)))
+
+(when (and (treesit-ready-p 'html) (treesit-ready-p 'javascript) (treesit-ready-p 'css))
+ (add-to-list
+ 'auto-mode-alist '("\\.[sx]?html?\\(\\.[a-zA-Z_]+\\)?\\'" . mhtml-ts-mode)))
+
+(provide 'mhtml-ts-mode)
+;;; mhtml-ts-mode.el ends here
--
2.47.1
* bug#74610: 31.0.50; Submitting mhtml-ts-mode, treesitter alternative to mhtml-mode
2024-11-29 21:57 bug#74610: 31.0.50; Submitting mhtml-ts-mode, treesitter alternative to mhtml-mode Vincenzo Pupillo
@ 2024-12-01 6:01 ` Yuan Fu
2024-12-01 8:00 ` Eli Zaretskii
2024-12-03 14:29 ` Vincenzo Pupillo
2024-12-04 1:27 ` Dmitry Gutov
1 sibling, 2 replies; 13+ messages in thread
From: Yuan Fu @ 2024-12-01 6:01 UTC
To: Vincenzo Pupillo; +Cc: 74610
> On Nov 29, 2024, at 1:57 PM, Vincenzo Pupillo <v.pupillo@gmail.com> wrote:
>
> Hi,
> Following the discussion at
> https://lists.gnu.org/archive/html/emacs-devel/2024-11/msg00079.html, I would
> like to ask whether this new mode for editing HTML files, an alternative to
> mhtml-mode, could be added to Emacs.
>
> Thank you.
>
> Vincenzo.<0001-Add-mhtml-ts-mode.patch>
Thank you so much! This will be very helpful for others. Here are some comments.
Yuan
From 8a1c792aaddf4daef2808f5a74212a2fb8b0a01e Mon Sep 17 00:00:00 2001
From: Vincenzo Pupillo <v.pupillo@gmail.com>
Date: Fri, 29 Nov 2024 22:48:45 +0100
Subject: [PATCH] Add mhtml-ts-mode.
New major-mode alternative to mhtml-mode, based on treesitter, for
editing files containing html, javascript and css.
* etc/NEWS: Mention the new mode.
* lisp/textmodes/mhtml-ts-mode.el: New file.
---
etc/NEWS | 8 +
lisp/textmodes/mhtml-ts-mode.el | 462 ++++++++++++++++++++++++++++++++
2 files changed, 470 insertions(+)
create mode 100644 lisp/textmodes/mhtml-ts-mode.el
diff --git a/etc/NEWS b/etc/NEWS
index 4d2a2c893d0..8f9a04dcf01 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -797,6 +797,14 @@ destination window is chosen using 'display-buffer-alist'. Example:
\f
* New Modes and Packages in Emacs 31.1
+** New major modes based on the tree-sitter library
+
++++
+*** New major mode 'mhtml-ts-mode'.
+An optional major mode based on the tree-sitter library for editing html
+files. This mode handles indentation, fontification, and commenting for
+embedded JavaScript and CSS.
+
\f
* Incompatible Lisp Changes in Emacs 31.1
diff --git a/lisp/textmodes/mhtml-ts-mode.el b/lisp/textmodes/mhtml-ts-mode.el
new file mode 100644
index 00000000000..b6b220663e3
--- /dev/null
+++ b/lisp/textmodes/mhtml-ts-mode.el
@@ -0,0 +1,100 @@
+;;; mhtml-ts-mode.el --- Major mode for HTML using tree-sitter -*- lexical-binding: t; -*-
+
+;; Copyright (C) 2024 Free Software Foundation, Inc.
+
+;; Author: Vincenzo Pupillo <v.pupillo@gmail.com>
+;; Maintainer: Vincenzo Pupillo <v.pupillo@gmail.com>
+;; Created: Nov 2024
+;; Keywords: HTML language tree-sitter
+
+;; This file is part of GNU Emacs.
+
+;; GNU Emacs is free software: you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; GNU Emacs is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GNU Emacs. If not, see <https://www.gnu.org/licenses/>.
+
+;;; Commentary:
+;;
+;; This package provides `mhtml-ts-mode' which is a major mode
+;; for editing HTML files with embedded JavaScript and CSS.
+;; Tree Sitter is used to parse each of these languages.
+;;
+;; Please note that this package requires `html-ts-mode', which
+;; registers itself as the major mode for editing HTML.
+;;
+;; This package is compatible and has been tested with the following
+;; tree-sitter grammars:
+;; * https://github.com/tree-sitter/tree-sitter-html
+;; * https://github.com/tree-sitter/tree-sitter-javascript
+;; * https://github.com/tree-sitter/tree-sitter-jsdoc
+;; * https://github.com/tree-sitter/tree-sitter-css
+;;
+;; Features
+;;
+;; * Indent
+;; * IMenu
+;; * Navigation
+;; * Which-function
+;; * Tree-sitter parser installation helper
+
+;;; Code:
+
+(require 'treesit)
+(require 'html-ts-mode)
+(require 'css-mode) ;; for embed css into html
+(require 'js) ;; for embed javascript into html
+
+(eval-when-compile
+ (require 'rx))
+
+;; Declare all native functions used by the major mode.
+;; This tells the byte-compiler where the functions are defined.
+(declare-function treesit-node-end "treesit.c")
+(declare-function treesit-node-parent "treesit.c")
+(declare-function treesit-node-start "treesit.c")
+(declare-function treesit-node-type "treesit.c")
+(declare-function treesit-parser-create "treesit.c")
+
+;; In a multi-language major mode it can be useful to have an "installer" to
+;; simplify the installation of the grammars supported by the major-mode.
+(defvar mhtml-ts-mode--language-source-alist
+ '((html . ("https://github.com/tree-sitter/tree-sitter-html" "v0.23.0"))
+ (javascript . ("https://github.com/tree-sitter/tree-sitter-javascript" "v0.23.0"))
+ (jsdoc . ("https://github.com/tree-sitter/tree-sitter-jsdoc" "v0.23.0"))
+ (css . ("https://github.com/tree-sitter/tree-sitter-css" "v0.23.0")))
+ "Treesitter language parsers required by `mhtml-ts-mode'.
+You can customize this variable if you want to stick to a specific
+commit and/or use different parsers.")
+
+(defun mhtml-ts-mode-install-parsers ()
+ "Install all the required treesitter parsers.
+`mhtml-ts-mode--language-source-alist' defines which parsers to install."
+ (interactive)
+ (let ((treesit-language-source-alist mhtml-ts-mode--language-source-alist))
+ (dolist (item mhtml-ts-mode--language-source-alist)
+ (treesit-install-language-grammar (car item)))))
+
+;;; Custom variables
+
+(defgroup mhtml-ts-mode nil
+ "Major mode for editing HTML files, based on `html-ts-mode'.
+Works with JS and CSS and for that use `js-ts-mode' and `css-ts-mode'."
+ :prefix "mhtml-ts-mode-"
+ :group 'languages)
+
+(defcustom mhtml-ts-mode-js-css-indent-offset 2
+ "JavaScript and CSS indent spaces related to the <script> and <style> HTML tags.
+By default should have same value as `html-ts-mode-indent-offset'."
+ :tag "HTML javascript or css indent offset"
+ :version "31.1"
+ :type 'integer
+ :safe 'integerp)
It's not uncommon to see different indent offsets for CSS and
JavaScript, so it's a good idea to have separate controls for them.
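A minimal sketch of what separate controls could look like. The group,
option, and helper names below are hypothetical and are not part of the
submitted patch; the defaults of 2 simply mirror the single option above.

```elisp
;; Hypothetical names for illustration only; not part of the patch.
(defgroup mhtml-sketch nil
  "Illustrative group for per-language indent offsets."
  :group 'languages)

(defcustom mhtml-sketch-js-indent-offset 2
  "Indent of JavaScript code relative to the enclosing <script> tag."
  :type 'integer
  :safe #'integerp
  :group 'mhtml-sketch)

(defcustom mhtml-sketch-css-indent-offset 2
  "Indent of CSS code relative to the enclosing <style> tag."
  :type 'integer
  :safe #'integerp
  :group 'mhtml-sketch)

(defun mhtml-sketch-indent-offset (lang)
  "Return the tag-relative indent offset for the embedded language LANG.
LANG is a symbol such as `javascript' or `css'; any other value yields 0."
  (pcase lang
    ('javascript mhtml-sketch-js-indent-offset)
    ('css mhtml-sketch-css-indent-offset)
    (_ 0)))
```

An indent rule could then look up the offset for the language being
indented instead of sharing one value between both languages.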
+
+(defvar mhtml-ts-mode--js-css-indent-offset
+ mhtml-ts-mode-js-css-indent-offset
+ "Internal copy of `mhtml-ts-mode-js-css-indent-offset'.
+The value changes, by `mhtml-ts-mode--tag-relative-indent-offset' according to
+the value of `mhtml-ts-mode-tag-relative-indent'.")
+
+(defun mhtml-ts-mode--tag-relative-indent-offset (sym val)
+ "Custom setter for `mhtml-ts-mode-tag-relative-indent'.
+
+Apart from setting the default value of SYM to VAL, also change the
+value of SYM in `mhtml-ts-mode' buffers to VAL. SYM should be
+`mhtml-ts-mode-tag-relative-indent', and VAL should be t, nil or
+`ignore'. When SYM is `mhtml-ts-mode-tag-relative-indent' set the
+value of `mhtml-ts-mode--js-css-indent-offset' to
+`mhtml-ts-mode-js-css-indent-offset' if VAL is t, otherwise to 0."
+ (set-default sym val)
+ (when (eq sym 'mhtml-ts-mode-tag-relative-indent)
+ (setq-local
+ mhtml-ts-mode--js-css-indent-offset
+ (if (eq val t)
+ mhtml-ts-mode-js-css-indent-offset
+ 0))))
+
+(defcustom mhtml-ts-mode-tag-relative-indent t
+ "How <script> and <style> bodies are indented relative to the tag.
+
+When t, indentation looks like:
+
+ <script>
+ code();
+ </script>
+
+When nil, indentation of the script body starts just below the
+tag, like:
+
+ <script>
+ code();
+ </script>
+
+When `ignore', the script body starts in the first column, like:
+
+ <script>
+code();
+ </script>"
+ :type '(choice (const nil) (const t) (const ignore))
+ :safe 'symbolp
+ :set #'mhtml-ts-mode--tag-relative-indent-offset
+ :version "31.1")
+
+(defcustom mhtml-ts-mode-css-fontify-colors t
+ "Whether CSS colors should be fontified using the color as the background.
+If non-nil, text representing a CSS color will be fontified
+such that its background is the color itself.
+Works like `css--fontify-region'."
+ :tag "HTML colors the CSS properties values."
+ :version "31.1"
+ :type 'boolean
+ :safe 'booleanp)
+
+;; To enable some basic treesitter functionality, you should define
+;; a function that recognizes which grammar is used at-point.
+;; This function should be assigned to `treesit-language-at-point-function'
+(defun mhtml-ts-mode--language-at-point (point)
+ "Return the language at POINT assuming the point is within a HTML buffer."
+ (let* ((node (treesit-node-at point 'html))
+ (parent (treesit-node-parent node))
+ (node-query (format "(%s (%s))"
+ (treesit-node-type parent)
+ (treesit-node-type node))))
+ (cond
+ ((string-equal "(script_element (raw_text))" node-query) 'javascript)
+ ((string-equal "(style_element (raw_text))" node-query) 'css)
+ (t 'html))))
+
+;; Sometimes you need to override some property attached to a node.
+;; The signature of the function should be conforming to signature
+;; QUERY-SPEC required by `treesit-font-lock-rules'.
"property attached to a node" is vague. I would just say
;; Custom font-lock function that's used to apply color to css color
;; values. This function is used below where we define font-lock rules.
+(defun mhtml-ts-mode--colorize-css-value (node override start end &rest _)
+ "Colorize CSS property value like `css--fontify-region'.
+For NODE, OVERRIDE, START, and END, see `treesit-font-lock-rules'."
+ (if (and mhtml-ts-mode-css-fontify-colors
+ (string-equal "plain_value" (treesit-node-type node)))
+ (let ((color (css--compute-color start (treesit-node-text node t))))
+ (when color
+ (with-silent-modifications
+ (add-text-properties
+ (treesit-node-start node) (treesit-node-end node)
+ (list 'face (list :background color
+ :foreground (readable-foreground-color
+ color)
+ :box '(:line-width -1)))))))
+ (treesit-fontify-with-override
+ (treesit-node-start node) (treesit-node-end node)
+ 'font-lock-variable-name-face
+ override start end)))
+
+;; Embedded languages should be indented according to the language
+;; that embeds them.
+;; This function signature complies with `treesit-simple-indent-rules'
+;; ANCHOR.
+(defun mhtml-ts-mode--js-css-tag-bol (_node _parent &rest _)
+ "Find the first non-space characters of html tags <script> or <style>.
+Return `line-beginning-position' when `treesit-node-at' is html, or
+`mhtml-ts-mode-tag-relative-indent' is equal to ignore.
+NODE and PARENT are ignored."
+ (if (or (eq (treesit-language-at (point)) 'html)
+ (eq mhtml-ts-mode-tag-relative-indent 'ignore))
+ (line-beginning-position)
+ ;; Ok, we are in js or css block.
+ (save-excursion
+ (re-search-backward "<script.*>\\|<style.*>" nil t))))
+
+;; Treesit supports 4 level of decoration, `treesit-font-lock-level'
+;; define which level use. Major-modes categorize their fontification
+;; features, these categories are defined by `treesit-font-lock-rules' of
+;; each major-mode using :feature keyword.
+;; In a multiple language major-mode it's a good idea to provvide, for each
+;; level, the union of the :feature of the same level.
"which level to use", "Major modes", and "provide"
+(defvar mhtml-ts-mode--feature-list
+ '(;; level 1
+ (;; common
+ comment definition
+ ;; JS specific
+ document
+ ;; CSS specific
+ query selector)
+ ;; level 2
+ (keyword name property string type)
+ ;; level 3
+ (;; common
+ attribute assignment constant escape-sequence
+ base-clause literal variable-name variable
+ ;; Javascript specific
+ jsx number pattern string-interpolation)
+ ;; level 4
+ (bracket delimiter error operator function)))
+
+;; In order to support wich-fuction-mode we should define
"which-function-mode"
+;; a function that return the defun name.
+;; In a multilingual treesit mode, this can be implemented simply by
+;; calling language-specific functions.
+(defun mhtml-ts-mode--defun-name (node)
+ "Return the defun name of NODE.
+Return nil if there is no name or if NODE is not a defun node."
+ ;; (message "node type ""%s""" (treesit-node-type node))
+ (let ((lang (mhtml-ts-mode--language-at-point (point))))
+ (cond
+ ((eq lang 'html) (html-ts-mode--defun-name node))
+ ((eq lang 'javascript) (js--treesit-defun-name node))
+ ((eq lang 'css) (css--treesit-defun-name node)))))
+
+(define-derived-mode mhtml-ts-mode html-mode
+ '("HTML+" (:eval (let ((lang (mhtml-ts-mode--language-at-point (point))))
+ (cond ((eq lang 'html) "")
+ ((eq lang 'javascript) "JS")
+ ((eq lang 'css) "CSS")))))
+ "Major mode for editing HTML with embedded JavaScript and CSS.
+Powered by tree-sitter."
+ (if (not (and
+ (treesit-ready-p 'html)
+ (treesit-ready-p 'javascript)
+ (treesit-ready-p 'css)))
+ (error "Tree-sitter parsers for HTML, JavaScript and CSS
+ aren't available. You can install the parsers with M-x
+ `mhtml-ts-mode-install-parsers'")
+
+ ;; When a language is embedded, you should initialize some variables
+ ;; just like it's done in the original mode.
+
+ ;; Comment.
+ ;; indenting settings for js-ts-mode.
+ (c-ts-common-comment-setup)
+ (setq-local comment-multi-line t)
+
+ ;; Font-lock.
+
+ ;; There are two kind of treesitter parser:
+ ;; 1. global parser
+ ;; 2. local parser
+ ;; The global parser considers each piece of text,
+ ;; in a multilingual buffer, as if it were a single buffer in its
+ ;; own language. Local parsers, on the other hand, consider each
+ ;; piece of text, in a multilingual buffer, as if they were
+ ;; separate buffers.
+ ;; In a multilingual buffer you should create only global ones.
+ ;; Local ones are created automatically.
+ ;; Warning: do not create a local parser! It may cause side
+ ;; effects that are difficult to handle.
+
+ ;; There are two types of treesitter parsers:
+ ;; 1. global parsers
+ ;; 2. local parsers
+ ;; A global parser treats each piece of text,
+ ;; in a multilingual buffer, as if it were a single buffer in its
+ ;; language. Local parser, on the other hand, treat each
+ ;; piece of text, in a multilingual buffer, as if they were separate buffers.
+ ;; In a multilingual buffer you should only create global ones.
+ ;; The local ones are created automatically.
+ ;; Warning: do not create a local parser! It may cause side effects that are difficult to handle.
Seems like a duplicate? And I want to highlight the fact that we're
talking about embedded parsers here, so I would say:
There are two ways to handle embedded code:
1. Use a single parser for all the embedded code in the buffer. In
this case, the embedded code blocks are concatenated together and are
seen as a single continuous document to the parser.
2. Each embedded code block gets its own parser. Each parser only sees
that particular code block.
If you go with 2 for a language, the local parsers are created and
destroyed automatically by Emacs. So don't create a global parser for
that embedded language here.
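A minimal sketch of the two approaches, assuming the node names used in the patch. The helper name `my-sketch-embedded-rules` is hypothetical, and the JSDoc regexp is a simplified stand-in for `js--treesit-jsdoc-beginning-regexp`:

```elisp
(require 'treesit)

;; Hypothetical helper, for illustration only.
(defun my-sketch-embedded-rules ()
  "Return range rules showing a shared parser and local parsers."
  (treesit-range-rules
   ;; Way 1: no `:local', so every <script> body is fed to one shared
   ;; javascript parser.  The mode creates that parser itself with
   ;; (treesit-parser-create 'javascript).
   :embed 'javascript
   :host 'html
   '((script_element (raw_text) @cap))

   ;; Way 2: `:local t', so each matching comment gets its own jsdoc
   ;; parser.  Emacs creates and deletes these parsers, so the mode
   ;; should not call (treesit-parser-create 'jsdoc).
   :embed 'jsdoc
   :host 'javascript
   :local t
   '(((comment) @cap (:match "\\`/\\*\\*" @cap)))))
```

Note that each query needs its own `:embed` and `:host` keywords, which is why they are repeated for the second rule.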
+
+ ;; Create the parsers, only the global one.
"only global ones", I think
+ ;; jsdoc is a local parser, don't create a parser for it.
+ (treesit-parser-create 'css)
+ (treesit-parser-create 'javascript)
+
+ ;; jsdoc is not mandatory for js-ts-mode, so we respect this by
+ ;; adding jsdoc range rules only when jsdoc is available.
+ (if (treesit-ready-p 'jsdoc t)
+ (setq-local treesit-range-settings
+ (treesit-range-rules
+ :embed 'javascript
+ :host 'html
+ :offset '(1 . -1)
+ '((script_element
+ (start_tag (tag_name))
+ (raw_text) @cap))
+
+ :embed 'jsdoc
+ :host 'javascript
+ :local t
+ `(((comment) @cap
+ (:match ,js--treesit-jsdoc-beginning-regexp @cap)))
+
+ :embed 'css
+ :host 'html
+ :offset '(1 . -1)
+ '((style_element
+ (start_tag (tag_name))
+ (raw_text) @cap))))
+ (setq-local treesit-range-settings
+ (treesit-range-rules
+ :embed 'javascript
+ :host 'html
+ :offset '(1 . -1)
+ '((script_element
+ (start_tag (tag_name))
+ (raw_text) @cap))
+
+ :embed 'css
+ :host 'html
+ :offset '(1 . -1)
+ '((style_element
+ (start_tag (tag_name))
+ (raw_text) @cap)))))
You can create range rules for each language and append them, that way
you don't need to duplicate the code here. It's just like the
font-lock settings.
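A minimal sketch of this suggestion, reusing the queries from the patch and not tested against it. The helper name `my-sketch-range-settings` is hypothetical:

```elisp
(require 'treesit)
(require 'js) ; for `js--treesit-jsdoc-beginning-regexp'

;; Hypothetical helper, for illustration only.
(defun my-sketch-range-settings ()
  "Return the range rules, adding the jsdoc rule only when available."
  (append
   (treesit-range-rules
    :embed 'javascript
    :host 'html
    :offset '(1 . -1)
    '((script_element
       (start_tag (tag_name))
       (raw_text) @cap))

    :embed 'css
    :host 'html
    :offset '(1 . -1)
    '((style_element
       (start_tag (tag_name))
       (raw_text) @cap)))
   ;; `when' yields nil if jsdoc is missing; appending nil adds nothing.
   (when (treesit-ready-p 'jsdoc t)
     (treesit-range-rules
      :embed 'jsdoc
      :host 'javascript
      :local t
      `(((comment) @cap
         (:match ,js--treesit-jsdoc-beginning-regexp @cap)))))))

;; In the mode body:
;; (setq-local treesit-range-settings (my-sketch-range-settings))
```

With this shape the JavaScript and CSS rules are written once, and the optional jsdoc rule is the only conditional part.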
+
+ ;; Many treesit functions need to know the language at-point.
+ ;; So you should define such a function.
+ (setq-local treesit-language-at-point-function #'mhtml-ts-mode--language-at-point)
+
+ ;; Indent.
+
+ ;; Since mhtml-ts-mode inherits indentation rules from html-ts-mode, js
+ ;; and css, if you want to change the offset you have to act on the
+ ;; *-offset variables defined for those languages.
+
+ ;; JavaScript and CSS must be indented relative to their code block.
+ ;; This is done by inserting a special rule before the normal
+ ;; indentation rules of these languages.
+ ;; The value of mhtml-ts-mode--js-css-indent-offset changes based on
+ ;; mhtml-ts-mode-tag-relative-indent and can be used to indent
+ ;; JavaScript and CSS code relative to the HTML that contains them,
+ ;; just like in mhtml-mode.
+ (setq-local treesit-simple-indent-rules
+ (append html-ts-mode--indent-rules
+ ;; Extended rules for js and css, to
+ ;; indent appropriately when injected
+ ;; into html
+ `((javascript ((parent-is "program")
+ mhtml-ts-mode--js-css-tag-bol
+ mhtml-ts-mode--js-css-indent-offset)
+ ,@(cdr (car js--treesit-indent-rules))))
+ `((css ((parent-is "stylesheet")
+ mhtml-ts-mode--js-css-tag-bol
+ mhtml-ts-mode--js-css-indent-offset)
+ ,@(cdr (car css--treesit-indent-rules))))))
+ ;; Navigation.
+
+ ;; This regular expression tells treesit how to match the node type
+ ;; of defun nodes.
+ ;; Used by `treesit-beginning-of-defun' and friends for
+ ;; navigations.
+ (setq-local treesit-defun-type-regexp
+ (rx (or
+ ;; Javascript
+ "class_declaration"
+ "method_definition"
+ "function_declaration"
+ "lexical_declaration"
+ ;; HTML
+ "element"
+ ;; CSS
+ "rule_set")))
You can actually define a defun "thing" in treesit-thing-setting, and
it should work the same.
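A minimal sketch of the `defun' thing definitions, assuming the same node types as the regexp in the patch and assuming navigation uses the `defun' thing when `treesit-defun-type-regexp' is not set. The variable name `my-sketch-defun-things` is hypothetical:

```elisp
(require 'rx)

;; Hypothetical variable, for illustration only.
(defvar my-sketch-defun-things
  `((html (defun "element"))
    (javascript
     (defun ,(rx (or "class_declaration"
                     "method_definition"
                     "function_declaration"
                     "lexical_declaration"))))
    (css (defun "rule_set")))
  "Per-language `defun' thing definitions mirroring the patch's regexp.")

;; In the mode body these entries would be merged with the existing
;; sexp/sentence/text entries of each language, for example:
;; (setq-local treesit-thing-settings my-sketch-defun-things)
```

Defining the thing per language keeps the JavaScript node types from being matched against HTML or CSS nodes.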
+ ;; This is for finding the defun name; it's used by IMenu as the default
+ ;; function if no specific functions are defined.
+ (setq-local treesit-defun-name-function #'mhtml-ts-mode--defun-name)
+
+ ;; Define what are 'thing' for treesit.
+ ;; 'Thing' is a symbol representing the thing, like `defun', `sexp', or
+ ;; `sentence'.
+ (setq-local treesit-thing-settings
+ `((html
+ (sexp ,(regexp-opt '("element"
+ "text"
+ "attribute"
+ "value")))
+ (sentence "tag")
+ (text ,(regexp-opt '("comment" "text"))))
+ (javascript
+ (sexp ,(js--regexp-opt-symbol js--treesit-sexp-nodes))
+ (sentence ,(js--regexp-opt-symbol js--treesit-sentence-nodes))
+ (text ,(js--regexp-opt-symbol '("comment"
+ "string_fragment"))))))
+
+ ;; Font-lock.
+
+ ;; In a multi-language scenario, font lock settings are usually a
+ ;; concatenation of language rules. As you can see, it is possible
+ ;; to extend/modify the default rule or use a different set of
+ ;; rules. See `php-ts-mode--custom-html-font-lock-settings' for more
+ ;; advanced usage.
+ (setq-local treesit-font-lock-settings
+ (append html-ts-mode--font-lock-settings
+ js--treesit-font-lock-settings
+ (append
+ ;; Rule for coloring CSS property values.
+ ;; Placed before `css--treesit-settings'
+ ;; to win against the same rule contained therein.
+ (treesit-font-lock-rules
+ :language 'css
+ :override t
+ :feature 'variable
+ '((plain_value) @mhtml-ts-mode--colorize-css-value))
+ css--treesit-settings)))
+
+ ;; Tells treesit the list of features to fontify.
+ (setq-local treesit-font-lock-feature-list mhtml-ts-mode--feature-list)
+
+ ;; Imenu
+
+ ;; Setup Imenu: if no function is specified, try to find an object
+ ;; using `treesit-defun-name-function'.
+ ;; TODO: we need to see if it is possible to extend Imenu to
+ ;; embedded languages as well.
+ (setq-local treesit-simple-imenu-settings
+ `(("Element" "\\`tag_name\\'" nil nil)))
+
+ ;; This should be the last thing to do.
+ ;; Treesit tries to find out what the primary language is, but it is better
+ ;; to say it explicitly.
Correction: multi-language modes must set the primary parser
explicitly, the auto-guessing trick only works for single-language
modes. You can also move this line next to where you created the other
parsers, for better readability.
+ (setq-local treesit-primary-parser (treesit-parser-create 'html))
+ (treesit-font-lock-recompute-features)
You don't need to call treesit-font-lock-recompute-features,
treesit-major-mode-setup will do that for you.
+ (treesit-major-mode-setup)))
+
+(when (and (treesit-ready-p 'html) (treesit-ready-p 'javascript) (treesit-ready-p 'css))
+ (add-to-list
+ 'auto-mode-alist '("\\.[sx]?html?\\(\\.[a-zA-Z_]+\\)?\\'" . mhtml-ts-mode)))
+
+(provide 'mhtml-ts-mode)
+;;; mhtml-ts-mode.el ends here
--
2.47.1
* bug#74610: 31.0.50; Submitting mhtml-ts-mode, treesitter alternative to mhtml-mode
2024-12-01 6:01 ` Yuan Fu
@ 2024-12-01 8:00 ` Eli Zaretskii
2024-12-01 8:18 ` Yuan Fu
2024-12-03 14:29 ` Vincenzo Pupillo
1 sibling, 1 reply; 13+ messages in thread
From: Eli Zaretskii @ 2024-12-01 8:00 UTC (permalink / raw)
To: Yuan Fu; +Cc: v.pupillo, 74610
> Cc: 74610@debbugs.gnu.org
> From: Yuan Fu <casouri@gmail.com>
> Date: Sat, 30 Nov 2024 22:01:21 -0800
>
>
>
> > On Nov 29, 2024, at 1:57 PM, Vincenzo Pupillo <v.pupillo@gmail.com> wrote:
> >
> > Ciao,
> > following the discussion
> > https://lists.gnu.org/archive/html/emacs-devel/2024-11/msg00079.html I would
> > like to ask if it would be possible to add to emacs this new mode for editing
> > html files alternative to mhtml-mode.
> >
> > Thank you.
> >
> > Vincenzo.<0001-Add-mhtml-ts-mode.patch>
>
> Thank you so much! This will be very helpful for others. Here’re some comments.
Yuan, I don't see any comments, only what I think is the original
patch.
* bug#74610: 31.0.50; Submitting mhtml-ts-mode, treesitter alternative to mhtml-mode
2024-12-01 8:00 ` Eli Zaretskii
@ 2024-12-01 8:18 ` Yuan Fu
0 siblings, 0 replies; 13+ messages in thread
From: Yuan Fu @ 2024-12-01 8:18 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: v.pupillo, 74610
> On Dec 1, 2024, at 12:00 AM, Eli Zaretskii <eliz@gnu.org> wrote:
>
>> Cc: 74610@debbugs.gnu.org
>> From: Yuan Fu <casouri@gmail.com>
>> Date: Sat, 30 Nov 2024 22:01:21 -0800
>>
>>
>>
>>> On Nov 29, 2024, at 1:57 PM, Vincenzo Pupillo <v.pupillo@gmail.com> wrote:
>>>
>>> Ciao,
>>> following the discussion
>>> https://lists.gnu.org/archive/html/emacs-devel/2024-11/msg00079.html I would
>>> like to ask if it would be possible to add to emacs this new mode for editing
>>> html files alternative to mhtml-mode.
>>>
>>> Thank you.
>>>
>>> Vincenzo.<0001-Add-mhtml-ts-mode.patch>
>>
>> Thank you so much! This will be very helpful for others. Here’re some comments.
>
> Yuan, I don't see any comments, only what I think is the original
> patch.
Ah, I guess I’m supposed to quote the original patch when making comments :-( The lines that aren’t prefixed by + are my comments.
Yuan
* bug#74610: 31.0.50; Submitting mhtml-ts-mode, treesitter alternative to mhtml-mode
2024-12-01 6:01 ` Yuan Fu
2024-12-01 8:00 ` Eli Zaretskii
@ 2024-12-03 14:29 ` Vincenzo Pupillo
2024-12-11 4:54 ` Yuan Fu
1 sibling, 1 reply; 13+ messages in thread
From: Vincenzo Pupillo @ 2024-12-03 14:29 UTC (permalink / raw)
To: Yuan Fu; +Cc: Eli Zaretskii, 74610
On Sunday, 1 December 2024 07:01:21 Central European Standard Time,
Yuan Fu wrote:
> It's not uncommon to see different indent offset for CSS and
> Javascript, so it's a good idea to have separate control for them.
Is the behavior the same as mhtml-mode, or would you like something like this?
<style>
z {
color: red;
}
</style>
<script>
function myFunction(p1, p2) {
return p1 * p2;
}
</script>
The mhtml-ts-mode-js-css-indent-offset variable controls only the indentation
relative to the <style> and <script> tags.
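A minimal configuration sketch of that split, assuming the embedded code keeps using the options of the single-language modes (`js-indent-level' and `css-indent-offset'); the values are only illustrative:

```elisp
;; Offset of the embedded code from its enclosing <script>/<style> tag.
(setq mhtml-ts-mode-js-css-indent-offset 2)

;; Indentation inside each embedded language comes from the
;; single-language modes' own options.
(setq js-indent-level 4)     ; JavaScript
(setq css-indent-offset 2)   ; CSS
```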
V.
* bug#74610: 31.0.50; Submitting mhtml-ts-mode, treesitter alternative to mhtml-mode
2024-11-29 21:57 bug#74610: 31.0.50; Submitting mhtml-ts-mode, treesitter alternative to mhtml-mode Vincenzo Pupillo
2024-12-01 6:01 ` Yuan Fu
@ 2024-12-04 1:27 ` Dmitry Gutov
2024-12-04 10:47 ` Vincenzo Pupillo
1 sibling, 1 reply; 13+ messages in thread
From: Dmitry Gutov @ 2024-12-04 1:27 UTC (permalink / raw)
To: Vincenzo Pupillo, 74610
On 29/11/2024 23:57, Vincenzo Pupillo wrote:
> +;; This package provides `mhtml-ts-mode' which is a major mode
> +;; for editing HTML files with embedded JavaScript and CSS.
> +;; Tree Sitter is used to parse each of these languages.
> +;;
> +;; Please note that this package requires `html-ts-mode', which
> +;; registers itself as the major mode for editing HTML.
Hi!
Do you foresee cases for when html-ts-mode would be preferred by the
user instead of this advanced mhtml-ts-mode? Or maybe the former is
better in its current shape when used by e.g. php-ts-mode?
In other words, I'm wondering why not update the existing mode with
sub-parsers rather than add a new one. html-mode had such a reason -
it's quite old, and has been used in various places the way it is now
(including multi-mode packages). But ts modes don't work too well with
multi-mode packages, not currently anyway.
* bug#74610: 31.0.50; Submitting mhtml-ts-mode, treesitter alternative to mhtml-mode
2024-12-04 1:27 ` Dmitry Gutov
@ 2024-12-04 10:47 ` Vincenzo Pupillo
2024-12-05 16:51 ` Dmitry Gutov
0 siblings, 1 reply; 13+ messages in thread
From: Vincenzo Pupillo @ 2024-12-04 10:47 UTC (permalink / raw)
To: Dmitry Gutov, 74610
Hi Dmitry,
On Wednesday, 4 December 2024 02:27:53 Central European Standard Time,
you wrote:
> On 29/11/2024 23:57, Vincenzo Pupillo wrote:
> > +;; This package provides `mhtml-ts-mode' which is a major mode
> > +;; for editing HTML files with embedded JavaScript and CSS.
> > +;; Tree Sitter is used to parse each of these languages.
> > +;;
> > +;; Please note that this package requires `html-ts-mode', which
> > +;; registers itself as the major mode for editing HTML.
>
> Hi!
>
> Do you foresee cases for when html-ts-mode would be preferred by the
> user instead of this advanced mhtml-ts-mode?
For everyday use mhtml-ts-mode is better, just like mhtml-mode (which has been
the default for html editing for a while now).
> Or maybe the former is
> better in its current shape when used by e.g. php-ts-mode?
Yes, personally I think that major modes that handle (for tree-sitters) only
one language are easier to put together at the moment. It's Lego vs.
Playmobil.
We are in an experimental phase, like all other editors.
See https://github.com/helix-editor/helix/pull/1170#issuecomment-997294090
In some ways, by having a different approach from other editors, we have a
greater degree of flexibility IMHO.
>
> In other words, I'm wondering why not update the existing mode with
> sub-parsers rather than add a new one. html-mode had such a reason -
> it's quite old, and has been used in various places the way it is now
> (including multi-mode packages). But ts modes don't work too well with
> multi-mode packages, not currently anyway.
It's something I've thought about but haven't tried yet.
One of the themes of the email thread (on emacs-devel) was to have a simple
multi language major mode that was also a sort of “user's guide.”
Vincenzo.
* bug#74610: 31.0.50; Submitting mhtml-ts-mode, treesitter alternative to mhtml-mode
2024-12-04 10:47 ` Vincenzo Pupillo
@ 2024-12-05 16:51 ` Dmitry Gutov
2024-12-06 13:39 ` Vincenzo Pupillo
0 siblings, 1 reply; 13+ messages in thread
From: Dmitry Gutov @ 2024-12-05 16:51 UTC (permalink / raw)
To: Vincenzo Pupillo, 74610
Hi Vincenzo,
On 04/12/2024 12:47, Vincenzo Pupillo wrote:
>> Do you foresee cases for when html-ts-mode would be preferred by the
>> user instead of this advanced mhtml-ts-mode?
> For everyday use mhtml-ts-mode is better, just like mhtml-mode (which has been
> the default for html editing for a while now).
>
>> Or maybe the former is
>> better in its current shape when used by e.g. php-ts-mode?
> Yes, personally I think that major modes that handle (for tree-sitters) only
> one language are easier to put together at the moment. It's Lego vs.
> Playmobil.
> We are in an experimental phase, like all other editors.
> See https://github.com/helix-editor/helix/pull/1170#issuecomment-997294090
> In some ways, by having a different approach from other editors, we have a
> greater degree of flexibility IMHO.
Makes sense, thanks.
>> In other words, I'm wondering why not update the existing mode with
>> sub-parsers rather than add a new one. html-mode had such a reason -
>> it's quite old, and has been used in various places the way it is now
>> (including multi-mode packages). But ts modes don't work too well with
>> multi-mode packages, not currently anyway.
>
> It's something I've thought about but haven't tried yet.
> One of the themes of the email thread (on emacs-devel) was to have a simple
> multi language major mode that was also a sort of “user's guide.”
I thought the updated html-ts-mode could be that mode. Anyway, good to
hear that this alternative had been given consideration.
* bug#74610: 31.0.50; Submitting mhtml-ts-mode, treesitter alternative to mhtml-mode
2024-12-05 16:51 ` Dmitry Gutov
@ 2024-12-06 13:39 ` Vincenzo Pupillo
0 siblings, 0 replies; 13+ messages in thread
From: Vincenzo Pupillo @ 2024-12-06 13:39 UTC (permalink / raw)
To: 74610, Dmitry Gutov
Ciao Dmitry,
On Thursday, 5 December 2024 17:51:55 Central European Standard Time,
Dmitry Gutov wrote:
> Hi Vincenzo,
>
> On 04/12/2024 12:47, Vincenzo Pupillo wrote:
> >> Do you foresee cases for when html-ts-mode would be preferred by the
> >> user instead of this advanced mhtml-ts-mode?
> >
> > For everyday use mhtml-ts-mode is better, just like mhtml-mode (which has
> > been the default for html editing for a while now).
> >
> >> Or maybe the former is
> >> better in its current shape when used by e.g. php-ts-mode?
> >
> > Yes, personally I think that major modes that handle (for tree-sitters)
> > only one language are easier to put together at the moment. It's Lego vs.
> > Playmobil.
> > We are in an experimental phase, like all other editors.
> > See https://github.com/helix-editor/helix/pull/1170#issuecomment-997294090
> > In some ways, by having a different approach from other editors, we have a
> > greater degree of flexibility IMHO.
>
> Makes sense, thanks.
>
> >> In other words, I'm wondering why not update the existing mode with
> >> sub-parsers rather than add a new one. html-mode had such a reason -
> >> it's quite old, and has been used in various places the way it is now
> >> (including multi-mode packages). But ts modes don't work too well with
> >> multi-mode packages, not currently anyway.
> >
> > It's something I've thought about but haven't tried yet.
> > One of the themes of the email thread (on emacs-devel) was to have a
> > simple
> > multi language major mode that was also a sort of “user's guide.”
>
> I thought the updated html-ts-mode could be that mode. Anyway, good to
> hear that this alternative had been given consideration.
Since the development of Emacs 31 has just started and this major mode is not
urgent, I will try some experiments in the next few days to see if
html-ts-mode can be modified without compromising the integration with
php-ts-mode.
Vincenzo
* bug#74610: 31.0.50; Submitting mhtml-ts-mode, treesitter alternative to mhtml-mode
2024-12-03 14:29 ` Vincenzo Pupillo
@ 2024-12-11 4:54 ` Yuan Fu
2024-12-14 10:37 ` Vincenzo Pupillo
0 siblings, 1 reply; 13+ messages in thread
From: Yuan Fu @ 2024-12-11 4:54 UTC (permalink / raw)
To: Vincenzo Pupillo; +Cc: Eli Zaretskii, 74610
> On Dec 3, 2024, at 6:29 AM, Vincenzo Pupillo <v.pupillo@gmail.com> wrote:
>
> On Sunday, 1 December 2024 07:01:21 Central European Standard Time,
> Yuan Fu wrote:
>> It's not uncommon to see different indent offset for CSS and
>> Javascript, so it's a good idea to have separate control for them.
>
> Is the behavior the same as mhtml-mode, or would you like something like this?
>
> <style>
> z {
> color: red;
> }
> </style>
> <script>
> function myFunction(p1, p2) {
> return p1 * p2;
> }
> </script>
>
> The mhtml-ts-mode-js-css-indent-offset variable controls only the indentation
> relative to the <style> and <script> tags.
Ah, I see, it’s the offset from the enclosing tag. In that case it should be fine to use a common variable.
Yuan
* bug#74610: 31.0.50; Submitting mhtml-ts-mode, treesitter alternative to mhtml-mode
2024-12-11 4:54 ` Yuan Fu
@ 2024-12-14 10:37 ` Vincenzo Pupillo
2024-12-16 17:37 ` Juri Linkov
0 siblings, 1 reply; 13+ messages in thread
From: Vincenzo Pupillo @ 2024-12-14 10:37 UTC (permalink / raw)
To: Yuan Fu; +Cc: Eli Zaretskii, 74610
[-- Attachment #1: Type: text/plain, Size: 1529 bytes --]
On Wednesday, 11 December 2024 05:54:09 Central European Standard Time,
Yuan Fu wrote:
> > On Dec 3, 2024, at 6:29 AM, Vincenzo Pupillo <v.pupillo@gmail.com> wrote:
> >
> > On Sunday, 1 December 2024 07:01:21 Central European Standard Time,
> > Yuan Fu wrote:
> >> It's not uncommon to see different indent offset for CSS and
> >> Javascript, so it's a good idea to have separate control for them.
> >
> > Is the behavior the same as mhtml-mode, or would you like something like
> > this?
> > <style>
> >
> > z {
> >
> > color: red;
> >
> > }
> >
> > </style>
> > <script>
> >
> > function myFunction(p1, p2) {
> >
> > return p1 * p2;
> >
> > }
> >
> > </script>
> >
> > The mhtml-ts-mode-js-css-indent-offset variable controls only the
> > indentation relative to the <style> and <script> tags.
>
> Ah, I see, it’s the offset from the enclosing tag. In that case it should be
> fine to use a common variable.
>
> Yuan
Thank you Yuan.
Attached is the revised patch following your previous comments.
As I already wrote to Dmitry, I am doing some tests to see if html-ts-mode can
be extended and if there is a way to integrate one multi-language mode into
another multi-language mode.
Vincenzo
[-- Attachment #2: 0001-Add-mhtml-ts-mode.patch --]
[-- Type: text/x-patch, Size: 19855 bytes --]
From 355075793eff5a58dac83756d96881b6932d5838 Mon Sep 17 00:00:00 2001
From: Vincenzo Pupillo <v.pupillo@gmail.com>
Date: Fri, 29 Nov 2024 22:48:45 +0100
Subject: [PATCH] Add mhtml-ts-mode.
New major-mode alternative to mhtml-mode, based on treesitter, for
editing files containing html, javascript and css.
* etc/NEWS: Mention the new mode.
* lisp/textmodes/mhtml-ts-mode.el: New file.
---
etc/NEWS | 8 +
lisp/textmodes/mhtml-ts-mode.el | 429 ++++++++++++++++++++++++++++++++
2 files changed, 437 insertions(+)
create mode 100644 lisp/textmodes/mhtml-ts-mode.el
diff --git a/etc/NEWS b/etc/NEWS
index 4d2a2c893d0..8f9a04dcf01 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -797,6 +797,14 @@ destination window is chosen using 'display-buffer-alist'. Example:
\f
* New Modes and Packages in Emacs 31.1
+** New major modes based on the tree-sitter library
+
++++
+*** New major mode 'mhtml-ts-mode'.
+An optional major mode based on the tree-sitter library for editing html
+files. This mode handles indentation, fontification, and commenting for
+embedded JavaScript and CSS.
+
\f
* Incompatible Lisp Changes in Emacs 31.1
diff --git a/lisp/textmodes/mhtml-ts-mode.el b/lisp/textmodes/mhtml-ts-mode.el
new file mode 100644
index 00000000000..746300efc33
--- /dev/null
+++ b/lisp/textmodes/mhtml-ts-mode.el
@@ -0,0 +1,429 @@
+;;; mhtml-ts-mode.el --- Major mode for HTML using tree-sitter -*- lexical-binding: t; -*-
+
+;; Copyright (C) 2024 Free Software Foundation, Inc.
+
+;; Author: Vincenzo Pupillo <v.pupillo@gmail.com>
+;; Maintainer: Vincenzo Pupillo <v.pupillo@gmail.com>
+;; Created: Nov 2024
+;; Keywords: HTML language tree-sitter
+
+;; This file is part of GNU Emacs.
+
+;; GNU Emacs is free software: you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; GNU Emacs is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GNU Emacs. If not, see <https://www.gnu.org/licenses/>.
+
+;;; Commentary:
+;;
+;; This package provides `mhtml-ts-mode' which is a major mode
+;; for editing HTML files with embedded JavaScript and CSS.
+;; Tree Sitter is used to parse each of these languages.
+;;
+;; Please note that this package requires `html-ts-mode', which
+;; registers itself as the major mode for editing HTML.
+;;
+;; This package is compatible and has been tested with the following
+;; tree-sitter grammars:
+;; * https://github.com/tree-sitter/tree-sitter-html
+;; * https://github.com/tree-sitter/tree-sitter-javascript
+;; * https://github.com/tree-sitter/tree-sitter-jsdoc
+;; * https://github.com/tree-sitter/tree-sitter-css
+;;
+;; Features
+;;
+;; * Indent
+;; * IMenu
+;; * Navigation
+;; * Which-function
+;; * Tree-sitter parser installation helper
+
+;;; Code:
+
+(require 'treesit)
+(require 'html-ts-mode)
+(require 'css-mode) ;; for embed css into html
+(require 'js) ;; for embed javascript into html
+
+(eval-when-compile
+ (require 'rx))
+
+;; This tells the byte-compiler where the functions are defined.
+;; This is only needed when a file needs to be able to byte-compile
+;; in an Emacs not built with the tree-sitter library.
+(treesit-declare-unavailable-functions)
+
+;; In a multi-language major mode it can be useful to have an "installer" to
+;; simplify the installation of the grammars supported by the major-mode.
+(defvar mhtml-ts-mode--language-source-alist
+ '((html . ("https://github.com/tree-sitter/tree-sitter-html" "v0.23.0"))
+ (javascript . ("https://github.com/tree-sitter/tree-sitter-javascript" "v0.23.0"))
+ (jsdoc . ("https://github.com/tree-sitter/tree-sitter-jsdoc" "v0.23.0"))
+ (css . ("https://github.com/tree-sitter/tree-sitter-css" "v0.23.0")))
+ "Treesitter language parsers required by `mhtml-ts-mode'.
+You can customize this variable if you want to stick to a specific
+commit and/or use different parsers.")
+
+(defun mhtml-ts-mode-install-parsers ()
+ "Install all the required treesitter parsers.
+`mhtml-ts-mode--language-source-alist' defines which parsers to install."
+ (interactive)
+ (let ((treesit-language-source-alist mhtml-ts-mode--language-source-alist))
+ (dolist (item mhtml-ts-mode--language-source-alist)
+ (treesit-install-language-grammar (car item)))))
+
+;;; Custom variables
+
+(defgroup mhtml-ts-mode nil
+ "Major mode for editing HTML files, based on `html-ts-mode'.
+Works with JS and CSS and for that uses `js-ts-mode' and `css-ts-mode'."
+ :prefix "mhtml-ts-mode-"
+ ;; :group 'languages
+ :group 'html)
+
+(defcustom mhtml-ts-mode-js-css-indent-offset 2
+ "JavaScript and CSS indent spaces related to the <script> and <style> HTML tags.
+By default it should have the same value as `html-ts-mode-indent-offset'."
+ :tag "HTML javascript or css indent offset"
+ :version "31.1"
+ :type 'integer
+ :safe 'integerp)
+
+(defvar mhtml-ts-mode--js-css-indent-offset
+ mhtml-ts-mode-js-css-indent-offset
+ "Internal copy of `mhtml-ts-mode-js-css-indent-offset'.
+The value is changed by `mhtml-ts-mode--tag-relative-indent-offset' according to
+the value of `mhtml-ts-mode-tag-relative-indent'.")
+
+(defun mhtml-ts-mode--tag-relative-indent-offset (sym val)
+ "Custom setter for `mhtml-ts-mode-tag-relative-indent'.
+
+Apart from setting the default value of SYM to VAL, also change the
+value of SYM in `mhtml-ts-mode' buffers to VAL. SYM should be
+`mhtml-ts-mode-tag-relative-indent', and VAL should be t, nil or
+`ignore'. When SYM is `mhtml-ts-mode-tag-relative-indent' set the
+value of `mhtml-ts-mode--js-css-indent-offset' to
+`mhtml-ts-mode-js-css-indent-offset' if VAL is t, otherwise to 0."
+ (set-default sym val)
+ (when (eq sym 'mhtml-ts-mode-tag-relative-indent)
+ (setq-local
+ mhtml-ts-mode--js-css-indent-offset
+ (if (eq val t)
+ mhtml-ts-mode-js-css-indent-offset
+ 0))))
+
+(defcustom mhtml-ts-mode-tag-relative-indent t
+ "How <script> and <style> bodies are indented relative to the tag.
+
+When t, indentation looks like:
+
+ <script>
+ code();
+ </script>
+
+When nil, indentation of the script body starts just below the
+tag, like:
+
+ <script>
+ code();
+ </script>
+
+When `ignore', the script body starts in the first column, like:
+
+ <script>
+code();
+ </script>"
+ :type '(choice (const nil) (const t) (const ignore))
+ :safe 'symbolp
+ :set #'mhtml-ts-mode--tag-relative-indent-offset
+ :version "31.1")
+
+(defcustom mhtml-ts-mode-css-fontify-colors t
+ "Whether CSS colors should be fontified using the color as the background.
+If non-nil, text representing a CSS color will be fontified
+such that its background is the color itself.
+Works like `css--fontify-region'."
+ :tag "HTML colors the CSS properties values."
+ :version "31.1"
+ :type 'boolean
+ :safe 'booleanp)
+
+;; To enable some basic treesitter functionality, you should define
+;; a function that recognizes which grammar is used at-point.
+;; This function should be assigned to `treesit-language-at-point-function'
+(defun mhtml-ts-mode--language-at-point (point)
+ "Return the language at POINT assuming the point is within a HTML buffer."
+ (let* ((node (treesit-node-at point 'html))
+ (parent (treesit-node-parent node))
+ (node-query (format "(%s (%s))"
+ (treesit-node-type parent)
+ (treesit-node-type node))))
+ (cond
+ ((string-equal "(script_element (raw_text))" node-query) 'javascript)
+ ((string-equal "(style_element (raw_text))" node-query) 'css)
+ (t 'html))))
+
+;; Custom font-lock function that's used to apply color to CSS color values.
+;; The signature of the function should conform to the signature
+;; QUERY-SPEC required by `treesit-font-lock-rules'.
+(defun mhtml-ts-mode--colorize-css-value (node override start end &rest _)
+ "Colorize CSS property value like `css--fontify-region'.
+For NODE, OVERRIDE, START, and END, see `treesit-font-lock-rules'."
+ (if (and mhtml-ts-mode-css-fontify-colors
+ (string-equal "plain_value" (treesit-node-type node)))
+ (let ((color (css--compute-color start (treesit-node-text node t))))
+ (when color
+ (with-silent-modifications
+ (add-text-properties
+ (treesit-node-start node) (treesit-node-end node)
+ (list 'face (list :background color
+ :foreground (readable-foreground-color
+ color)
+ :box '(:line-width -1)))))))
+ (treesit-fontify-with-override
+ (treesit-node-start node) (treesit-node-end node)
+ 'font-lock-variable-name-face
+ override start end)))
+
+;; Embedded languages should be indented according to the language
+;; that embeds them.
+;; This function signature complies with `treesit-simple-indent-rules'
+;; ANCHOR.
+(defun mhtml-ts-mode--js-css-tag-bol (_node _parent &rest _)
+ "Find the first non-space characters of html tags <script> or <style>.
+Return `line-beginning-position' when `treesit-node-at' is html, or
+`mhtml-ts-mode-tag-relative-indent' is equal to ignore.
+NODE and PARENT are ignored."
+ (if (or (eq (treesit-language-at (point)) 'html)
+ (eq mhtml-ts-mode-tag-relative-indent 'ignore))
+ (line-beginning-position)
+ ;; Ok, we are in js or css block.
+ (save-excursion
+ (re-search-backward "<script.*>\\|<style.*>" nil t))))
+
+;; Treesit supports 4 levels of decoration; `treesit-font-lock-level'
+;; defines which level to use. Major modes categorize their fontification
+;; features; these categories are defined by the `treesit-font-lock-rules'
+;; of each major mode using the :feature keyword.
+;; In a multi-language major mode it's a good idea to provide, for each
+;; level, the union of the :feature values of the same level.
+(defvar mhtml-ts-mode--feature-list
+ '(;; level 1
+ (;; common
+ comment definition
+ ;; JS specific
+ document
+ ;; CSS specific
+ query selector)
+ ;; level 2
+ (keyword name property string type)
+ ;; level 3
+ (;; common
+ attribute assignment constant escape-sequence
+ base-clause literal variable-name variable
+ ;; Javascript specific
+ jsx number pattern string-interpolation)
+ ;; level 4
+ (bracket delimiter error operator function)))
+
+;; In order to support `which-function-mode' we should define
+;; a function that returns the defun name.
+;; In a multilingual treesit mode, this can be implemented simply by
+;; calling language-specific functions.
+(defun mhtml-ts-mode--defun-name (node)
+ "Return the defun name of NODE.
+Return nil if there is no name or if NODE is not a defun node."
+ (let ((lang (mhtml-ts-mode--language-at-point (point))))
+ (message "lang = %s" lang)
+ (cond
+ ((eq lang 'html) (html-ts-mode--defun-name node))
+ ((eq lang 'javascript) (js--treesit-defun-name node))
+ ((eq lang 'css) (css--treesit-defun-name node)))))
+
+(define-derived-mode mhtml-ts-mode html-mode
+ '("HTML+" (:eval (let ((lang (mhtml-ts-mode--language-at-point (point))))
+ (cond ((eq lang 'html) "")
+ ((eq lang 'javascript) "JS")
+ ((eq lang 'css) "CSS")))))
+ "Major mode for editing HTML with embedded JavaScript and CSS.
+Powered by tree-sitter."
+ (if (not (and
+ (treesit-ready-p 'html)
+ (treesit-ready-p 'javascript)
+ (treesit-ready-p 'css)))
+ (error "Tree-sitter parsers for HTML, JavaScript and CSS aren't available. You can install them with M-x `mhtml-ts-mode-install-parsers'")
+
+ ;; When a language is embedded, you should initialize some variables
+ ;; just like it's done in the original mode.
+
+ ;; Comment.
+ ;; Indentation settings for js-ts-mode.
+ (c-ts-common-comment-setup)
+ (setq-local comment-multi-line t)
+
+ ;; Font-lock.
+
+ ;; There are two ways to handle embedded code:
+ ;; 1. Use a single parser for all the embedded code in the buffer. In
+ ;; this case, the embedded code blocks are concatenated together and are
+ ;; seen as a single continuous document to the parser.
+ ;; 2. Each embedded code block gets its own parser. Each parser only sees
+ ;; that particular code block.
+
+ ;; If you go with 2 for a language, the local parsers are created and
+ ;; destroyed automatically by Emacs. So don't create a global parser for
+ ;; that embedded language here.
+
+ ;; Create the parsers, only the global ones.
+ ;; jsdoc is a local parser, don't create a parser for it.
+ (treesit-parser-create 'css)
+ (treesit-parser-create 'javascript)
+
+ ;; Multi-language modes must set the primary parser.
+ (setq-local treesit-primary-parser (treesit-parser-create 'html))
+
+ (setq-local treesit-range-settings
+ (treesit-range-rules
+ :embed 'javascript
+ :host 'html
+ :offset '(1 . -1)
+ '((script_element
+ (start_tag (tag_name))
+ (raw_text) @cap))
+
+ :embed 'css
+ :host 'html
+ :offset '(1 . -1)
+ '((style_element
+ (start_tag (tag_name))
+ (raw_text) @cap))))
+
+ ;; jsdoc is not mandatory for js-ts-mode, so we respect this by
+ ;; adding jsdoc range rules only when jsdoc is available.
+ (when (treesit-ready-p 'jsdoc t)
+ (setq-local treesit-range-settings
+ (append treesit-range-settings
+ (treesit-range-rules
+ :embed 'jsdoc
+ :host 'javascript
+ :local t
+ `(((comment) @cap
+ (:match ,js--treesit-jsdoc-beginning-regexp @cap))))))
+ (setq-local c-ts-common--comment-regexp
+ (rx (or "comment" "line_comment" "block_comment" "description"))))
+
+
+ ;; Many treesit functions need to know the language at point,
+ ;; so you should define such a function.
+ (setq-local treesit-language-at-point-function #'mhtml-ts-mode--language-at-point)
+
+ ;; Indent.
+
+ ;; Since mhtml-ts-mode inherits indentation rules from html-ts-mode, js
+ ;; and css, if you want to change the offset you have to act on the
+ ;; *-offset variables defined for those languages.
+
+ ;; JavaScript and CSS must be indented relative to their code block.
+ ;; This is done by inserting a special rule before the normal
+ ;; indentation rules of these languages.
+ ;; The value of mhtml-ts-mode--js-css-indent-offset changes based on
+ ;; mhtml-ts-mode-tag-relative-indent and can be used to indent
+ ;; JavaScript and CSS code relative to the HTML that contains them,
+ ;; just like in mhtml-mode.
+ (setq-local treesit-simple-indent-rules
+ (append html-ts-mode--indent-rules
+ ;; Extended rules for js and css, to
+ ;; indent appropriately when injected
+ ;; into html
+ `((javascript ((parent-is "program")
+ mhtml-ts-mode--js-css-tag-bol
+ mhtml-ts-mode--js-css-indent-offset)
+ ,@(cdr (car js--treesit-indent-rules))))
+ `((css ((parent-is "stylesheet")
+ mhtml-ts-mode--js-css-tag-bol
+ mhtml-ts-mode--js-css-indent-offset)
+ ,@(cdr (car css--treesit-indent-rules))))))
+ ;; Navigation.
+
+ ;; This is for finding the defun name; Imenu uses it as the default
+ ;; function when no specific function is defined.
+ (setq-local treesit-defun-name-function #'mhtml-ts-mode--defun-name)
+
+ ;; Define what a 'thing' is for treesit.
+ ;; A 'thing' is a symbol representing the thing, like `defun', `sexp',
+ ;; or `sentence'.
+ ;; Alternatively, if you only need defun, you can set
+ ;; `treesit-defun-type-regexp'.
+ (setq-local treesit-thing-settings
+ `((html
+ (defun "element")
+ (sexp ,(regexp-opt '("element"
+ "text"
+ "attribute"
+ "value")))
+ (sentence "tag")
+ (text ,(regexp-opt '("comment" "text"))))
+ (javascript
+ (defun ,(rx (or "class_declaration"
+ "method_definition"
+ "function_declaration"
+ "lexical_declaration")))
+ (sexp ,(regexp-opt js--treesit-sexp-nodes 'symbols))
+ (sentence ,(regexp-opt js--treesit-sentence-nodes 'symbols))
+ (text ,(regexp-opt
+ '("comment"
+ "string_fragment")
+ 'symbols)))
+ (css
+ (defun "rule_set"))))
+
+ ;; Font-lock.
+
+ ;; In a multi-language scenario, font lock settings are usually a
+ ;; concatenation of language rules. As you can see, it is possible
+ ;; to extend/modify the default rule or use a different set of
+ ;; rules. See `php-ts-mode--custom-html-font-lock-settings' for more
+ ;; advanced usage.
+ (setq-local treesit-font-lock-settings
+ (append html-ts-mode--font-lock-settings
+ js--treesit-font-lock-settings
+ (append
+ ;; Rule for coloring CSS property values.
+ ;; Placed before `css--treesit-settings'
+ ;; to win against the same rule contained therein.
+ (treesit-font-lock-rules
+ :language 'css
+ :override t
+ :feature 'variable
+ '((plain_value) @mhtml-ts-mode--colorize-css-value))
+ css--treesit-settings)))
+
+ ;; Tells treesit the list of features to fontify.
+ (setq-local treesit-font-lock-feature-list mhtml-ts-mode--feature-list)
+
+ ;; Imenu
+
+ ;; Setup Imenu: if no function is specified, try to find an object
+ ;; using `treesit-defun-name-function'.
+ ;; TODO: we need to see if it is possible to extend Imenu to
+ ;; embedded languages as well.
+ (setq-local treesit-simple-imenu-settings
+ `(("Element" "\\`tag_name\\'" nil nil)))
+
+ (treesit-major-mode-setup)))
+
+(when (and (treesit-ready-p 'html) (treesit-ready-p 'javascript) (treesit-ready-p 'css))
+ (add-to-list
+ 'auto-mode-alist '("\\.[sx]?html?\\(\\.[a-zA-Z_]+\\)?\\'" . mhtml-ts-mode)))
+
+(provide 'mhtml-ts-mode)
+;;; mhtml-ts-mode.el ends here
--
2.47.1
* bug#74610: 31.0.50; Submitting mhtml-ts-mode, treesitter alternative to mhtml-mode
2024-12-14 10:37 ` Vincenzo Pupillo
@ 2024-12-16 17:37 ` Juri Linkov
2024-12-17 21:25 ` Vincenzo Pupillo
0 siblings, 1 reply; 13+ messages in thread
From: Juri Linkov @ 2024-12-16 17:37 UTC (permalink / raw)
To: Vincenzo Pupillo; +Cc: Yuan Fu, Eli Zaretskii, 74610
Ciao Vincenzo,
> On Wednesday, 11 December 2024 05:54:09 Central European Standard Time,
>
> Attached is the revised patch following your previous comments.
> As I already wrote to Dmitry, I am doing some tests to see if html-ts-mode can
> be extended and if there is a way to integrate one multi-language mode into
> another multi-language mode.
I'm testing your patch with mhtml-ts-mode, and everything works nicely.
At the same time I'm adding a new ts-thing named 'sexp-list' in bug#73404.
While 'sexp' defines both lists and atoms, 'sexp-list' defines only lists.
So I added (sexp-list ,(regexp-opt '("element") 'symbols))
to treesit-thing-settings in html-ts-mode.el.
But then I discovered, surprisingly, that it has no effect on mhtml-ts-mode.
The problem is that treesit-thing-settings has to be duplicated
from html-ts-mode to mhtml-ts-mode.
On the one hand, integrating the multi-language mode into the existing mode
html-ts-mode could avoid the need to duplicate treesit-thing-settings
for html.
But on the other hand, integrating mhtml-ts-mode into html-ts-mode
doesn't help to avoid such duplication for other embedded modes,
because I needed to duplicate treesit-thing-settings for javascript
as well.
So extending html-ts-mode doesn't help here, and it may even be better
to add mhtml-ts-mode to keep the symmetry with the existing mhtml-mode,
such as when used in mode remapping:
(add-to-list 'major-mode-remap-alist '(mhtml-mode . mhtml-ts-mode))
What could really help is to try to get settings from html-ts-mode
and js-ts-mode to avoid the need to duplicate settings in mhtml-ts-mode.
* bug#74610: 31.0.50; Submitting mhtml-ts-mode, treesitter alternative to mhtml-mode
2024-12-16 17:37 ` Juri Linkov
@ 2024-12-17 21:25 ` Vincenzo Pupillo
0 siblings, 0 replies; 13+ messages in thread
From: Vincenzo Pupillo @ 2024-12-17 21:25 UTC (permalink / raw)
To: Juri Linkov; +Cc: Yuan Fu, Eli Zaretskii
Ciao Juri,
On Monday, 16 December 2024 18:37:35 Central European Standard Time,
Juri Linkov wrote:
> Ciao Vincenzo,
>
> > On Wednesday, 11 December 2024 05:54:09 Central European Standard
> > Time,
> >
> > Attached is the revised patch following your previous comments.
> > As I already wrote to Dmitry, I am doing some tests to see if html-ts-mode
> > can be extended and if there is a way to integrate one multi-language
> > mode into another multi-language mode.
>
> I'm testing your patch with mhtml-ts-mode, and everything works nicely.
>
Thanks
> At the same time I'm adding a new ts-thing named 'sexp-list' in bug#73404.
> While 'sexp' defines both lists and atoms, 'sexp-list' defines only lists.
>
> So I added (sexp-list ,(regexp-opt '("element") 'symbols))
> to treesit-thing-settings in html-ts-mode.el.
>
> But then I discovered, surprisingly, that it has no effect on mhtml-ts-mode.
>
> The problem is that treesit-thing-settings has to be duplicated
> from html-ts-mode to mhtml-ts-mode.
>
> On the one hand, integrating the multi-language mode into the existing mode
> html-ts-mode could avoid the need to duplicate treesit-thing-settings
> for html.
>
> But on the other hand, integrating mhtml-ts-mode into html-ts-mode
> doesn't help to avoid such duplication for other embedded modes,
> because I needed to duplicate treesit-thing-settings for javascript
> as well.
>
> So extending html-ts-mode doesn't help here, and it may even be better
> to add mhtml-ts-mode to keep the symmetry with the existing mhtml-mode,
> such as when used in mode remapping:
>
> (add-to-list 'major-mode-remap-alist '(mhtml-mode . mhtml-ts-mode))
>
> What could really help is to try to get settings from html-ts-mode
> and js-ts-mode to avoid the need to duplicate settings in mhtml-ts-mode.
I think we need something like a generalized version of the
`mhtml--construct-submode' function. I'm doing some testing on that and hope
to have something decent after Christmas.
Vincenzo.
end of thread (newest message: 2024-12-17 21:25 UTC)
Thread overview: 13+ messages
2024-11-29 21:57 bug#74610: 31.0.50; Submitting mhtml-ts-mode, treesitter alternative to mhtml-mode Vincenzo Pupillo
2024-12-01 6:01 ` Yuan Fu
2024-12-01 8:00 ` Eli Zaretskii
2024-12-01 8:18 ` Yuan Fu
2024-12-03 14:29 ` Vincenzo Pupillo
2024-12-11 4:54 ` Yuan Fu
2024-12-14 10:37 ` Vincenzo Pupillo
2024-12-16 17:37 ` Juri Linkov
2024-12-17 21:25 ` Vincenzo Pupillo
2024-12-04 1:27 ` Dmitry Gutov
2024-12-04 10:47 ` Vincenzo Pupillo
2024-12-05 16:51 ` Dmitry Gutov
2024-12-06 13:39 ` Vincenzo Pupillo