From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail
From: Theodor Thornhill
Newsgroups: gmane.emacs.devel
Subject: Re: Tree-sitter integration on feature/tree-sitter
Date: Sun, 08 May 2022 11:02:10 +0200
Message-ID: <87y1zciei5.fsf@thornhill.no>
References: <93b1d7c7-65fa-4c14-adad-b0e224bbf01c@email.android.com> <83k0awwlx5.fsf@gnu.org>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
Cc: casouri@gmail.com, emacs-devel@gnu.org
To: Eli Zaretskii
In-Reply-To: <83k0awwlx5.fsf@gnu.org>
X-BeenThere: emacs-devel@gnu.org
Precedence: list
List-Id: "Emacs development discussions."
Xref: news.gmane.io gmane.emacs.devel:289456

--=-=-=
Content-Type: text/plain

>
> Yes, why not?
>
>> I'll just put it behind a "js-mode-use-treesit-p" defcustom or something like that?
>
> Something like that, yes.

Ok, see the attached patch.  This makes the normal js-mode support tree
sitter.  Some caveats:

1. You need to install the tree sitter parser.  Use Yuan's
   tree-sitter-module [1] project for this.
2. Put the javascript grammar inside ~/.emacs.d/tree-sitter/
3. That should be it.
4. No wait, you need to set 'js-use-treesit-p' to 't' for this to work :)

This should yield decent indentation and syntax highlighting, and should
be sufficient for daily usage, I believe.  There are surely many things
that can be improved, such as navigation.  For now, beginning-of-defun
is only supported when point is inside a function, but this is easy to
extend.  However, I'm not completely sold on the best way to deal with
that.  Suggestions welcome here.

Anyways.  Please try it out and report what you think.  This is just a
quick "look how easy it is to implement things using tree sitter" demo,
but I think it is a good starting point.

All the best,
Theodor Thornhill

--=-=-=
Content-Type: text/x-diff
Content-Disposition: attachment; filename=0001-Add-tree-sitter-functionality-to-js-mode.patch

>From 1031bcf9af23d7c74af00f6132acc27756cc7721 Mon Sep 17 00:00:00 2001
From: Theodor Thornhill
Date: Sun, 8 May 2022 10:52:56 +0200
Subject: [PATCH] Add tree sitter functionality to js-mode

* lisp/progmodes/js.el (js-use-treesit-p): New defcustom to control
whether to use tree sitter or not.
(js-treesit-backward-up-list): Utility function to find the scope when
no node is found.
(js-treesit-indent-rules): Rules for the simple indent engine.
(js-treesit-font-lock-settings-1): Queries for font locking.  Only one
level thus far.
(js-treesit-move-to-node, js-treesit-beginning-of-defun)
(js-treesit-end-of-defun): Utility functions to find a function from
point.  Only supports function thus far.
(js-treesit-enable): Function to enable tree sitter functionality.
(js-mode): Wrap the js-use-treesit-p defcustom around mode
initialization so that we can choose the implementation to use.
---
 lisp/progmodes/js.el | 391 ++++++++++++++++++++++++++++++++++---------
 1 file changed, 311 insertions(+), 80 deletions(-)

diff --git a/lisp/progmodes/js.el b/lisp/progmodes/js.el
index 9c1358e466..cc00f4a7e4 100644
--- a/lisp/progmodes/js.el
+++ b/lisp/progmodes/js.el
@@ -3404,6 +3404,235 @@ js-jsx--detect-after-change
 (c-lang-defconst c-paragraph-start
   js-mode "\\(@[[:alpha:]]+\\>\\|$\\)")
 
+;;; Tree sitter integration
+(defcustom js-use-treesit-p nil
+  "Use tree sitter for font locking, indentation and navigation."
+  :version "29.1"
+  :type 'boolean
+  :safe 'booleanp)
+
+(defun js-treesit-backward-up-list ()
+  (lambda (node parent bol &rest _)
+    (save-excursion
+      (backward-up-list 1 nil t)
+      (goto-char
+       (treesit-node-start
+        (treesit-node-at (point) (point) 'javascript)))
+      (back-to-indentation)
+      (point))))
+
+(defvar js-treesit-indent-rules
+  `((javascript
+     (no-node (js-treesit-backward-up-list) ,js-indent-level)
+     ((node-is "}") parent-bol 0)
+     ((node-is ")") parent-bol 0)
+     ((node-is "]") parent-bol 0)
+     ((node-is ">") parent-bol 0)
+     ((node-is ".") parent-bol ,js-indent-level)
+     ((parent-is "named_imports") parent-bol ,js-indent-level)
+     ((parent-is "statement_block") parent-bol ,js-indent-level)
+     ((parent-is "variable_declarator") parent-bol ,js-indent-level)
+     ((parent-is "arguments") parent-bol ,js-indent-level)
+     ((parent-is "array") parent-bol ,js-indent-level)
+     ((parent-is "formal_parameters") parent-bol ,js-indent-level)
+     ((parent-is "template_substitution") parent-bol ,js-indent-level)
+     ((parent-is "object_pattern") parent-bol ,js-indent-level)
+     ((parent-is "object") parent-bol ,js-indent-level)
+     ((parent-is "arrow_function") parent-bol ,js-indent-level)
+     ((parent-is "parenthesized_expression") parent-bol ,js-indent-level)
+
+     ;; JSX
+     ((parent-is "jsx_opening_element") parent ,js-indent-level)
+     ((node-is "jsx_closing_element") parent 0)
+     ((node-is "jsx_text") parent ,js-indent-level)
+     ((parent-is "jsx_element") parent ,js-indent-level)
+     ;; TODO(Theo): This one is a little off.  Meant to hit the dangling '/' in
+     ;; a jsx-element.  But it is also division operator...
+     ((node-is "/") parent 0)
+     ((parent-is "jsx_self_closing_element") parent ,js-indent-level))))
+
+(defvar js-treesit-font-lock-settings-1
+  '((javascript
+     (
+      ((identifier) @font-lock-constant-face
+       (:match "^[A-Z_][A-Z_0-9]*$" @font-lock-constant-face))
+
+      (new_expression
+       constructor: (identifier) @font-lock-type-face)
+
+      (function
+       name: (identifier) @font-lock-function-name-face)
+
+      (function_declaration
+       name: (identifier) @font-lock-function-name-face)
+
+      (method_definition
+       name: (property_identifier) @font-lock-function-name-face)
+
+      (variable_declarator
+       name: (identifier) @font-lock-function-name-face
+       value: [(function) (arrow_function)])
+
+      (variable_declarator
+       name: (array_pattern (identifier) (identifier) @font-lock-function-name-face)
+       value: (array (number) (function)))
+
+      (assignment_expression
+       left: [(identifier) @font-lock-function-name-face
+              (member_expression property: (property_identifier) @font-lock-function-name-face)]
+       right: [(function) (arrow_function)])
+
+      (call_expression
+       function: [(identifier) @font-lock-function-name-face
+                  (member_expression
+                   property: (property_identifier) @font-lock-function-name-face)])
+
+      (variable_declarator
+       name: (identifier) @font-lock-variable-name-face)
+
+      (assignment_expression
+       left: [(identifier) @font-lock-variable-name-face
+              (member_expression property: (property_identifier) @font-lock-variable-name-face)])
+
+      (for_in_statement
+       left: (identifier) @font-lock-variable-name-face)
+
+      (arrow_function
+       parameter: (identifier) @font-lock-variable-name-face)
+
+      (arrow_function
+       parameters: [(_ (identifier) @font-lock-variable-name-face)
+                    (_ (_ (identifier) @font-lock-variable-name-face))
+                    (_ (_ (_ (identifier) @font-lock-variable-name-face)))])
+
+
+      (pair key: (property_identifier) @font-lock-variable-name-face)
+
+      (pair value: (identifier) @font-lock-variable-name-face)
+
+      (pair
+       key: (property_identifier) @font-lock-function-name-face
+       value: [(function) (arrow_function)])
+
+      ((shorthand_property_identifier) @font-lock-variable-name-face)
+
+      (pair_pattern key: (property_identifier) @font-lock-variable-name-face)
+
+      ((shorthand_property_identifier_pattern) @font-lock-variable-name-face)
+
+      (array_pattern (identifier) @font-lock-variable-name-face)
+
+      (jsx_opening_element [(nested_identifier (identifier)) (identifier)] @font-lock-function-name-face)
+      (jsx_closing_element [(nested_identifier (identifier)) (identifier)] @font-lock-function-name-face)
+      (jsx_self_closing_element [(nested_identifier (identifier)) (identifier)] @font-lock-function-name-face)
+      (jsx_attribute (property_identifier) @font-lock-constant-face)
+
+      [(this) (super)] @font-lock-keyword-face
+
+      [(true) (false) (null)] @font-lock-constant-face
+      ;; (regex pattern: (regex_pattern))
+      (number) @font-lock-constant-face
+
+      (string) @font-lock-string-face
+
+      ;; template strings need to be last in the file for embedded expressions
+      ;; to work properly
+      (template_string) @font-lock-string-face
+
+      (template_substitution
+       "${" @font-lock-constant-face
+       (_)
+       "}" @font-lock-constant-face
+       )
+
+      ["as"
+       "async"
+       "await"
+       "break"
+       "case"
+       "catch"
+       "class"
+       "const"
+       "continue"
+       "debugger"
+       "default"
+       "delete"
+       "do"
+       "else"
+       "export"
+       "extends"
+       "finally"
+       "for"
+       "from"
+       "function"
+       "get"
+       "if"
+       "import"
+       "in"
+       "instanceof"
+       "let"
+       "new"
+       "of"
+       "return"
+       "set"
+       "static"
+       "switch"
+       "switch"
+       "target"
+       "throw"
+       "try"
+       "typeof"
+       "var"
+       "void"
+       "while"
+       "with"
+       "yield"] @font-lock-keyword-face
+
+      (comment) @font-lock-comment-face
+      ))))
+
+(defun js-treesit-move-to-node (fn)
+  (when-let ((found-node (treesit-parent-until
+                          (treesit-node-at (point) (point) 'javascript)
+                          (lambda (parent)
+                            (let ((parent-type (treesit-node-type parent)))
+                              (or (equal "function_declaration" parent-type)
+                                  ;;; More declarations here
+                                  ))))))
+    (goto-char (funcall fn found-node))))
+
+(defun js-treesit-beginning-of-defun (&optional arg)
+  (js-treesit-move-to-node #'treesit-node-start))
+
+(defun js-treesit-end-of-defun (&optional arg)
+  (js-treesit-move-to-node #'treesit-node-end))
+
+
+(defun js-treesit-enable ()
+  (unless (and (treesit-should-enable-p)
+               (treesit-language-available-p 'javascript))
+    (error "Tree sitter isn't available"))
+
+  ;; Comments
+  (setq-local comment-start "// ")
+  (setq-local comment-start-skip "\\(?://+\\|/\\*+\\)\\s *")
+  (setq-local comment-end "")
+
+  (treesit-get-parser-create 'javascript)
+  (setq-local treesit-simple-indent-rules js-treesit-indent-rules)
+  (setq-local indent-line-function #'treesit-indent)
+  (setq-local beginning-of-defun-function #'js-treesit-beginning-of-defun)
+  (setq-local end-of-defun-function #'js-treesit-end-of-defun)
+
+  ;; This needs to be non-nil, because reasons
+  (unless font-lock-defaults
+    (setq font-lock-defaults '(nil t)))
+
+  (setq-local treesit-font-lock-defaults
+              '((js-treesit-font-lock-settings-1)))
+
+  (treesit-font-lock-enable))
+
 ;;; Main Function
 
 ;;;###autoload
@@ -3411,86 +3640,88 @@ js-mode
   "Major mode for editing JavaScript."
   :group 'js
   ;; Ensure all CC Mode "lang variables" are set to valid values.
-  (c-init-language-vars js-mode)
-  (setq-local indent-line-function #'js-indent-line)
-  (setq-local beginning-of-defun-function #'js-beginning-of-defun)
-  (setq-local end-of-defun-function #'js-end-of-defun)
-  (setq-local open-paren-in-column-0-is-defun-start nil)
-  (setq-local font-lock-defaults
-              (list js--font-lock-keywords nil nil nil nil
-                    '(font-lock-syntactic-face-function
-                      . js-font-lock-syntactic-face-function)))
-  (setq-local syntax-propertize-function #'js-syntax-propertize)
-  (add-hook 'syntax-propertize-extend-region-functions
-            #'syntax-propertize-multiline 'append 'local)
-  (add-hook 'syntax-propertize-extend-region-functions
-            #'js--syntax-propertize-extend-region 'append 'local)
-  (setq-local prettify-symbols-alist js--prettify-symbols-alist)
-
-  (setq-local parse-sexp-ignore-comments t)
-  (setq-local which-func-imenu-joiner-function #'js--which-func-joiner)
-
-  ;; Comments
-  (setq-local comment-start "// ")
-  (setq-local comment-start-skip "\\(?://+\\|/\\*+\\)\\s *")
-  (setq-local comment-end "")
-  (setq-local fill-paragraph-function #'js-fill-paragraph)
-  (setq-local normal-auto-fill-function #'js-do-auto-fill)
-
-  ;; Parse cache
-  (add-hook 'before-change-functions #'js--flush-caches t t)
-
-  ;; Frameworks
-  (js--update-quick-match-re)
-
-  ;; Syntax extensions
-  (unless (js-jsx--detect-and-enable)
-    (add-hook 'after-change-functions #'js-jsx--detect-after-change nil t))
-  (js-use-syntactic-mode-name)
-
-  ;; Imenu
-  (setq imenu-case-fold-search nil)
-  (setq imenu-create-index-function #'js--imenu-create-index)
-
-  ;; for filling, pretend we're cc-mode
-  (c-foreign-init-lit-pos-cache)
-  (add-hook 'before-change-functions #'c-foreign-truncate-lit-pos-cache nil t)
-  (setq-local comment-line-break-function #'c-indent-new-comment-line)
-  (setq-local comment-multi-line t)
-  (setq-local electric-indent-chars
-              (append "{}():;," electric-indent-chars)) ;FIXME: js2-mode adds "[]*".
-  (setq-local electric-layout-rules
-              '((?\; . after) (?\{ . after) (?\} . before)))
-
-  (let ((c-buffer-is-cc-mode t))
-    ;; FIXME: These are normally set by `c-basic-common-init'.  Should
-    ;; we call it instead?  (Bug#6071)
-    (make-local-variable 'paragraph-start)
-    (make-local-variable 'paragraph-separate)
-    (make-local-variable 'paragraph-ignore-fill-prefix)
-    (make-local-variable 'adaptive-fill-mode)
-    (make-local-variable 'adaptive-fill-regexp)
-    ;; While the full CC Mode style system is not yet in use, set the
-    ;; pertinent style variables manually.
-    (c-initialize-builtin-style)
-    (let ((style (cc-choose-style-for-mode 'js-mode c-default-style)))
-      (c-set-style style))
-    (setq c-block-comment-prefix "* "
-          c-comment-prefix-regexp "//+\\|\\**")
-    (c-setup-paragraph-variables))
-
-  ;; Important to fontify the whole buffer syntactically! If we don't,
-  ;; then we might have regular expression literals that aren't marked
-  ;; as strings, which will screw up parse-partial-sexp, scan-lists,
-  ;; etc.  and produce maddening "unbalanced parenthesis" errors.
-  ;; When we attempt to find the error and scroll to the portion of
-  ;; the buffer containing the problem, JIT-lock will apply the
-  ;; correct syntax to the regular expression literal and the problem
-  ;; will mysteriously disappear.
-  ;; FIXME: We should instead do this fontification lazily by adding
-  ;; calls to syntax-propertize wherever it's really needed.
-  ;;(syntax-propertize (point-max))
-  )
+  (if js-use-treesit-p
+      (js-treesit-enable)
+    (c-init-language-vars js-mode)
+    (setq-local indent-line-function #'js-indent-line)
+    (setq-local beginning-of-defun-function #'js-beginning-of-defun)
+    (setq-local end-of-defun-function #'js-end-of-defun)
+    (setq-local open-paren-in-column-0-is-defun-start nil)
+    (setq-local font-lock-defaults
+                (list js--font-lock-keywords nil nil nil nil
+                      '(font-lock-syntactic-face-function
+                        . js-font-lock-syntactic-face-function)))
+    (setq-local syntax-propertize-function #'js-syntax-propertize)
+    (add-hook 'syntax-propertize-extend-region-functions
+              #'syntax-propertize-multiline 'append 'local)
+    (add-hook 'syntax-propertize-extend-region-functions
+              #'js--syntax-propertize-extend-region 'append 'local)
+    (setq-local prettify-symbols-alist js--prettify-symbols-alist)
+
+    (setq-local parse-sexp-ignore-comments t)
+    (setq-local which-func-imenu-joiner-function #'js--which-func-joiner)
+
+    ;; Comments
+    (setq-local comment-start "// ")
+    (setq-local comment-start-skip "\\(?://+\\|/\\*+\\)\\s *")
+    (setq-local comment-end "")
+    (setq-local fill-paragraph-function #'js-fill-paragraph)
+    (setq-local normal-auto-fill-function #'js-do-auto-fill)
+
+    ;; Parse cache
+    (add-hook 'before-change-functions #'js--flush-caches t t)
+
+    ;; Frameworks
+    (js--update-quick-match-re)
+
+    ;; Syntax extensions
+    (unless (js-jsx--detect-and-enable)
+      (add-hook 'after-change-functions #'js-jsx--detect-after-change nil t))
+    (js-use-syntactic-mode-name)
+
+    ;; Imenu
+    (setq imenu-case-fold-search nil)
+    (setq imenu-create-index-function #'js--imenu-create-index)
+
+    ;; for filling, pretend we're cc-mode
+    (c-foreign-init-lit-pos-cache)
+    (add-hook 'before-change-functions #'c-foreign-truncate-lit-pos-cache nil t)
+    (setq-local comment-line-break-function #'c-indent-new-comment-line)
+    (setq-local comment-multi-line t)
+    (setq-local electric-indent-chars
+                (append "{}():;," electric-indent-chars)) ;FIXME: js2-mode adds "[]*".
+    (setq-local electric-layout-rules
+                '((?\; . after) (?\{ . after) (?\} . before)))
+
+    (let ((c-buffer-is-cc-mode t))
+      ;; FIXME: These are normally set by `c-basic-common-init'.  Should
+      ;; we call it instead?  (Bug#6071)
+      (make-local-variable 'paragraph-start)
+      (make-local-variable 'paragraph-separate)
+      (make-local-variable 'paragraph-ignore-fill-prefix)
+      (make-local-variable 'adaptive-fill-mode)
+      (make-local-variable 'adaptive-fill-regexp)
+      ;; While the full CC Mode style system is not yet in use, set the
+      ;; pertinent style variables manually.
+      (c-initialize-builtin-style)
+      (let ((style (cc-choose-style-for-mode 'js-mode c-default-style)))
+        (c-set-style style))
+      (setq c-block-comment-prefix "* "
+            c-comment-prefix-regexp "//+\\|\\**")
+      (c-setup-paragraph-variables))
+
+    ;; Important to fontify the whole buffer syntactically! If we don't,
+    ;; then we might have regular expression literals that aren't marked
+    ;; as strings, which will screw up parse-partial-sexp, scan-lists,
+    ;; etc.  and produce maddening "unbalanced parenthesis" errors.
+    ;; When we attempt to find the error and scroll to the portion of
+    ;; the buffer containing the problem, JIT-lock will apply the
+    ;; correct syntax to the regular expression literal and the problem
+    ;; will mysteriously disappear.
+    ;; FIXME: We should instead do this fontification lazily by adding
+    ;; calls to syntax-propertize wherever it's really needed.
+    ;;(syntax-propertize (point-max))
+    ))
 
 ;; Since we made JSX support available and automatically-enabled in
 ;; the base `js-mode' (for ease of use), now `js-jsx-mode' simply
-- 
2.25.1


--=-=-=--
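
For anyone trying the patch, the setup steps from the message can be captured in an init file roughly as follows.  This is only a sketch under stated assumptions: Emacs is built from the feature/tree-sitter branch with the patch applied, and the JavaScript grammar is already installed under ~/.emacs.d/tree-sitter/.  The names `js-use-treesit-p' and `treesit-language-available-p' are taken from the patch itself; nothing here is part of a released Emacs API at the time of the message.

```elisp
;; Sketch only: assumes a feature/tree-sitter build with the patch
;; applied and the JavaScript grammar installed as described in step 2.
(when (and (fboundp 'treesit-language-available-p)
           (treesit-language-available-p 'javascript))
  ;; Opt in to the tree sitter code path; `js-mode' checks this
  ;; variable each time the mode is turned on in a buffer.
  (setq js-use-treesit-p t))
```

Buffers that are already in js-mode would need the mode re-enabled (M-x js-mode) before the setting takes effect, since the variable is only consulted during mode initialization.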