From: Theodor Thornhill
Newsgroups: gmane.emacs.devel
Subject: Re: ruby-ts-mode.el -- first draft
Date: Sun, 11 Dec 2022 09:22:28 +0100
Message-ID: <87v8mixscb.fsf@thornhill.no>
References: <065A1DE9-B9BA-4AA3-9D59-D0F5547B8824@easesoftware.com>
To: Perry Smith, emacs-devel
In-Reply-To: <065A1DE9-B9BA-4AA3-9D59-D0F5547B8824@easesoftware.com>
List-Id: "Emacs development discussions."
Xref: news.gmane.io gmane.emacs.devel:301146

Perry Smith writes:

> Ruby is a versatile language and I fear that I may have missed wide swaths of its features. So, here is the first pass. I hope folks can play with it and find the bugs.
>
> Tree sitter is so versatile that for fortification, it is practically endless the features you could add.
>
> I have a git repository here: https://github.com/pedz/ruby-ts-mode
>
> And here inline is the file:
>

[...]

> (defcustom ruby-ts-mode-indent-style 'base
>   "Style used for indentation.
>
> The selected style could be one of Ruby. If one of the supplied
> styles doesn't suffice a function could be set instead. This
> function is expected return a list that follows the form of
> `treesit-simple-indent-rules'."
>   :version "29.1"
>   :type '(choice (symbol :tag "Base" 'base)
>                  (function :tag "A function for user customized style" ignore))
>   :group 'ruby)

I believe we decided against using this indent style technique unless
we had specific styles to show. A user could just:

(add-hook 'ruby-ts-mode-hook
          (lambda ()
            (setq-local treesit-simple-indent-rules
                        my-personal-ruby-indent-rules)))

to override the current default anyway.

>
> (defcustom ruby-ts-mode-indent-style 'gnu
>   "Style used for indentation.
>
> Currently can only be set to BASE. If one of the supplied styles
> doesn't suffice a function could be set instead. This function
> is expected return a list that follows the form of
> `treesit-simple-indent-rules'."
>   :version "29.1"
>   :type '(choice (symbol :tag "Base" 'base)
>                  (function :tag "A function for user customized style" ignore))
>   :group 'ruby)

This can be removed.

>
> (defface ruby-ts-mode--constant-assignment-face
>   '((((class grayscale) (background light)) :foreground "DimGray" :slant italic)
>     (((class grayscale) (background dark)) :foreground "LightGray" :slant italic)
>     (((class color) (min-colors 88) (background light)) :foreground "VioletRed4")
>     (((class color) (min-colors 88) (background dark)) :foreground "plum2")
>     (((class color) (min-colors 16) (background light)) :foreground "RosyBrown")
>     (((class color) (min-colors 16) (background dark)) :foreground "LightSalmon")
>     (((class color) (min-colors 8)) :foreground "green")
>     (t :slant italic))
>   "Font Lock mode face used in ruby-ts-mode to highlight assignments to constants."
>   :group 'font-lock-faces)
>
> (defface ruby-ts-mode--assignment-face
>   '((((class grayscale) (background light)) :foreground "DimGray" :slant italic)
>     (((class grayscale) (background dark)) :foreground "LightGray" :slant italic)
>     (((class color) (min-colors 88) (background light)) :foreground "VioletRed4")
>     (((class color) (min-colors 88) (background dark)) :foreground "coral1")
>     (((class color) (min-colors 16) (background light)) :foreground "RosyBrown")
>     (((class color) (min-colors 16) (background dark)) :foreground "LightSalmon")
>     (((class color) (min-colors 8)) :foreground "green")
>     (t :slant italic))
>   "Font Lock mode face used in ruby-ts-mode to hightlight assignments."
>   :group 'font-lock-faces)

Are you sure we need these very specific faces? Can't we reuse any of
the provided ones?

> ;; doc/keywords.rdoc in the Ruby git repository considers these to be
> ;; reserved keywords. If these keywords are added to the list, it
> ;; causes the font-lock to stop working.
> ;;
> ;; "__ENCODING__" "__FILE__" "__LINE__" "false" "self" "super" "true"
> ;;
> ;; "nil" (which does not exhibit this issue) is also considered a
> ;; keyword but I removed it and added it as a constant.
> ;;
> (defun ruby-ts-mode--keywords (language)
>   "Ruby keywords for tree-sitter font-locking.
> Currently LANGUAGE is ignored but shoule be set to `ruby'."
>   (let ((common-keywords
>          '("BEGIN" "END" "alias" "and" "begin" "break" "case" "class"
>            "def" "defined?" "do" "else" "elsif" "end" "ensure" "for"
>            "if" "in" "module" "next" "not" "or" "redo" "rescue"
>            "retry" "return" "then" "undef" "unless" "until" "when"
>            "while" "yield")))
>     common-keywords))
>
> ;; Ideas of what could be added:
> ;; 1. The regular expressions start, end, and content could be font
> ;;    locked. Ditto for the command strings `foo`. The symbols
> ;;    inside a %s, %i, and %I could be given the "symbol" font.
> ;;    etc.
> (defun ruby-ts-mode--font-lock-settings (language)
>   "Tree-sitter font-lock settings.
> Currently LANGUAGE is ignored but should be set to `ruby'."
>   (treesit-font-lock-rules
>    :language language
>    :feature 'comment
>    `((comment) @font-lock-comment-face
>      (comment) @contextual)
>
>    :language language
>    :feature 'keyword
>    `([,@(ruby-ts-mode--keywords language)] @font-lock-keyword-face)
>
>    :language language
>    :feature 'constant
>    `((true) @font-lock-constant-face
>      (false) @font-lock-constant-face
>      (nil) @font-lock-constant-face
>      (self) @font-lock-constant-face
>      (super) @font-lock-constant-face)
>
>    ;; Before 'operator so (unary) works. (I didn't want to try
>    ;; :override)
>    :language language
>    :feature 'literal
>    `((unary ["+" "-"] [(integer) (rational) (float) (complex)]) @font-lock-number-face
>      (simple_symbol) @font-lock-number-face
>      (delimited_symbol) @font-lock-number-face
>      (integer) @font-lock-number-face
>      (float) @font-lock-number-face
>      (complex) @font-lock-number-face
>      (rational) @font-lock-number-face)
>
>    :language language
>    :feature 'operator
>    `("!" @font-lock-negation-char-face
>      [,@ruby-ts-mode--operators] @font-lock-operator-face)
>
>    :language language
>    :feature 'string
>    `((string) @font-lock-string-face
>      (string_content) @font-lock-string-face)
>
>    :language language
>    :feature 'type
>    `((constant) @font-lock-type-face)
>
>    :language language
>    :feature 'assignment
>    '((assignment
>       left: (identifier) @ruby-ts-mode--assignment-face)
>      (assignment
>       left: (left_assignment_list (identifier) @ruby-ts-mode--assignment-face))
>      (operator_assignment
>       left: (identifier) @ruby-ts-mode--assignment-face))
>
>    ;; Constant and scoped constant assignment (declaration)
>    ;; Must be enabled explicitly
>    :language language
>    :feature 'constant-assignment
>    :override t
>    `((assignment
>       left: (constant) @ruby-ts-mode--constant-assignment-face)
>      (assignment
>       left: (scope_resolution name: (constant) @ruby-ts-mode--constant-assignment-face)))
>
>    :language language
>    :feature 'function
>    '((call
>       method: (identifier) @font-lock-function-name-face)
>      (method
>       name: (identifier) @font-lock-function-name-face))
>
>    :language language
>    :feature 'variable
>    '((identifier) @font-lock-variable-name-face)
>
>    :language language
>    :feature 'error
>    '((ERROR) @font-lock-warning-face)
>
>    :feature 'escape-sequence
>    :language language
>    :override t
>    '((escape_sequence) @font-lock-escape-face)
>
>    :language language
>    :feature 'bracket
>    '((["(" ")" "[" "]" "{" "}"]) @font-lock-bracket-face)
>    )
>   )

Tuck these closing parens up onto the end of the third-to-last line.

>
> (defun ruby-ts-mode--indent-styles (language)
>   "Indent rules supported by `ruby-ts-mode'.
> Currently LANGUAGE is ignored but should be set to `ruby'"
>   (let ((common
>          `(
>            ;; Slam all top level nodes to the left margin
>            ((parent-is "program") parent 0)
>
>            ((node-is ")") parent 0)
>            ((node-is "end") grand-parent 0)
>
>            ;; method parameters with and without '('
>            ((query "(method_parameters \"(\" _ @indent)") first-sibling 1)
>            ((parent-is "method_parameters") first-sibling 0)
>
>
>            ((node-is "body_statement") parent ruby-ts-mode-indent-offset)
>            ((parent-is "body_statement") first-sibling 0)
>            ((parent-is "binary") first-sibling 0)
>
>            ;; "when" list spread across multiple lines
>            ((n-p-gp "pattern" "when" "case") (nth-sibling 1) 0)
>            ((n-p-gp nil "then" "when") grand-parent ruby-ts-mode-indent-offset)
>
>            ;; if / unless unless expressions
>            ((node-is "else") parent-bol 0)
>            ((node-is "elsif") parent-bol 0)
>            ((node-is "when") parent-bol 0)
>            ((parent-is "then") parent-bol ruby-ts-mode-indent-offset)
>            ((parent-is "else") parent-bol ruby-ts-mode-indent-offset)
>            ((parent-is "elsif") parent-bol ruby-ts-mode-indent-offset)
>
>            ;; for, while, until loops
>            ((parent-is "do") grand-parent ruby-ts-mode-indent-offset)
>
>            ;; Assignment of hash and array
>            ((n-p-gp "}" "hash" "assignment") grand-parent 0)
>            ((n-p-gp "pair" "hash" "assignment") grand-parent ruby-ts-mode-indent-offset)
>            ((n-p-gp "]" "array" "assignment") grand-parent 0)
>            ((n-p-gp ".*" "array" "assignment") grand-parent ruby-ts-mode-indent-offset)
>
>            ;; hash and array other than assignments
>            ((node-is "}") first-sibling 0)
>            ((parent-is "hash") first-sibling 1)
>            ((node-is "]") first-sibling 0)
>            ((parent-is "array") first-sibling 1)
>
>            ;; method call arguments with and without '('
>            ((query "(argument_list \"(\" _ @indent)") first-sibling 1)
>            ((parent-is "argument_list") first-sibling 0)
>
>            )))
>     `((base ,@common))))

Just return `common' directly once the indent style is removed :)

>
> (defun ruby-ts-mode--class-or-module-p (node)
>   "Predicate returns turthy if NODE is a class or module"
>   (string-match-p "class\\|module" (treesit-node-type node)))
>
> (defun ruby-ts-mode--get-name (node)
>   "Returns the text of the `name' field of NODE"
>   (treesit-node-text (treesit-node-child-by-field-name node "name")))
>
> (defun ruby-ts-mode--full-name (node)
>   "Returns the fully qualified name of NODE"
>   (let* ((name (get-name node))
>          (delimiter "#"))
>     (while (setq node (treesit-parent-until node #'ruby-ts-mode--class-or-module-p))
>       (setq name (concat (get-name node) delimiter name))
>       (setq delimiter "::"))
>     name))
>
> (defun ruby-ts-mode--imenu-helper (node)
>   "Helper for `ruby-ts-mode--imenu' converting a treesit sparse tree
> into a list of imenu ( name . pos ) nodes"
>   (let* ((ts-node (car node))
>          (subtrees (mapcan #'ruby-ts-mode--imenu-helper (cdr node)))
>          (name (when ts-node
>                  (ruby-ts-mode--full-name ts-node)))
>          (marker (when ts-node
>                    (set-marker (make-marker)
>                                (treesit-node-start ts-node)))))
>     (cond
>      ((or (null ts-node) (null name)) subtrees)
>      ;; Don't include the anonymous "class" and "module" nodes
>      ((string-match-p "(\"\\(class\\|module\\)\")"
>                       (treesit-node-string ts-node))
>       nil)
>      (subtrees
>       `((,name ,(cons name marker) ,@subtrees)))
>      (t
>       `((,name . ,marker))))))
>
> ;; For now, this is going to work like ruby-mode and return a list of
> ;; class, modules, def (methods), and alias. It is likely that this
> ;; can be rigged to be easily extended.
> (defun ruby-ts-mode--imenu ()
>   "Return Imenu alist for the current buffer."
>   (let* ((root (treesit-buffer-root-node))
>          (nodes (treesit-induce-sparse-tree root "^\\(method\\|alias\\|class\\|module\\)$")))
>     (ruby-ts-mode--imenu-helper nodes)))
>

Are you sure we don't want more granularity than this? Why is
everything in the same regexp?

> (defun ruby-ts-mode--set-indent-style (language)
>   "Helper function to set the indentation style.
> Currently LANGUAGE is ignored but should be set to `ruby'."
>   (let ((style
>          (if (functionp ruby-ts-mode-indent-style)
>              (funcall ruby-ts-mode-indent-style)
>            (pcase ruby-ts-mode-indent-style
>              ('base (alist-get 'base (ruby-ts-mode--indent-styles language)))))))
>     `((,language ,@style))))
>

Remove this too once the indent style is removed.

> (define-derived-mode ruby-ts-base-mode prog-mode "Ruby"
>   "Major mode for editing Ruby, powered by tree-sitter."
>   :syntax-table ruby-ts-mode--syntax-table
>
>   ;; Navigation.
>   (setq-local treesit-defun-type-regexp
>               (regexp-opt '("method"
>                             "singleton_method")))
>
>   ;; AFAIK, Ruby can not nest methods
>   (setq-local treesit-defun-prefer-top-level nil)
>
>   ;; Imenu.
>   (setq-local imenu-create-index-function #'ruby-ts-mode--imenu)
>
>   ;; seems like this could be defined when I know more how tree sitter
>   ;; works.
>   (setq-local which-func-functions nil)
>
>   (setq-local treesit-font-lock-feature-list
>               '(( comment definition)
>                 ( keyword preprocessor string type)
>                 ( assignment constant escape-sequence label literal property )
>                 ( bracket delimiter error function operator variable)))
>   )
>
> (define-derived-mode ruby-ts-mode ruby-ts-base-mode "Ruby"
>   "Major mode for editing Ruby, powered by tree-sitter."
>   :group 'ruby
>
>   (unless (treesit-ready-p 'ruby)
>     (error "Tree-sitter for Ruby isn't available"))
>
>   (treesit-parser-create 'ruby)
>
>   ;; Comments.
>   (setq-local comment-start "# ")
>   (setq-local comment-end "")
>   (setq-local comment-start-skip "#+ *")
>
>   (setq indent-tabs-mode ruby-ts-indent-tabs-mode)
>
>   (setq-local treesit-simple-indent-rules
>               (ruby-ts-mode--set-indent-style 'ruby))
>
>   ;; Font-lock.
>   (setq-local treesit-font-lock-settings (ruby-ts-mode--font-lock-settings 'ruby))
>
>   (treesit-major-mode-setup))
>
> ;; end of ruby-ts-mode.el

When this is ready, please also add an entry to the NEWS file, along
with an update to the build script in
'admin/notes/tree-sitter/build-module/', so that we can get the Ruby
language grammar installed easily!

Thanks for your effort!

Theo
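
P.S. To make the indentation comments concrete, here is an untested
sketch of how the rules might look once the style option is gone. The
names `ruby-ts-mode--indent-rules' and `ruby-ts-mode--setup-indent' are
placeholders made up for illustration, and the rule list is cut down to
the first three rules from the `common' list quoted above:

```elisp
;; Untested sketch; both names below are hypothetical placeholders.
(defvar ruby-ts-mode--indent-rules
  '((ruby
     ;; Slam all top level nodes to the left margin
     ((parent-is "program") parent 0)
     ((node-is ")") parent 0)
     ((node-is "end") grand-parent 0)
     ;; The remaining rules from `common' would follow here unchanged.
     ))
  "Tree-sitter indent rules for `ruby-ts-mode'.")

(defun ruby-ts-mode--setup-indent ()
  "Install `ruby-ts-mode--indent-rules' in the current buffer."
  (setq-local treesit-simple-indent-rules ruby-ts-mode--indent-rules))
```

The mode body could then call the setup function (or inline the
`setq-local') instead of going through `ruby-ts-mode--set-indent-style',
and users who want something else can still override the variable from
a hook.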