unofficial mirror of emacs-devel@gnu.org 
* Implementation direction for shell-script-mode with tree-sitter
@ 2022-10-25 15:05 João Paulo Labegalini de Carvalho
From: João Paulo Labegalini de Carvalho @ 2022-10-25 15:05 UTC (permalink / raw)
  To: emacs-devel, Yuan Fu, Theodor Thornhill, Eli Zaretskii


Hi,

The tree-sitter-bash grammar does not include many reserved words and
builtin commands that are currently fontified by the regex-based
fontification in shell-script-mode.

Here is a list of the ones that tree-sitter-bash does not recognize:

("time" "coproc" "type" "trap" "exit" "exec" "continue" "break" "return"
"logout" "bye")

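According to the Bash Reference Manual, only "time" and "coproc" are
reserved words; most of the others (e.g. "trap", "exit", "exec", "break",
"continue", "return", "type", "logout") are shell builtins.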

Should I open a PR against tree-sitter-bash to add the missing keywords,
or should I just filter them out of the list that I obtain from the
following (and from other variables in `shell-script-mode'):

(append (sh-feature sh-leading-keywords)
        (sh-feature sh-other-keywords))
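
To make the filtering option concrete, here is a minimal sketch of what I
mean. It is only an illustration: the `my/' names are hypothetical, and the
first list is a made-up stand-in for the value of the `append' form above,
so it runs without loading sh-script.el.

```elisp
;; Sketch only.  The `my/' names are hypothetical and are not part of
;; sh-script.el; `my/all-keywords' is an illustrative stand-in for the
;; appended `sh-leading-keywords' and `sh-other-keywords' lists.
(require 'seq)

(defvar my/all-keywords
  '("if" "then" "else" "fi" "time" "while" "do" "done" "exit")
  "Stand-in keyword list used only for this example.")

(defvar my/unsupported-keywords
  '("time" "coproc" "type" "trap" "exit" "exec" "continue" "break"
    "return" "logout" "bye")
  "Words from the list above that tree-sitter-bash does not recognize.")

(defun my/filter-keywords (keywords unsupported)
  "Return KEYWORDS without the members of UNSUPPORTED, keeping their order."
  (seq-remove (lambda (kw) (member kw unsupported)) keywords))

(my/filter-keywords my/all-keywords my/unsupported-keywords)
;; => ("if" "then" "else" "fi" "while" "do" "done")
```

The filtered list is what would then be spliced into the keyword query, so
the query never mentions a token that the grammar does not define.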

I am attaching the patch so everyone can see the code and better understand
what I did. I welcome all criticism and feedback.

P.S.: I have been looking at the tree-sitter-bash grammar, and it does not
seem very complicated to extend it to recognize the missing keywords. But I
can definitely keep working independently of that.
-- 
João Paulo L. de Carvalho
Ph.D. Computer Science | IC-UNICAMP | Campinas, SP - Brazil
Postdoctoral Research Fellow | University of Alberta | Edmonton, AB - Canada
joao.carvalho@ic.unicamp.br
joao.carvalho@ualberta.ca

[-- Attachment #2: sh-script-treesit.patch --]
[-- Type: text/x-patch; charset="x-binaryenc"; name="sh-script-treesit.patch", Size: 5339 bytes --]

diff --git a/lisp/progmodes/sh-script.el b/lisp/progmodes/sh-script.el
index 558b62b20a..c7cc676843 100644
--- a/lisp/progmodes/sh-script.el
+++ b/lisp/progmodes/sh-script.el
@@ -148,6 +148,7 @@
   (require 'let-alist)
   (require 'subr-x))
 (require 'executable)
+(require 'treesit)
 
 (autoload 'comint-completion-at-point "comint")
 (autoload 'comint-filename-completion "comint")
@@ -170,6 +171,12 @@ sh-script
   :group 'sh
   :prefix "sh-")
 
+(defcustom sh-script-use-tree-sitter nil
+  "If non-nil, `sh-mode' tries to use tree-sitter.
+Currently `sh-mode' uses tree-sitter only for font-locking; imenu
+and movement functions still use the existing code."
+  :type 'boolean
+  :version "29.1")
 
 (defcustom sh-ancestor-alist
   '((ash . sh)
@@ -1534,13 +1541,24 @@ sh-mode
   ;; we can't look if previous line ended with `\'
   (setq-local comint-prompt-regexp "^[ \t]*")
   (setq-local imenu-case-fold-search nil)
-  (setq font-lock-defaults
-	`((sh-font-lock-keywords
-	   sh-font-lock-keywords-1 sh-font-lock-keywords-2)
-	  nil nil
-	  ((?/ . "w") (?~ . "w") (?. . "w") (?- . "w") (?_ . "w")) nil
-	  (font-lock-syntactic-face-function
-	   . ,#'sh-font-lock-syntactic-face-function)))
+
+  (if (and sh-script-use-tree-sitter
+           (treesit-can-enable-p))
+      (progn
+        (setq-local font-lock-keywords-only t)
+        (setq-local treesit-font-lock-feature-list
+                    '((basic) (moderate) (elaborate)))
+        (setq-local treesit-font-lock-settings
+                    sh-script--treesit-settings)
+        (treesit-font-lock-enable))
+    (setq font-lock-defaults
+          `((sh-font-lock-keywords
+             sh-font-lock-keywords-1 sh-font-lock-keywords-2)
+            nil nil
+            ((?/ . "w") (?~ . "w") (?. . "w") (?- . "w") (?_ . "w")) nil
+            (font-lock-syntactic-face-function
+             . ,#'sh-font-lock-syntactic-face-function))))
+
   (setq-local syntax-propertize-function #'sh-syntax-propertize-function)
   (add-hook 'syntax-propertize-extend-region-functions
             #'syntax-propertize-multiline 'append 'local)
@@ -3191,6 +3209,51 @@ sh-shellcheck-flymake
       (process-send-region sh--shellcheck-process (point-min) (point-max))
       (process-send-eof sh--shellcheck-process))))
 
-(provide 'sh-script)
+;;; Tree-sitter font-lock
+
+(defvar sh-script--treesit-bash-keywords
+  '("case" "do" "done" "elif" "else" "esac" "export" "fi" "for"
+    "function" "if" "in" "unset" "while" "then"))
+
+(defun sh-script--treesit-filtered-keywords (blacklist)
+  "Return the keywords of the current shell that are not in BLACKLIST.
+The keywords are taken from `sh-leading-keywords' and `sh-other-keywords'."
+  (let ((keywords (append (sh-feature sh-leading-keywords)
+                          (sh-feature sh-other-keywords)))
+        (filtered-list nil))
+    (dolist (item keywords filtered-list)
+      (unless (member item blacklist)
+        (push item filtered-list)))))
+
+(defvar sh-script--treesit-blacklisted-keywords
+  '("time" "coproc" "type" "trap" "exit" "exec" "continue" "break"
+    "return" "logout" "bye")
+  "Keywords that the tree-sitter-bash grammar does not recognize.")
+
+(defvar sh-script--treesit-settings
+  (treesit-font-lock-rules
+   :language 'bash
+   :feature 'basic
+   '(;; Queries for function, strings, comments, and heredocs
+     (function_definition name: (word) @font-lock-function-name-face)
+     (comment) @font-lock-comment-face
+     [ (string) (raw_string) (heredoc_body) (heredoc_start) ] @font-lock-string-face)
+   :language 'bash
+   :feature 'moderate
+   :override t
+   `(;; Queries for keywords and builtin commands
+     [ ,@(sh-script--treesit-filtered-keywords sh-script--treesit-blacklisted-keywords) ] @font-lock-keyword-face
+     (command name: (command_name
+      ((word) @font-lock-builtin-face
+       (:match ,(let ((builtins (sh-feature sh-builtins)))
+                  (rx-to-string
+                   `(seq bol
+                         (or ,@builtins)
+                         eol)))
+               @font-lock-builtin-face))))
+     )
+   )
+  "Tree-sitter font-lock settings.")
 
+(provide 'sh-script)
 ;;; sh-script.el ends here
