unofficial mirror of emacs-devel@gnu.org 
From: "Jostein Kjønigsen" <jostein@secure.kjonigsen.net>
To: Randy Taylor <dev@rjt.dev>
Cc: Yuan Fu <casouri@gmail.com>, Eli Zaretskii <eliz@gnu.org>,
	Juri Linkov <juri@linkov.net>, emacs-devel <emacs-devel@gnu.org>,
	theo@thornhill.no
Subject: Re: toml-ts-mode: first draft
Date: Tue, 13 Dec 2022 21:43:28 +0100
Message-ID: <f764696a-2850-640d-7143-a3d0dda37894@secure.kjonigsen.net>
In-Reply-To: <haz3VzxS_89mRdi4Dxp7Eh-Xq2n9trPEzuxE6lPtzQaAkjCOvv43yTkH3CHa51xvZoLEVgrumSZgW6uLiRzw-Y6rFqsB9gHjCL_JqotZVh8=@rjt.dev>


[-- Attachment #1.1: Type: text/plain, Size: 2076 bytes --]

On 12.12.2022 22:17, Randy Taylor wrote:
> Looks good! A few silly nits:

Thanks for the constructive feedback!

> - It would be nice to keep batch.sh alphabetized (so maybe move 
> typescript while you're there).
Did this. I saw bash was missing, so I added that too. It's unrelated to 
TOML, but I hope it can pass :)
> - Most modes put a newline between features in their font-lock rules 
> definition. I think we should stick to that.
Done
> - I think comment should be moved out of pair to its own feature.
Done
> - For features like 'number, I like to group them (e.g. [(int) 
> (float)]), then you only need to specify @font-lock-number-face once.
Done
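
To illustrate the grouping suggested above: in a tree-sitter query written as an s-expression, a bracketed alternation lets one capture name cover several node types. This is a minimal sketch that reuses two node names from the attached patch. The variable names are hypothetical and only hold the query data for comparison; they are not part of the patch.

```elisp
;; Hypothetical names; these variables only hold query data for comparison.

;; Ungrouped: the capture name is repeated for every node type.
(defvar my-toml-number-query-ungrouped
  '((integer) @font-lock-number-face
    (float) @font-lock-number-face))

;; Grouped: [ ... ] is an alternation, so one capture covers both.
(defvar my-toml-number-query-grouped
  '([(integer) (float)] @font-lock-number-face))
```

Either form could be passed as the query argument after a `:feature 'number` entry in `treesit-font-lock-rules`; the grouped form simply avoids repeating the face.
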
> - ;;(setq global-toml-node (treesit-buffer-root-node)) seems like this 
> was leftover debugging to be removed?
Oops. Fixed.
> - treesit-font-lock-feature-list should have 4 levels, and delimiter 
> and error should probably go in the 4th one (side note, we should all 
> figure out the "final" list of general features and which levels they 
> belong to). The first level should maybe just be comment on its own, 
> the rest looks good to me.
Done.
> - Indentation support for multi-line arrays would be nice (and maybe 
> even follow the indentation of the previous line if that's not too 
> hard and doesn't cause everything to blow up?)

I was fine with all this until you started mentioning indentation... :D

I gave it a try though, and what we have provides a customizable 
indentation-level, which is applied to multiline strings and 
array-values.  (Indentation was never my "forte" if you like, and I 
haven't figured out an obvious way to make it follow previous line's 
indentation though.)

If it's OK with you, I would like to leave the indentation ambitions 
at what is implemented for now.
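
For reference, one possible direction for the "follow the previous line" idea, as an untested sketch rather than something in the attached patch. It assumes the `match` matcher and the `prev-line` anchor from `treesit-simple-indent-presets`, and it assumes that the first value in an array is the child at index 1 (right after the "[" token), which has not been checked against the TOML grammar. The variable name is hypothetical.

```elisp
;; Untested sketch; the variable name is hypothetical.
;; Each rule has the form (MATCHER ANCHOR OFFSET).
(defvar my-toml-indent-rules-sketch
  '((toml
     ;; Closing bracket lines up with the line that opened the array.
     ((node-is "]") parent-bol 0)
     ;; First child after "[" (assumed child index 1): one step in
     ;; from the opening line.  The offset 2 mirrors the patch default.
     ((match nil "array" nil 1 1) parent-bol 2)
     ;; Later children: reuse the previous line's indentation.
     ((parent-is "array") prev-line 0))))
```

Rules are tried in order, so the more specific first-child rule has to come before the general `parent-is` rule. A list like this would be installed the same way the patch does it, by setting `treesit-simple-indent-rules` buffer-locally in the mode body.
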

Aaand...

With that said... That should (to the best of my knowledge) address 
everything you requested, and IMO that makes it a nice upgrade from the 
last patch.

Attached is a patch with all changes combined up until now.

Anything else you (or anyone else) think should be fixed up?

--
Jostein


[-- Attachment #2: 0005-Introduce-support-for-TOML-config-format.patch --]
[-- Type: text/x-patch, Size: 7863 bytes --]

From bfb5cc253faf9ed9f7c9256df035200debaf931c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jostein=20Kj=C3=B8nigsen?= <jostein@kjonigsen.net>
Date: Sun, 11 Dec 2022 13:05:29 +0100
Subject: [PATCH 5/5] Introduce support for TOML config-format

This commit introduces support for the semi-popular TOML
config-format[1] through a new major-mode: toml-ts-mode.

I've read through the full spec[2], and from what I can see this
major-mode should provide correct syntax-highlighting for every sort of
config-declaration which adheres to the specification.

Besides that it also adds support for imenu and basic tree-sitter
based navigation.

[1] https://toml.io/en/
[2] https://toml.io/en/v1.0.0
---
 admin/notes/tree-sitter/build-module/batch.sh |   4 +-
 lisp/textmodes/toml-ts-mode.el                | 188 ++++++++++++++++++
 2 files changed, 191 insertions(+), 1 deletion(-)
 create mode 100644 lisp/textmodes/toml-ts-mode.el

diff --git a/admin/notes/tree-sitter/build-module/batch.sh b/admin/notes/tree-sitter/build-module/batch.sh
index 6dce000caa6..2b8367fe6db 100755
--- a/admin/notes/tree-sitter/build-module/batch.sh
+++ b/admin/notes/tree-sitter/build-module/batch.sh
@@ -1,6 +1,7 @@
 #!/bin/bash
 
 languages=(
+    'bash'
     'c'
     'cpp'
     'css'
@@ -12,8 +13,9 @@ languages=
     'json'
     'python'
     'rust'
-    'typescript'
+    'toml'
     'tsx'
+    'typescript'
 )
 
 for language in "${languages[@]}"
diff --git a/lisp/textmodes/toml-ts-mode.el b/lisp/textmodes/toml-ts-mode.el
new file mode 100644
index 00000000000..c0a6fe9c0b0
--- /dev/null
+++ b/lisp/textmodes/toml-ts-mode.el
@@ -0,0 +1,188 @@
+;;; toml-ts-mode.el --- tree-sitter support for TOML  -*- lexical-binding: t; -*-
+
+;; Copyright (C) 2022 Free Software Foundation, Inc.
+
+;; Author     : Jostein Kjønigsen <jostein@kjonigsen.net>
+;; Maintainer : Jostein Kjønigsen <jostein@kjonigsen.net>
+;; Created    : December 2022
+;; Keywords   : toml languages tree-sitter
+
+;; This file is part of GNU Emacs.
+
+;; GNU Emacs is free software: you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; GNU Emacs is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GNU Emacs.  If not, see <https://www.gnu.org/licenses/>.
+
+;;; Commentary:
+;;
+
+;;; Code:
+
+(require 'treesit)
+
+(declare-function treesit-parser-create "treesit.c")
+(declare-function treesit-induce-sparse-tree "treesit.c")
+(declare-function treesit-node-start "treesit.c")
+(declare-function treesit-node-child-by-field-name "treesit.c")
+
+(defcustom toml-ts-mode-indent-offset 2
+  "Number of spaces for each indentation step in `toml-ts-mode'."
+  :version "29.1"
+  :type 'integer
+  :safe 'integerp
+  :group 'toml)
+
+(defvar toml-ts-mode--syntax-table
+  (let ((table (make-syntax-table)))
+    (modify-syntax-entry ?_  "_"     table)
+    (modify-syntax-entry ?\\ "\\"    table)
+    (modify-syntax-entry ?=  "."     table)
+    (modify-syntax-entry ?\' "\""    table)
+    (modify-syntax-entry ?#  "<"   table)
+    (modify-syntax-entry ?\n "> b"  table)
+    (modify-syntax-entry ?\^m "> b" table)
+    table)
+  "Syntax table for `toml-ts-mode'.")
+
+(defvar toml-ts--indent-rules
+  `((toml
+     ((node-is "]") parent-bol 0)
+     ((parent-is "string") parent-bol toml-ts-mode-indent-offset)
+     ((parent-is "array") parent-bol toml-ts-mode-indent-offset))))
+
+(defvar toml-ts-mode--font-lock-settings
+  (treesit-font-lock-rules
+   :language 'toml
+   :feature 'comment
+   '((comment) @font-lock-comment-face)
+
+   :language 'toml
+   :feature 'constant
+   '((boolean) @font-lock-constant-face)
+
+   :language 'toml
+   :feature 'delimiter
+   '((["="]) @font-lock-delimiter-face)
+
+   :language 'toml
+   :feature 'number
+   '([(integer) (float) (local_date) (local_date_time) (local_time)]
+     @font-lock-number-face)
+
+   :language 'toml
+   :feature 'string
+   '((string) @font-lock-string-face)
+
+   :language 'toml
+   :feature 'escape-sequence
+   :override t
+   '((escape_sequence) @font-lock-escape-face)
+
+   :language 'toml
+   :feature 'pair
+   :override t            ; Needed for overriding string face on keys.
+   '((bare_key) @font-lock-property-face
+     (quoted_key) @font-lock-property-face
+     (table ("[" @font-lock-bracket-face
+             (_) @font-lock-type-face
+             "]" @font-lock-bracket-face))
+     (table_array_element ("[[" @font-lock-bracket-face
+                           (_) @font-lock-type-face
+                           "]]" @font-lock-bracket-face))
+     (table (quoted_key) @font-lock-type-face)
+     (table (dotted_key (quoted_key)) @font-lock-type-face))
+
+   :language 'toml
+   :feature 'error
+   :override t
+   '((ERROR) @font-lock-warning-face))
+  "Font-lock settings for TOML.")
+
+(defun toml-ts-mode--get-table-name (node)
+  "Return the header name for the tree-sitter NODE."
+  (if node
+      (treesit-node-text
+       (car (cdr (treesit-node-children node))))
+    "Root table"))
+
+(defun toml-ts-mode--imenu-1 (node)
+  "Helper for `toml-ts-mode--imenu'.
+Find string representation for NODE and set marker, then recurse
+the subtrees."
+  (let* ((ts-node (car node))
+         (subtrees (mapcan #'toml-ts-mode--imenu-1 (cdr node)))
+         (name (toml-ts-mode--get-table-name ts-node))
+         (marker (when ts-node
+                   (set-marker (make-marker)
+                               (treesit-node-start ts-node)))))
+    (cond
+     ((null ts-node) subtrees)
+     (subtrees
+      `((,name ,(cons name marker) ,@subtrees)))
+     (t
+      `((,name . ,marker))))))
+
+(defun toml-ts-mode--imenu ()
+  "Return Imenu alist for the current buffer."
+  (let* ((node (treesit-buffer-root-node))
+         (table-tree (treesit-induce-sparse-tree
+                      node "^table$" nil 1000))
+         (table-array-tree (treesit-induce-sparse-tree
+                            node "^table_array_element$" nil 1000))
+         (table-index (toml-ts-mode--imenu-1 table-tree))
+         (table-array-index (toml-ts-mode--imenu-1 table-array-tree)))
+    (append
+     (when table-index `(("Headers" . ,table-index)))
+     (when table-array-index `(("Arrays" . ,table-array-index))))))
+
+
+;;;###autoload
+(add-to-list 'auto-mode-alist '("\\.toml\\'" . toml-ts-mode))
+
+;;;###autoload
+(define-derived-mode toml-ts-mode text-mode "TOML"
+  "Major mode for editing TOML, powered by tree-sitter."
+  :group 'toml
+  :syntax-table toml-ts-mode--syntax-table
+
+  (when (treesit-ready-p 'toml)
+    (treesit-parser-create 'toml)
+
+    ;; Comments
+    (setq-local comment-start "# ")
+    (setq-local comment-end "")
+
+    ;; Indent.
+    (setq-local treesit-simple-indent-rules toml-ts--indent-rules)
+
+    ;; Navigation.
+    (setq-local treesit-defun-type-regexp
+                (rx (or "table" "table_array_element")))
+
+    ;; Font-lock.
+    (setq-local treesit-font-lock-settings toml-ts-mode--font-lock-settings)
+    (setq-local treesit-font-lock-feature-list
+                '((comment)
+                  (constant number pair string)
+                  (escape-sequence)
+                  (delimiter error)))
+    (setq-local treesit-font-lock-level 4)
+
+    ;; Imenu.
+    (setq-local imenu-create-index-function #'toml-ts-mode--imenu)
+    (setq-local which-func-functions nil) ;; Piggyback on imenu
+
+    (treesit-major-mode-setup)))
+
+(provide 'toml-ts-mode)
+
+;;; toml-ts-mode.el ends here
-- 
2.37.2


