From: Yuan Fu <casouri@gmail.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: Yoav Marco <yoavm448@gmail.com>, emacs-devel@gnu.org
Subject: Re: Tree-sitter integration on feature/tree-sitter
Date: Wed, 11 May 2022 13:14:33 -0700
Message-ID: <8F6A43D1-D1EA-4602-A245-627DB7960FC2@gmail.com>
In-Reply-To: <83ee10qbk7.fsf@gnu.org>

[-- Attachment #1: Type: text/plain, Size: 2354 bytes --]

> 
> And the timings are in the table below?
> 
>  |   |                                      | no reuse (now) | reuse |
>  | 1 | Fontify xdisp.c all at once          |          0.01s | 0.01s |
>  | 2 | Fontify 60 next lines of xdisp.c ×10 |          0.10s | 0.00s |
>  | 3 | Fontify 60 next lines till the end   |          6.06s | 0.01s |
> 
> If so, what is the significance of the last line in practical use
> cases?  JIT font-lock never fontifies such large chunks of source
> code, it does that in 512-character chunks, which is less than 60
> lines in most cases, and definitely not "till the end".

I think that’s just a way to run font-lock enough times without repeatedly fontifying the same region?

> 
> Also, how much time does it take to do the same with the current
> regexp- and syntax-based font-lock, for the same chunks of text?
> 
> We need to examine the use cases and the absolute numbers carefully
> before we conclude that any kind of caching is needed and/or
> justified.
> 

I redid the benchmark without his reuse patch, just to see how much time is spent creating query objects. Fontifying 40 lines at a time, 463 times, takes 6.92s according to Emacs (7.30s according to the profiler). That comes to 0.0158s per call to font-lock-fontify-region, of which 0.0104s is spent creating the query object. That seems to tell me that if we optimize away the query object creation, we can make font-locking very fast. And not just font-locking, since using tree-sitter to do anything useful basically means querying the parsed tree.
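
To make the arithmetic explicit, here is a minimal sketch that derives the per-call figures from the totals reported in this message (463 loops, 7.30s total from the profiler, 4.80s in ts_query_new). The function names `my-bench-per-call` and `my-bench-share` are hypothetical helpers for illustration only, not part of the benchmark.

```elisp
;; Hypothetical helpers; the inputs are the totals quoted in this message.
(defun my-bench-per-call (total-seconds loops)
  "Return the average time per loop for TOTAL-SECONDS spread over LOOPS."
  (/ (float total-seconds) loops))

(defun my-bench-share (part-seconds total-seconds)
  "Return PART-SECONDS as a fraction of TOTAL-SECONDS."
  (/ (float part-seconds) total-seconds))

(message "per call: %.4fs, query creation: %.4fs (%.0f%% of each call)"
         (my-bench-per-call 7.30 463)     ; whole font-lock call
         (my-bench-per-call 4.80 463)     ; ts_query_new only
         (* 100 (my-bench-share 4.80 7.30)))
```

By these numbers, query creation accounts for roughly two thirds of each fontification call.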

If we expose a "compiled query" object, we don’t need to cache queries either.
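
The difference between the two designs can be sketched without tree-sitter at all. In the sketch below, `my-compile-query` is a hypothetical stand-in for the expensive compilation step and `my-run-query` is a hypothetical consumer that accepts either a source string or an already compiled object. None of these names exist in the treesit code; they only illustrate why a caller that holds the compiled object needs no cache.

```elisp
;; All names here are hypothetical and only model the cost of compilation.
(defvar my-compile-count 0
  "Number of times the stand-in compiler has run.")

(defun my-compile-query (source)
  "Pretend to compile SOURCE, counting each compilation."
  (setq my-compile-count (1+ my-compile-count))
  (list 'my-compiled-query source))

(defun my-run-query (query)
  "Run QUERY, compiling it first if it is still a source string."
  (let ((compiled (if (stringp query)
                      (my-compile-query query)
                    query)))
    ;; A real implementation would match against a syntax tree here.
    (cadr compiled)))

;; Passing the source string compiles on every call: 3 compilations.
(dotimes (_ 3)
  (my-run-query "(comment) @comment"))

;; Compiling once and passing the object adds only 1 more compilation.
(let ((compiled (my-compile-query "(comment) @comment")))
  (dotimes (_ 3)
    (my-run-query compiled)))
```

After both loops `my-compile-count` is 4: three compilations for the string calls and one for the three calls that share a compiled object.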

The regex-based font-lock is a lot slower. With or without the optimization, tree-sitter is a win, but we knew that already. I have no idea why regex font-lock ran for 905 loops compared to 463 for tree-sitter. Maybe I did something wrong there.
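
One way to narrow down the differing loop counts is to check that the count depends only on the buffer and not on what the fontification call does to point. The sketch below is an assumption about how one might test that, not a diagnosis: `my-count-chunks` is a hypothetical helper that walks the buffer with the same `line-end-position` / `forward-line` steps as the benchmark, but runs the callback inside `save-excursion` so point is restored after each call.

```elisp
(defun my-count-chunks (lines-per-chunk fn)
  "Walk the current buffer from point in steps of LINES-PER-CHUNK lines.
Call FN with the start and end of each chunk and return the number of
chunks.  FN runs inside `save-excursion', so point is restored after it."
  (let ((count 0))
    (while (/= (point) (point-max))
      (let ((beg (point))
            (end (line-end-position lines-per-chunk)))
        (save-excursion (funcall fn beg end)))
      (forward-line lines-per-chunk)
      (setq count (1+ count)))
    count))

(defun my-count-chunks-demo ()
  "Count the 40-line chunks in a temporary 100-line buffer."
  (with-temp-buffer
    (dotimes (i 100)
      (insert (format "line %d\n" i)))
    (goto-char (point-min))
    ;; `ignore' stands in for the fontification call.
    (my-count-chunks 40 #'ignore)))
```

For a 100-line buffer the demo visits three chunks (40, 40 and 20 lines). Passing a function that calls `font-lock-fontify-region' instead of `ignore' in both benchmark setups would show whether the two loop counts still differ once point is protected.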

Benchmark 3: fontify all of xdisp.c, 40 lines at a time.
took 6.92, of which 1.00 is GC (0 gc runs), loop count: 463

font-lock:    7.30s -> 0.015766738660907127s / loop
ts_query_new: 4.80s -> 0.010367170626349892s / loop

Note: 7.30s is taken from an external profiler.

Benchmark 3: fontify all of xdisp.c, 40 lines at a time.
took 88.28, of which 5.00 is GC (4 gc runs), loop count: 905

font-lock: 88.28s -> 0.1997285067873303 / loop

Yuan


[-- Attachment #2: tree-sitter-benchmark.el --]
[-- Type: application/octet-stream, Size: 1673 bytes --]

;;; tree-sitter-benchmark.el -*- lexical-binding: t; -*-

(require 'treesit)
(require 'cl-lib)                       ; for `cl-incf'
(setq c-font-lock-settings-1
      `((c
         ,(with-temp-buffer
            (insert-file-contents-literally "./highlights.scm")
            ;; make capture names map to a face, any face
            (goto-char (point-min))
            (while (re-search-forward "@[a-z.]+" nil t)
              (replace-match "@font-lock-string-face" t))
            (buffer-substring (point-min) (point-max))))))

(with-temp-buffer
  (treesit-get-parser-create 'c)
  (setq-local treesit-font-lock-defaults
              '((c-font-lock-settings-1)))
  (font-lock-mode)
  (treesit-font-lock-enable)
  (insert-file-contents "xdisp.c")
  (let ((count 0))
    (apply #'message
           "Benchmark 3: fontify all of xdisp.c, 40 lines at a time.\
  took %2.2f, of which %2.2f is GC (%d gc runs), loop count: %s"
           (append
            (benchmark-run 1
              (while (/= (point-max) (point))
                (font-lock-fontify-region (point) (line-end-position 40))
                (forward-line 40)
                (cl-incf count)))
            (list count)))))

(with-temp-buffer
  (treesit-get-parser-create 'c)
  (c-mode)
  (insert-file-contents "xdisp.c")
  (let ((count 0))
    (apply #'message
           "Benchmark 3: fontify all of xdisp.c, 40 lines at a time.\
  took %2.2f, of which %2.2f is GC (%d gc runs), loop count: %s"
           (append
            (benchmark-run 1
              (while (/= (point-max) (point))
                (font-lock-fontify-region (point) (line-end-position 40))
                (forward-line 40)
                (cl-incf count)))
            (list count)))))

[-- Attachment #3: highlights.scm --]
[-- Type: application/octet-stream, Size: 3299 bytes --]

;; Copied from elisp-tree-sitter/langs/queries/c
["break"
 "case"
 "const"
 "continue"
 "default"
 "do"
 "else"
 "enum"
 "extern"
 "for"
 "if"
 "inline"
 "return"
 "sizeof"
 "static"
 "struct"
 "switch"
 "typedef"
 "union"
 "volatile"
 "while"
 "..."] @keyword

[(storage_class_specifier)
 (type_qualifier)] @keyword

["#define"
 "#else"
 "#endif"
 "#if"
 "#ifdef"
 "#ifndef"
 "#include"
 (preproc_directive)] @function.macro

((["#ifdef" "#ifndef"] (identifier) @constant))

["+" "-" "*" "/" "%"
 "~" "|" "&" "<<" ">>"
 "!" "||" "&&"
 "->"
 "==" "!=" "<" ">" "<=" ">="
 "=" "+=" "-=" "*=" "/=" "%=" "|=" "&="
 "++" "--"
] @operator

(conditional_expression ["?" ":"] @operator)

["(" ")" "[" "]" "{" "}"] @punctuation.bracket

["." "," ";"] @punctuation.delimiter

;;; ----------------------------------------------------------------------------
;;; Functions.

(call_expression
 function: [(identifier) @function.call
            (field_expression field: (_) @method.call)])

(function_declarator
 declarator: [(identifier) @function
              (parenthesized_declarator
               (pointer_declarator (field_identifier) @function))])

(preproc_function_def
 name: (identifier) @function)

;;; ----------------------------------------------------------------------------
;;; Types.

[(primitive_type)
 (sized_type_specifier)] @type.builtin

(type_identifier) @type

;;; ----------------------------------------------------------------------------
;;; Variables.

(declaration declarator: [(identifier) @variable
                          (_ (identifier) @variable)])

(parameter_declaration declarator: [(identifier) @variable.parameter
                                    (_ (identifier) @variable.parameter)])

(init_declarator declarator: [(identifier) @variable
                              (_ (identifier) @variable)])

(assignment_expression
 left: [(identifier) @variable
        (field_expression field: (_) @variable)
        (subscript_expression argument: (identifier) @variable)
        (pointer_expression (identifier) @variable)])

(update_expression
 argument: (identifier) @variable)

(preproc_def name: (identifier) @variable.special)

(preproc_params
 (identifier) @variable.parameter)

;;; ----------------------------------------------------------------------------
;;; Properties.

(field_declaration
 declarator: [(field_identifier) @property.definition
              (pointer_declarator (field_identifier) @property.definition)
              (pointer_declarator (pointer_declarator (field_identifier) @property.definition))])

(enumerator name: (identifier) @property.definition)

(field_identifier) @property

;;; ----------------------------------------------------------------------------
;;; Misc.

;; Doesn't work right now: results in error Query pattern is malformed: "Cannot
;; find captured node", "^[A-Z_][A-Z_\\d]*$", "A predicate can only refer to
;; captured nodes in the same pattern"
;; ((identifier) @constant
;;  (.match @constant "^[A-Z_][A-Z_\\d]*$"))

[(null) (true) (false)] @constant.builtin

[(number_literal)
 (char_literal)] @number

(statement_identifier) @label

;;; ----------------------------------------------------------------------------
;;; Strings and comments.

(comment) @comment

[(string_literal)
 (system_lib_string)] @string
