unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
From: Yuan Fu <casouri@gmail.com>
To: Yoav Marco <yoavm448@gmail.com>
Cc: emacs-devel@gnu.org
Subject: Re: Tree-sitter integration on feature/tree-sitter
Date: Tue, 10 May 2022 10:54:53 -0700	[thread overview]
Message-ID: <011DA1A3-0FA8-4449-878A-FD6B336B0F1B@gmail.com> (raw)
In-Reply-To: <875ymdwf76.fsf@gmail.com>



> On May 10, 2022, at 8:43 AM, Yoav Marco <yoavm448@gmail.com> wrote:
> 
> I benchmarked query compilation reuse:
> 
> |   |                                      | no reuse (now) | reuse |
> | 1 | Fontify xdisp.c all at once          |          0.01s | 0.01s |
> | 2 | Fontify 60 next lines of xdisp.c ×10 |          0.10s | 0.00s |
> | 3 | Fontify 60 next lines till the end   |          6.06s | 0.01s |
> 
> 
> The patch to reuse the query is pretty dumb: if the char* for the query
> string didn't change from last time, it reuses the TSQuery object from
> last time instead of calling ts_new_query again. The patch is attached.
> 
> The elisp code for the benchmarks is also attached, but I'll give a
> summary here:
> 
> The queries are tree-sitter-langs' highlights.scm for C.
> 
> Benchmark 1 runs treesit-font-lock-fontify-region once on the entire
> buffer, meaning the query is compiled only once in both cases
> 
> Benchmark 2 runs treesit-font-lock-fontify-region on blocks of 60 lines,
> meaning the no reuse version has to compile the query 10 times even
> though nothing changes in the buffer or query.
> 
> Benchmark 3 is just 2 done all the way. xdisp.c has 36k lines, so the
> 6.06s is consistent
> (600 lines = 0.10s, multiply by 60 ⇒ 36k lines ~= 6.00s).
> 

I had a look and it’s a pretty sensible benchmark, and creating the query object taking a lot of time makes sense. But could you maybe run the benchmark under gprof and see what you get? Just curious.

> So, is caching worth it? I don't know. It definetily is if it's possible
> to do it internally without introducing a new object type. But I don't
> think that's possible without making a hash map or a complicated cache
> like the one for compiled regexps that compile_pattern uses in
> search.c.

Yeah using a single cache would probably result in a lot of misses since Emacs don’t fontify the whole buffer at once. We don’t necessarily need to use a hash map. I had a look at search.c and IIUC it uses an Emacs-wide array of 20 regex caches and links them into a linked list sorted by most-recently used, which doesn’t seem too bad? I think I can do something similar to that. Tho we might also want to allow users to pin some “persistent” cache, for example major mode font-locking and indent queries, as they are guaranteed to be reused a lot and are generally large (ie, slow to create). Maybe that’s unnecessary tho. And I wonder if there is a cheap & easy way to do caching buffer-locally…

Or maybe add an argument to query-capture that allow the user to specify whether they want the query to be cached, or assume user wants the query to be cached if the query is in string form rather than in sexp form.

Yuan


  reply	other threads:[~2022-05-10 17:54 UTC|newest]

Thread overview: 150+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-09 17:50 Tree-sitter integration on feature/tree-sitter Yoav Marco
2022-05-09 20:51 ` Yuan Fu
     [not found]   ` <87lev9wyll.fsf@gmail.com>
2022-05-10 15:20     ` Yoav Marco
2022-05-10 15:43   ` Yoav Marco
2022-05-10 17:54     ` Yuan Fu [this message]
2022-05-10 18:18       ` Yoav Marco
2022-05-10 19:58         ` Stefan Monnier
2022-05-10 23:11           ` Yuan Fu
2022-05-10 23:53             ` Yuan Fu
2022-05-11 11:10         ` Eli Zaretskii
2022-05-11 11:16           ` Yoav Marco
2022-05-11 14:20             ` Eli Zaretskii
2022-05-11 15:40               ` Yoav Marco
2022-05-11 16:27                 ` Eli Zaretskii
2022-05-11 20:14                   ` Yuan Fu
2022-05-11 20:25                     ` Yuan Fu
2022-05-12  5:19                       ` Eli Zaretskii
2022-05-12  6:10                         ` Yuan Fu
2022-05-12  7:12                           ` Eli Zaretskii
2022-05-12 15:18                         ` Stefan Monnier
2022-05-12 15:53                           ` Eli Zaretskii
2022-05-12  5:17                     ` Eli Zaretskii
2022-05-12  6:07                       ` Yuan Fu
2022-05-12 14:16                       ` Yoav Marco
2022-05-12 16:04                         ` Eli Zaretskii
2022-05-12 16:26                           ` Yoav Marco
2022-05-12 17:18                             ` Eli Zaretskii
2022-05-12 17:22                               ` Yoav Marco
2022-05-13  6:34                                 ` Eli Zaretskii
2022-05-13  8:04                                   ` Theodor Thornhill
2022-05-13  8:36                                     ` Yoav Marco
2022-05-13  9:46                                       ` Theodor Thornhill
2022-05-13 10:37                                     ` Eli Zaretskii
2022-05-13 10:52                                       ` Theodor Thornhill
2022-05-13  8:42                                   ` Yoav Marco
2022-05-13 10:41                                     ` Eli Zaretskii
2022-05-14  0:04                                       ` Yuan Fu
2022-06-16 19:16                                         ` Yuan Fu
2022-06-16 21:57                                           ` yoavm448
2022-06-17  1:10                                             ` Yuan Fu
2022-05-12 15:15                       ` Stefan Monnier
2022-05-15 19:20       ` chad
2022-05-15 19:26         ` Eli Zaretskii
  -- strict thread matches above, loose matches on Subject: below --
2022-06-29 16:51 Abin Simon
2022-06-29 17:43 ` Yoav Marco
2022-06-30 11:21   ` Yoav Marco
2022-06-30 14:29     ` Abin Simon
2022-06-30 14:37       ` Yoav Marco
2022-06-28 16:08 Yoav Marco
2022-06-28 19:35 ` Yoav Marco
2022-06-29 15:35   ` Yuan Fu
2022-05-19  1:35 Kiong-Ge Liau
2022-05-19  1:35 Kiong-Ge Liau
2022-05-20  2:01 ` Yuan Fu
2022-06-16 19:03   ` Yuan Fu
2022-06-17  1:24     ` Po Lu
2022-06-18  0:09       ` Yuan Fu
2022-06-17  2:00     ` Ihor Radchenko
2022-06-17  5:23       ` Eli Zaretskii
2022-06-17 10:40         ` Ihor Radchenko
2022-06-17  6:15     ` Eli Zaretskii
2022-06-17  7:17       ` Yuan Fu
2022-06-17 10:37         ` Eli Zaretskii
2022-06-18  0:14           ` Yuan Fu
2022-06-18  6:22             ` Eli Zaretskii
2022-06-18  8:25               ` Yuan Fu
2022-06-18  8:50                 ` Eli Zaretskii
2022-06-18 20:07                   ` Yuan Fu
2022-06-19  5:39                     ` Eli Zaretskii
2022-06-20  3:00                       ` Yuan Fu
2022-06-20 11:44                         ` Eli Zaretskii
2022-06-20 20:01                           ` Yuan Fu
2022-06-21  2:26                             ` Eli Zaretskii
2022-06-21  4:39                               ` Yuan Fu
2022-06-21 10:18                                 ` Eli Zaretskii
2022-06-22  0:34                                   ` Yuan Fu
2022-06-17 11:06     ` Jostein Kjønigsen
2022-06-18  0:28       ` Yuan Fu
2022-06-18 20:57         ` Jostein Kjønigsen
2022-05-07  8:29 Yuan Fu
2022-05-07  8:44 ` Yuan Fu
2022-05-07  8:47 ` Theodor Thornhill
2022-05-07 17:59   ` Yuan Fu
2022-05-07 18:16     ` Theodor Thornhill
2022-05-07  9:04 ` Eli Zaretskii
2022-05-07  9:34   ` Theodor Thornhill
2022-05-07 18:33     ` Yuan Fu
2022-05-07 19:02       ` Theodor Thornhill
2022-05-07 18:27   ` Yuan Fu
2022-05-07 18:48     ` Eli Zaretskii
2022-05-07 19:00       ` Theodor Thornhill
2022-05-07 19:21         ` Eli Zaretskii
2022-05-07 19:11       ` Yuan Fu
2022-05-07 19:25         ` Eli Zaretskii
2022-05-07 20:00           ` Yuan Fu
2022-05-07 20:12             ` Theodor Thornhill
2022-05-07 21:24               ` Stefan Monnier
2022-05-07 22:02                 ` Theodor Thornhill
2022-05-08  6:18                 ` Eli Zaretskii
2022-05-08 12:05                   ` Dmitry Gutov
2022-05-08 12:16                     ` Stefan Monnier
2022-05-08 13:23                       ` Eli Zaretskii
2022-05-08 20:57                         ` Dmitry Gutov
2022-05-08 13:21                     ` Eli Zaretskii
2022-05-08 20:42                       ` Dmitry Gutov
2022-05-09 11:18                         ` Eli Zaretskii
2022-05-08  6:16               ` Eli Zaretskii
2022-05-08  6:49                 ` Theodor Thornhill
2022-05-08  6:58                   ` Eli Zaretskii
2022-05-08  9:02                     ` Theodor Thornhill
2022-05-08  9:09                       ` Theodor Thornhill
2022-05-08  9:10                       ` Eli Zaretskii
2022-05-08  9:19                         ` Theodor Thornhill
2022-05-08 10:33                           ` Eli Zaretskii
2022-05-08 13:47                             ` Theodor Thornhill
2022-05-08 13:58                               ` Eli Zaretskii
2022-05-08 14:01                               ` Stefan Monnier
2022-05-08 14:25                                 ` Theodor Thornhill
2022-05-08 14:42                                   ` Eli Zaretskii
2022-05-08 19:16                                     ` Theodor Thornhill
2022-05-08 21:14                                       ` Yuan Fu
2022-05-09 11:14                                       ` Eli Zaretskii
2022-05-09 12:20                                         ` Theodor Thornhill
2022-05-09 12:23                                           ` Eli Zaretskii
2022-05-09 21:10                                             ` Yuan Fu
2022-05-09 21:33                                               ` Theodor Thornhill
2022-05-14  0:03                                                 ` Yuan Fu
2022-05-14  5:03                                                   ` Theodor Thornhill
2022-05-14  5:13                                                     ` Yuan Fu
2022-05-17 21:45                                                       ` Theodor Thornhill
2022-05-18 20:52                                                         ` Yuan Fu
2022-05-18 21:07                                                           ` Theodor Thornhill
2022-06-16 19:09                                                             ` Yuan Fu
2022-06-17  6:19                                                               ` Eli Zaretskii
2022-06-17  7:32                                                                 ` Yuan Fu
2022-06-17 10:42                                                                   ` Eli Zaretskii
2022-06-18  0:20                                                                     ` Yuan Fu
2022-06-18  6:23                                                                       ` Eli Zaretskii
2022-06-20 14:20                                                                       ` Daniel Martín
2022-06-20 20:03                                                                         ` Yuan Fu
2022-06-17 18:12                                                                   ` Yoav Marco
2022-06-18  0:35                                                                     ` Yuan Fu
2022-06-18  8:15                                                                       ` Yoav Marco
2022-06-18 20:11                                                                         ` Yuan Fu
2022-05-08 22:42                             ` Stephen Leake
2022-05-14 15:09 ` Daniel Martín
2022-05-14 15:55   ` Yuan Fu
2022-05-14 18:50     ` Daniel Martín
2022-05-14 19:09       ` Eli Zaretskii
2022-06-16 19:10       ` Yuan Fu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=011DA1A3-0FA8-4449-878A-FD6B336B0F1B@gmail.com \
    --to=casouri@gmail.com \
    --cc=emacs-devel@gnu.org \
    --cc=yoavm448@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).