bug#62368: 29.0.60; Evaluating predicates before creating captured nodes in treesit-query-capture

From: Dmitry Gutov <dgutov@yandex.ru>
To: Yuan Fu <casouri@gmail.com>, 62368@debbugs.gnu.org
Subject: bug#62368: 29.0.60; Evaluating predicates before creating captured nodes in treesit-query-capture
Date: Thu, 23 Mar 2023 02:42:20 +0200
Message-ID: <ad696668-a124-2bd5-415f-e438ee9f3526@yandex.ru>
In-Reply-To: <09A7EAB7-332E-4123-A6DB-8921FBD325C4@gmail.com>

Hi Yuan!

On 22/03/2023 06:49, Yuan Fu wrote:
> X-Debbugs-CC:dgutov@yandex.ru
> 
> Dmitry, when you have time, could you try your benchmark in bug#60953
> with this patch? I made predicates evaluate before we create any nodes,
> so #equal and #match should be more efficient now, when there are a lot
> of rejections. In the same time #pred is made slightly worst since they
> now create a lisp node and discard it. (But this can be fixed with a
> little more complexity.)

Thank you, I was curious what the improvement would be if we could delay
allocation of node structures until :match is checked.

But for my benchmark the difference is on the order of 4-5%. It seems we
are scraping the bottom of the barrel in terms of improving
allocations/reducing GC, because according to 'benchmark-run', where the
whole run of 100 iterations of the scenario takes ~1.1s, the time spent
in GC is 0.150s. And the improved version takes about 1.04s, with 0.1s
in GC.

So if you ask me, I think I'd prefer to hold off on applying this patch 
until we either find scenarios where the improvement is more 
significant, or we find and eliminate some other bigger bottleneck 
first, after which these 5% grow to become 10-20% or more of the
remaining runtime. The current approach is pretty Lisp-y, so I kind of
like it.
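
To make the arithmetic behind this concrete, here is a minimal sketch in C. The helper names (percent_saved, runtime_for_share) are made up for illustration, the inputs are the rounded timings quoted above, and the second helper assumes the absolute saving stays the same once other bottlenecks are gone, which is not guaranteed.

```c
#include <assert.h>

/* Hypothetical helper: percentage of runtime saved when a run goes
   from BEFORE seconds to AFTER seconds.  */
static double
percent_saved (double before, double after)
{
  assert (before > 0.0);
  return 100.0 * (before - after) / before;
}

/* Hypothetical helper: the total runtime (in seconds) at which a
   fixed saving of SAVED seconds would amount to TARGET percent.
   Assumes the absolute saving does not change when other
   bottlenecks are removed.  */
static double
runtime_for_share (double saved, double target)
{
  assert (target > 0.0);
  return 100.0 * saved / target;
}
```

With the rounded figures, percent_saved (1.1, 1.04) comes out at roughly 5%, and the 0.06s saving would amount to 10% of a run at runtime_for_share (0.06, 10.0), about 0.6s, or to 20% at about 0.3s.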

And there's the issue of #pred, of course, which could swing the
difference in the other direction (I didn't test any code which uses it).

We could also try a smaller change: where the initial list of conses for
the result is built with capture_id's in the car's, and then substituted
with capture_name if the predicates all match. Then the treesit_node
pseudovectors would still be created eagerly, though.
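
The smaller change described above could look roughly like the following sketch. It is plain C with made-up types, not the actual treesit.c code: struct capture stands in for one cons of the result list, its text field stands in for the eagerly created node, and struct equal_pred is a simplified predicate in the spirit of #equal. All names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one (capture . node) cons in the result.  */
struct capture
{
  unsigned id;       /* capture index, available without any lookup */
  const char *name;  /* filled in only after the predicates pass */
  const char *text;  /* stand-in for the eagerly created node */
};

/* Hypothetical predicate: the capture with index ID must have
   exactly the text EXPECTED.  */
struct equal_pred
{
  unsigned id;
  const char *expected;
};

/* Phase 1: evaluate the predicates using only capture ids.  */
static int
match_passes (const struct capture *caps, size_t ncaps,
              const struct equal_pred *preds, size_t npreds)
{
  for (size_t p = 0; p < npreds; p++)
    for (size_t c = 0; c < ncaps; c++)
      if (caps[c].id == preds[p].id
          && strcmp (caps[c].text, preds[p].expected) != 0)
        return 0;
  return 1;
}

/* Phase 2: for an accepted match, substitute names for ids.
   Returns the number of captures kept, or 0 if the match was
   rejected (in which case no name is looked up).  */
static size_t
finish_match (struct capture *caps, size_t ncaps,
              const struct equal_pred *preds, size_t npreds,
              const char *const *names, size_t nnames)
{
  if (!match_passes (caps, ncaps, preds, npreds))
    return 0;
  for (size_t c = 0; c < ncaps; c++)
    {
      assert (caps[c].id < nnames);
      caps[c].name = names[caps[c].id];
    }
  return ncaps;
}
```

In this sketch a rejected match never touches the name table, while the per-capture "node" data is still built up front, which mirrors the trade-off noted above.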

Here's the current perf report for my benchmark; most of the time is
spent in libtree-sitter:

   17.02%  emacs         libtree-sitter.so.0.0       [.] ts_tree_cursor_current_status
   10.94%  emacs         libtree-sitter.so.0.0       [.] ts_tree_cursor_goto_next_sibling
    9.93%  emacs         libtree-sitter.so.0.0       [.] ts_tree_cursor_goto_first_child
    9.55%  emacs         emacs                       [.] process_mark_stack
    4.56%  emacs         libtree-sitter.so.0.0       [.] ts_node_start_point
    3.90%  emacs         libtree-sitter.so.0.0       [.] ts_tree_cursor_parent_node
    3.69%  emacs         emacs                       [.] re_match_2_internal
    3.08%  emacs         libtree-sitter.so.0.0       [.] ts_language_symbol_metadata
    1.61%  emacs         emacs                       [.] exec_byte_code
    1.47%  emacs         libtree-sitter.so.0.0       [.] ts_node_end_point
    1.44%  emacs         libtree-sitter.so.0.0       [.] ts_tree_cursor_current_node
    1.13%  emacs         emacs                       [.] allocate_vectorlike
    1.11%  emacs         emacs                       [.] sweep_strings
    1.04%  emacs         libtree-sitter.so.0.0       [.] ts_node_end_byte
    0.94%  emacs         emacs                       [.] next_interval
    0.91%  emacs         libtree-sitter.so.0.0       [.] ts_tree_cursor_goto_parent
    0.88%  emacs         emacs                       [.] lookup_char_property
    0.81%  emacs         emacs                       [.] find_interval
    0.68%  emacs         emacs                       [.] pdumper_marked_p_impl
    0.67%  emacs         emacs                       [.] assq_no_quit
    0.56%  emacs         libtree-sitter.so.0.0       [.] ts_node_symbol
    0.56%  emacs         emacs                       [.] mark_char_table
    0.55%  emacs         emacs                       [.] execute_charset
    0.49%  emacs         libtree-sitter.so.0.0       [.] 0x000000000001ae3e
    0.49%  emacs         emacs                       [.] re_search_2
    0.48%  emacs         emacs                       [.] funcall_subr
    0.46%  emacs         libc.so.6                   [.] __strncmp_sse42
    0.42%  emacs         libtree-sitter.so.0.0       [.] ts_language_public_symbol
    0.41%  emacs         libtree-sitter.so.0.0       [.] ts_node_is_named
    0.40%  emacs         libtree-sitter.so.0.0       [.] ts_node_new
    0.34%  emacs         emacs                       [.] Fassq
    0.34%  emacs         emacs                       [.] sweep_vectors
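
As a rough check on the libtree-sitter share, the percentages of the libtree-sitter.so.0.0 rows in the report above can be added up. The sketch below does only that; the array and helper names are made up for illustration, and the report lists only the top entries, so the total covers just the rows shown.

```c
#include <assert.h>
#include <stddef.h>

/* Percentages of the libtree-sitter.so.0.0 rows, copied from the
   report above in order of appearance.  */
static const double libtree_sitter_pct[] = {
  17.02, 10.94, 9.93, 4.56, 3.90, 3.08, 1.47, 1.44,
  1.04, 0.91, 0.56, 0.49, 0.42, 0.41, 0.40
};

/* Hypothetical helper: add up N percentages.  */
static double
sum_percent (const double *pct, size_t n)
{
  assert (pct != NULL || n == 0);
  double total = 0.0;
  for (size_t i = 0; i < n; i++)
    total += pct[i];
  return total;
}
```

Calling sum_percent (libtree_sitter_pct, 15) gives a bit under 57, so slightly more than half of all samples fall in the libtree-sitter rows listed.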

Thread overview: 5+ messages
2023-03-22  4:49 bug#62368: 29.0.60; Evaluating predicates before creating captured nodes in treesit-query-capture Yuan Fu
2023-03-23  0:42 ` Dmitry Gutov [this message]
2023-03-23  3:16   ` Yuan Fu
2023-09-12  0:05     ` Stefan Kangas
2023-09-12  0:37       ` Yuan Fu
