From: Dmitry Gutov
Newsgroups: gmane.emacs.bugs
Subject: bug#62368: 29.0.60; Evaluating predicates before creating captured nodes in treesit-query-capture
Date: Thu, 23 Mar 2023 02:42:20 +0200
To: Yuan Fu, 62368@debbugs.gnu.org
In-Reply-To: <09A7EAB7-332E-4123-A6DB-8921FBD325C4@gmail.com>

Hi Yuan!

On 22/03/2023 06:49, Yuan Fu wrote:
> X-Debbugs-CC:dgutov@yandex.ru
>
> Dmitry, when you have time, could you try your benchmark in bug#60953
> with this patch? I made predicates evaluate before we create any nodes,
> so #equal and #match should be more efficient now, when there are a lot
> of rejections.
> In the same time #pred is made slightly worst since they
> now create a lisp node and discard it. (But this can be fixed with a
> little more complexity.)

Thank you, I was curious what the improvement would be if we could delay
allocation of node structures until :match is checked.

But for my benchmark the difference is on the order of 4-5%.

It seems we are scraping the bottom of the barrel in terms of improving
allocations/reducing GC: according to 'benchmark-run', the whole run of
100 iterations of the scenario takes ~1.1s, of which 0.150s is spent in
GC. The improved version takes about 1.04s, with 0.1s in GC.

So if you ask me, I think I'd prefer to hold off on applying this patch
until we either find scenarios where the improvement is more
significant, or we find and eliminate some other bigger bottleneck
first, after which this 5% grows to 10-20% or more of the remaining
runtime.

The current approach is pretty Lisp-y, so I kind of like it. And there's
the issue of #pred, of course, which could swing the difference in the
other direction (I didn't test any code which uses it).

We could also try a smaller change: build the initial list of conses
for the result with capture_id's in the car's, and then substitute
capture_name if the predicates all match. The treesit_node pseudovectors
would still be created eagerly, though.

Here's the current perf report for my benchmark. Most of the time is
spent in libtree-sitter:

   17.02%  emacs  libtree-sitter.so.0.0  [.] ts_tree_cursor_current_status
   10.94%  emacs  libtree-sitter.so.0.0  [.] ts_tree_cursor_goto_next_sibling
    9.93%  emacs  libtree-sitter.so.0.0  [.] ts_tree_cursor_goto_first_child
    9.55%  emacs  emacs                  [.] process_mark_stack
    4.56%  emacs  libtree-sitter.so.0.0  [.] ts_node_start_point
    3.90%  emacs  libtree-sitter.so.0.0  [.] ts_tree_cursor_parent_node
    3.69%  emacs  emacs                  [.] re_match_2_internal
    3.08%  emacs  libtree-sitter.so.0.0  [.] ts_language_symbol_metadata
    1.61%  emacs  emacs                  [.] exec_byte_code
    1.47%  emacs  libtree-sitter.so.0.0  [.] ts_node_end_point
    1.44%  emacs  libtree-sitter.so.0.0  [.] ts_tree_cursor_current_node
    1.13%  emacs  emacs                  [.] allocate_vectorlike
    1.11%  emacs  emacs                  [.] sweep_strings
    1.04%  emacs  libtree-sitter.so.0.0  [.] ts_node_end_byte
    0.94%  emacs  emacs                  [.] next_interval
    0.91%  emacs  libtree-sitter.so.0.0  [.] ts_tree_cursor_goto_parent
    0.88%  emacs  emacs                  [.] lookup_char_property
    0.81%  emacs  emacs                  [.] find_interval
    0.68%  emacs  emacs                  [.] pdumper_marked_p_impl
    0.67%  emacs  emacs                  [.] assq_no_quit
    0.56%  emacs  libtree-sitter.so.0.0  [.] ts_node_symbol
    0.56%  emacs  emacs                  [.] mark_char_table
    0.55%  emacs  emacs                  [.] execute_charset
    0.49%  emacs  libtree-sitter.so.0.0  [.] 0x000000000001ae3e
    0.49%  emacs  emacs                  [.] re_search_2
    0.48%  emacs  emacs                  [.] funcall_subr
    0.46%  emacs  libc.so.6              [.] __strncmp_sse42
    0.42%  emacs  libtree-sitter.so.0.0  [.] ts_language_public_symbol
    0.41%  emacs  libtree-sitter.so.0.0  [.] ts_node_is_named
    0.40%  emacs  libtree-sitter.so.0.0  [.] ts_node_new
    0.34%  emacs  emacs                  [.] Fassq
    0.34%  emacs  emacs                  [.] sweep_vectors
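
For reference, the benchmark timings quoted earlier in this message can be turned into a quick back-of-the-envelope calculation. This is only a sketch in Python, not part of the patch or the benchmark. The helper names are made up for illustration, and the projection assumes the absolute saving (~0.06s) would stay the same once some other bottleneck is removed, which has not been measured:

```python
# Timings quoted in the message (seconds, 100 iterations); all approximate.
BASE_TOTAL, BASE_GC = 1.10, 0.150
PATCHED_TOTAL, PATCHED_GC = 1.04, 0.100


def relative_gain(before: float, after: float) -> float:
    """Fraction of `before` that was saved by going to `after`."""
    return (before - after) / before


def runtime_for_target_gain(saved: float, target: float) -> float:
    """Baseline runtime at which a fixed saving `saved` equals `target` of it.

    Assumption (hypothetical): the absolute saving does not change when
    other bottlenecks are eliminated.
    """
    return saved / target


saved = BASE_TOTAL - PATCHED_TOTAL

print(f"overall gain:    {relative_gain(BASE_TOTAL, PATCHED_TOTAL):.1%}")
print(f"GC share before: {BASE_GC / BASE_TOTAL:.1%}")
print(f"GC share after:  {PATCHED_GC / PATCHED_TOTAL:.1%}")
for target in (0.10, 0.20):
    baseline = runtime_for_target_gain(saved, target)
    print(f"a {target:.0%} gain would need a baseline of about {baseline:.2f}s")
```

Under that assumption, the rest of the run would have to shrink to roughly half of its current time or less before the patch's saving reached the 10-20% range.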