From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dmitry Gutov
Newsgroups: gmane.emacs.bugs
Subject: bug#60953: The :match predicate with large regexp in tree-sitter font-lock seems inefficient
Date: Thu, 26 Jan 2023 22:46:30 +0200
Message-ID: <1d7aaf56-6130-c0f0-446f-4bc2c5cafa28@yandex.ru>
References: <7624dddc-4600-9a03-ac8b-d3c9e0ab618c@yandex.ru> <04729838-b7d4-8a08-2b71-12536a28aebb@yandex.ru> <83wn5ag4nc.fsf@gnu.org> <01b5d074-fb12-6b1f-cbfb-5e759833b854@yandex.ru> <838rhpg57n.fsf@gnu.org> <31559c1f-1a12-691d-3d03-f566019a0aab@yandex.ru>
In-Reply-To: <31559c1f-1a12-691d-3d03-f566019a0aab@yandex.ru>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="------------sHXGqMYlVXefA1o6sqxFpSea"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.4.2
Content-Language: en-US
To: Eli Zaretskii
Cc: casouri@gmail.com, 60953@debbugs.gnu.org
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"

This is a multi-part message in MIME format.
--------------sHXGqMYlVXefA1o6sqxFpSea
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

On 26/01/2023 20:07, Dmitry Gutov wrote:
> One could hope to avoid recreating the list of predicates on every
> match, but that seems to be a limitation of the TS API:
> ts_query_predicates_for_pattern requires a second argument,
> match.pattern_index. Maybe we could memoize that, though?

Speaking of memoization, here is a POC patch.

It's a definite improvement: with the attached patch applied, :match
almost reaches the performance of :pred. Not sure why it's still not
faster, though.

(I also tried a more comprehensive memoization using a hash table; the
resulting performance was slightly worse.)

--------------sHXGqMYlVXefA1o6sqxFpSea
Content-Type: text/x-patch; charset=UTF-8; name="memoize_simple.diff"
Content-Disposition: attachment; filename="memoize_simple.diff"
Content-Transfer-Encoding: 7bit

diff --git a/src/treesit.c b/src/treesit.c
index 917db582676..69f54976509 100644
--- a/src/treesit.c
+++ b/src/treesit.c
@@ -2722,6 +2722,7 @@ DEFUN ("treesit-query-capture",
      bottleneck (98.4% of the running time spent on nconc).  */
   Lisp_Object result = Qnil;
   Lisp_Object prev_result = result;
+  Lisp_Object predicates_for_0 = NULL;
   while (ts_query_cursor_next_match (cursor, &match))
     {
       /* Record the checkpoint that we may roll back to.  */
@@ -2750,9 +2751,18 @@ DEFUN ("treesit-query-capture",
 	  result = Fcons (cap, result);
 	}
       /* Get predicates.  */
-      Lisp_Object predicates
-	= treesit_predicates_for_pattern (treesit_query,
-					  match.pattern_index);
+      Lisp_Object predicates;
+      if (match.pattern_index == 0)
+	{
+	  if (predicates_for_0 == NULL)
+	    predicates_for_0 = treesit_predicates_for_pattern (treesit_query, 0);
+
+	  predicates = predicates_for_0;
+	}
+      else
+	{
+	  predicates = treesit_predicates_for_pattern (treesit_query, match.pattern_index);
+	}
 
       /* captures_lisp = Fnreverse (captures_lisp); */
       struct capture_range captures_range = { result, prev_result };

--------------sHXGqMYlVXefA1o6sqxFpSea--
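
The message describes caching the predicate list so it is not rebuilt on every match; the attached POC only covers pattern index 0. As an illustration of the general idea, here is a minimal standalone sketch that keeps one lazily filled slot per pattern index. It is not the code from the patch and not Emacs code: predicates_t, build_predicates, predicate_cache and the cache_* functions are hypothetical stand-ins (build_predicates plays the role of treesit_predicates_for_pattern), and an explicit "filled" flag is used instead of a sentinel value. It makes no claim about performance relative to the POC or to the hash-table variant mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the predicate list of one query pattern.
   In Emacs this would be the Lisp_Object returned by
   treesit_predicates_for_pattern.  */
typedef struct
{
  size_t pattern_index;
} predicates_t;

/* Counts how often the (assumed expensive) builder runs.  */
static int build_count = 0;

/* Hypothetical builder standing in for treesit_predicates_for_pattern.  */
static predicates_t
build_predicates (size_t pattern_index)
{
  build_count++;
  predicates_t p = { pattern_index };
  return p;
}

/* One lazily filled slot per pattern index.  */
typedef struct
{
  size_t n_patterns;
  bool *filled;
  predicates_t *values;
} predicate_cache;

/* Allocate N_PATTERNS empty slots.  Return false on failure.  */
static bool
cache_init (predicate_cache *cache, size_t n_patterns)
{
  assert (cache != NULL);
  cache->n_patterns = 0;
  cache->filled = NULL;
  cache->values = NULL;
  if (n_patterns == 0)
    return false;
  cache->filled = calloc (n_patterns, sizeof *cache->filled);
  cache->values = calloc (n_patterns, sizeof *cache->values);
  if (cache->filled == NULL || cache->values == NULL)
    {
      free (cache->filled);
      free (cache->values);
      cache->filled = NULL;
      cache->values = NULL;
      return false;
    }
  cache->n_patterns = n_patterns;
  return true;
}

static void
cache_free (predicate_cache *cache)
{
  assert (cache != NULL);
  free (cache->filled);
  free (cache->values);
  cache->filled = NULL;
  cache->values = NULL;
  cache->n_patterns = 0;
}

/* Return the predicates for PATTERN_INDEX, building them at most once.
   Out-of-range indices bypass the cache and are rebuilt every time.  */
static predicates_t
cache_get (predicate_cache *cache, size_t pattern_index)
{
  assert (cache != NULL);
  if (pattern_index >= cache->n_patterns)
    return build_predicates (pattern_index);
  if (!cache->filled[pattern_index])
    {
      cache->values[pattern_index] = build_predicates (pattern_index);
      cache->filled[pattern_index] = true;
    }
  return cache->values[pattern_index];
}
```

In a match loop, the cache would be created once before the loop, queried with cache_get for each match's pattern index, and released with cache_free afterwards. With tree-sitter, the slot count could presumably be taken from ts_query_pattern_count; how cached Lisp values would be kept alive across the loop is a separate question that this sketch does not address.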