all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: Dmitry Gutov <dgutov@yandex.ru>
To: Eli Zaretskii <eliz@gnu.org>
Cc: casouri@gmail.com, 60953@debbugs.gnu.org
Subject: bug#60953: The :match predicate with large regexp in tree-sitter font-lock seems inefficient
Date: Thu, 26 Jan 2023 01:21:08 +0200	[thread overview]
Message-ID: <01b5d074-fb12-6b1f-cbfb-5e759833b854@yandex.ru> (raw)
In-Reply-To: <83wn5ag4nc.fsf@gnu.org>

On 25/01/2023 14:49, Eli Zaretskii wrote:
>> Cc: 60953@debbugs.gnu.org
>> Date: Wed, 25 Jan 2023 05:48:13 +0200
>> From: Dmitry Gutov <dgutov@yandex.ru>
>>
>>> We can probably match the regexp in-place, just limit the match to the range of the node.
>>
>> That's what I tried to do in the patch attached to the first message:
>> https://debbugs.gnu.org/cgi/bugreport.cgi?bug=60953#5
>>
>> But the effect on performance was surprisingly hard to notice. It also
>> broke the actual highlighting, but that's probably because the regexp
>> uses anchors \` and \', which don't really work for fast_looking_at
>> calls inside a buffer.
> 
> The condition for a match in that patch is not correct, AFAIU:
> 
>     if (val >= 0)
>       return true;
>     else
>       return false;
> 
> It should be "if (val > 0)" instead, since fast_looking_at returns the
> number of characters that matched (unlike fast_string_match in the
> original code, which returns the _index_ of the match).

Thank you. Unfortunately, the performance improvement from this patch is 
still fairly negligible.

Even though I got the highlighting to work -- by removing the \` and \' 
anchors from ruby-ts--builtin-methods (reducing the precision a little, 
but that's not important for the benchmark).

Switching to using :pred with function (like I did in commit 
d94dc606a0934) which still uses buffer-substring inside is significantly 
faster.

> Also, fast_string_match is capable of succeeding if the match begins
> not at the first character, whereas fast_looking_at does an anchored
> match.  Do we expect the text to match from its beginning in this
> case?  If not, I think the replacement didn't do what the original
> code does, and you should have used search_buffer or maybe
> search_buffer_re instead.

I suppose one could use a non-anchored regexp with :match, but that's 
not the case with the regexp I'm using currently.

Anyway, that's only going to be important if we find something that I 
missed here with this patch. Because otherwise the major bottleneck is 
somewhere else.

If we do end up using it and try to get it to 100% correctness, I 
suppose a combination of narrow-to-region (so that the \` and \' anchors 
work) with re-search-forward can do the trick.

Although I've tried using that combination inside 
ruby-ts--builtin-method-p (to avoid the buffer-substring call), and it 
wasn't much of an improvement in performance either.





  reply	other threads:[~2023-01-25 23:21 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-20  3:53 bug#60953: The :match predicate with large regexp in tree-sitter font-lock seems inefficient Dmitry Gutov
2023-01-24  4:04 ` Dmitry Gutov
2023-01-25  3:13   ` Yuan Fu
2023-01-25  3:48     ` Dmitry Gutov
2023-01-25 12:49       ` Eli Zaretskii
2023-01-25 23:21         ` Dmitry Gutov [this message]
2023-01-26  6:50           ` Eli Zaretskii
2023-01-26  7:17             ` Yuan Fu
2023-01-26  8:10               ` Eli Zaretskii
2023-01-26 17:15                 ` Dmitry Gutov
2023-01-26 18:24                   ` Eli Zaretskii
2023-01-26 19:35                     ` Dmitry Gutov
2023-01-26 20:01                       ` Eli Zaretskii
2023-01-26 21:26                         ` Dmitry Gutov
2023-01-30  0:49                           ` Dmitry Gutov
2023-01-30 14:06                             ` Eli Zaretskii
2023-01-30 14:47                               ` Dmitry Gutov
2023-01-30 15:08                                 ` Eli Zaretskii
2023-01-30 17:15                                   ` Dmitry Gutov
2023-01-30 17:49                                     ` Eli Zaretskii
2023-01-30 18:20                                       ` Dmitry Gutov
2023-01-30 18:42                                         ` Eli Zaretskii
2023-01-30 19:01                                           ` Dmitry Gutov
2023-01-30 19:05                                             ` Eli Zaretskii
2023-01-30 19:58                                               ` Dmitry Gutov
2023-01-30 23:57                                                 ` Yuan Fu
2023-01-31  0:44                                                   ` Dmitry Gutov
2023-01-31  3:23                                                 ` Eli Zaretskii
2023-01-31 18:16                                                   ` Dmitry Gutov
2023-02-01  2:39                                                     ` Dmitry Gutov
2023-02-01 13:39                                                       ` Eli Zaretskii
2023-02-01 15:13                                                         ` Dmitry Gutov
2023-02-01 21:20                                                         ` Dmitry Gutov
2023-02-02  2:16                                                           ` Yuan Fu
2023-02-02  6:34                                                           ` Eli Zaretskii
2023-02-02 12:12                                                             ` Dmitry Gutov
2023-02-02 14:23                                                               ` Eli Zaretskii
2023-02-02 17:03                                                                 ` Dmitry Gutov
2023-02-02 17:26                                                                   ` Eli Zaretskii
2023-02-02 17:53                                                                     ` Dmitry Gutov
2023-02-02 18:03                                                                       ` Eli Zaretskii
2023-02-02 19:44                                                                         ` Dmitry Gutov
2023-02-01 13:10                                                     ` Eli Zaretskii
2023-02-01 15:15                                                       ` Dmitry Gutov
2023-01-26 17:12               ` Dmitry Gutov
2023-01-26 18:07             ` Dmitry Gutov
2023-01-26 20:46               ` Dmitry Gutov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=01b5d074-fb12-6b1f-cbfb-5e759833b854@yandex.ru \
    --to=dgutov@yandex.ru \
    --cc=60953@debbugs.gnu.org \
    --cc=casouri@gmail.com \
    --cc=eliz@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.