bug#71345: Feature: unleash font-lock's secret weapon; handle Qfontified = non-nil

From: JD Smith <jdtsmith@gmail.com>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: dmitry@gutov.dev, 71345@debbugs.gnu.org
Subject: bug#71345: Feature: unleash font-lock's secret weapon; handle Qfontified = non-nil
Date: Tue, 4 Jun 2024 11:38:05 -0400
Message-ID: <798B70AF-69BD-479E-992E-5CE9B4924820@gmail.com>
In-Reply-To: <jwvmso0951x.fsf-monnier+emacs@gnu.org>

> On Jun 4, 2024, at 10:15 AM, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> 
>>>> That starts to sound like a lot of property slinging, which might even
>>>> dominate the work done.
>>> Indeed, this amount of work could become significant.  It's my main
>>> worry, but I don't have a clear feel for how serious it would be
>>> in practice.
>> In my situation, the most likely scenario is that fontified=nil is noticed
>> during redisplay when there is a fairly large stretch of already-fontified
>> property having the same value.  So jit-lock-fontify-now will quickly find
>> a nice large chunk to call my FONTIFICATION-FUNCTION=F-F with.
>>
>> Since jit-lock-after-change will likely clear away already-fontified and set
>> fontified=nil, a single additional F-F on top of jit-lock-function will
>> probably be very well handled.  A good question is how it would scale with
>> more functions all operating in the same region.  One idea is to rig up
>> a test file, do some fake jit-lock-flushing on it, and check performance of
>> just subtracting/searching/dividing the already-fontified property as you
>> add more (fake) F-F's.   For me, jit-lock-fontify-now of a 2500 char chunk
>> in a heavy treesitter buffer is in the 2-5ms range.  Individual F-F's could
>> be much lighter weight.
> 
> I must say that I can't follow you.  I suspect we're not talking about
> quite the same thing.  Could you clarify what is the costs you imagine
> could be significant?  What you compare it to?

Apologies for the lack of clarity.  Here I was revisiting the notion that "this amount of work could become significant."  I was trying to convey that the costs of i) applying the proposed jit-lock-already-fontified property (with subtraction, as in your original idea), and ii) parsing it into regions in jit-lock-fontify-now might in fact be fairly minimal for my situation, by which I mean just two fontification functions: font-lock-fontify-region plus my-special-fontify-region.

In other words, in many cases there would in fact not be much property management work.  This leads naturally to considering more complicated cases, with several additional fontification functions all interoperating.  There the property work will grow quickly (though I also outlined some ideas to keep it under control, which have probably already occurred to you).

> You seem to be comparing "a single big jit-lock backend" vs "several
> jit-lock backends", which is a completely different worry from mine.

This is indeed the implicit comparison I'm making: 

  i) the current situation of a single big backend which redoes EVERYTHING as potentially large regions are invalidated, with much of its work done unnecessarily, vs.
  ii) multiple backends used for more targeted & orthogonal updates, at the cost of additional property management in jit-lock.

As long as the additional property management costs are well below the savings you reap from not having repeated the unnecessary work, this would be a positive outcome.  The 2-5ms I mention is the cost for me of running "one large backend" over one chunk — namely font-lock-fontify-region with treesitter backing.  In my scenario of bar updates resulting from point motion, this represents purely wasted work.  So if the additional "property management" costs per chunk are, say, 100x below that, you are safely in "well worth it" territory.
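
To put a rough number on that threshold, here is a back-of-envelope calculation in Python.  The 2-5ms figure and the 100x factor are the ones quoted above; the variable names are only for illustration, and nothing here is a measurement.

```python
# Back-of-envelope budget for per-chunk property management.
# Inputs are the figures quoted in this message; nothing is measured here.
full_refontify_ms = (2.0, 5.0)  # one "large backend" pass over one chunk
target_factor = 100             # "say, 100x below that"

# Convert ms to microseconds and divide by the target factor.
budget_us = tuple(ms * 1000 / target_factor for ms in full_refontify_ms)

print(f"property-management budget per chunk: "
      f"{budget_us[0]:.0f}-{budget_us[1]:.0f} microseconds")
```

So under these assumptions the extra bookkeeping would need to stay in the tens of microseconds per chunk.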

> Splitting a backend into several backends comes with many more issues
> (such as the issue of fighting over which one controls which properties,
> or removing internal dependencies such that none of them needs to look
> at the properties set by the others, ...) but that seems largely
> orthogonal to the question at hand: if you want to be able to refresh
> the position-dependent highlighting separately from the rest of the
> highlighting you need that position-dependent highlighting to be
> independent anyway (e.g. you need to be able to remove it without
> affecting the position-independent highlighting).

Agreed, that could be an issue.  In practice, keyword-based fontification can lead to these same sorts of conflicts for non-trivial FACE forms too.  So backends would need to ensure the changes they make in the buffer are interoperable with the other likely backends (in particular font-lock).

This also raises the question of what should happen after a change.  In my view, a change should wipe the slate fully clean in the changed region.  This means other backends would still need to add to font-lock-extra-managed-props any unusual properties they will apply (or do the equivalent on their own during unfontify).  And the order of backend registration would be significant, with the last one having "the final word".  Context re-fontification is a special case of this: some backends could ignore it, others would need to be re-run, and each backend would have to decide that for itself.

>> But things like `text-property-any' will be quickly defeated by the
>> combinatorics of a large F-F set.
> 
> `text-property-any` only tests `eq`ness so it works just as quickly with
> a property made up of a million-element list as with a property made of
> a boolean.
> 
> IOW, I again can't follow you.

I was referring to the number of such lists, not the speed of testing them.  Imagine a scenario as follows: 4 different backends are all operating over the same region — F (for normal font-lock), A, B, and C.  As various invalidation events occur and backends call jit-lock-flush, a given region of text may accumulate a patchwork of already-fontified lists (here assuming F always wipes the slate clean as it works, and therefore always appears on the already-fontified list):

'(F) '(F A) '(F B) '(F C) '(F A B)  '(F A C) '(F B C)  '(F A B C)

So jit-lock-fontify-now's job has gotten quite challenging, as it decides over what region to apply a particular backend, say A.  To know whether it can skip A, it must either look inside all the lists to see if there's an A, or it must look for lists `eq` to all possible combinations which contain A.  
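
The growth in the number of distinct lists can be counted with a small Python sketch.  This is only a toy enumeration of the scenario above (F always present, A/B/C optional); the function name is made up for illustration and has nothing to do with jit-lock's actual code.

```python
from itertools import combinations

def possible_done_lists(always_first, optional):
    """Every distinct already-fontified list: ALWAYS_FIRST followed by
    any subset of OPTIONAL (order within the subset kept fixed)."""
    return [(always_first, *subset)
            for r in range(len(optional) + 1)
            for subset in combinations(optional, r)]

lists = possible_done_lists("F", ["A", "B", "C"])
with_a = [done for done in lists if "A" in done]

print(len(lists))   # distinct list values that may appear in the region
print(len(with_a))  # how many of them an eq-style search for "A done" must try
```

With n optional backends there are 2**n possible lists, and 2**(n-1) of them contain any given backend, so the number of `eq` searches doubles with each backend added.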

It's possible you've already conceived of this and have a solution in mind; apologies if so.  My simple solution to this was to let the properties themselves (one per backend) constitute the list of already-done/pending backends.  Then it's much easier to ask "is A already fontified everywhere in this block?"

>> So here's an idea.  You could invert the logic, and have a set of
>> `fontified-pending' properties which jit-lock-flush adds to as it sets
>> fontified=nil,
> 
> Yes, of course, we could use the complement set.

The distinct idea here was to map each backend to an individual property, in place of the idea of a single property holding a list of already-done or pending backends, with the aim of significantly reducing property management costs.  That's really just an implementation detail though.  
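
A toy Python model of the one-property-per-backend idea, to show why the "is A pending anywhere in this block?" question becomes a single scan.  Everything here is an assumption made for illustration: the buffer is modeled as one property dict per character (Emacs actually stores properties on intervals), the `pending-A` style property names are hypothetical, and `text_property_any` is only a stand-in that mimics the identity test of the real `text-property-any`.

```python
def text_property_any(chars, start, end, prop, value):
    """Toy stand-in for `text-property-any`: first position in
    [start, end) whose PROP is identical to VALUE, else None."""
    for pos in range(start, end):
        if chars[pos].get(prop) is value:
            return pos
    return None

def flush(chars, start, end, backend):
    """Model a backend invalidating [start, end): it marks only its own
    (hypothetical) pending property, leaving other backends untouched."""
    for pos in range(start, end):
        chars[pos][f"pending-{backend}"] = True

def needs_backend(chars, start, end, backend):
    """True if BACKEND is pending anywhere in [start, end)."""
    hit = text_property_any(chars, start, end, f"pending-{backend}", True)
    return hit is not None

chars = [{} for _ in range(10)]  # a 10-character "buffer"
flush(chars, 2, 5, "A")          # A invalidates positions 2..4
flush(chars, 4, 8, "B")          # B invalidates positions 4..7

print(needs_backend(chars, 0, 2, "A"))   # False: nothing pending for A there
print(needs_backend(chars, 0, 10, "A"))  # True
print(needs_backend(chars, 5, 10, "A"))  # False
print(needs_backend(chars, 5, 10, "B"))  # True
```

The point of the sketch is that overlapping flushes by A and B never create a combined value that has to be decoded; each backend's question is answered from its own property alone.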

I think your concern about backend priority, and the related issue of how after-change and contextual refontification are handled, is probably more important to sort out.
