* Limits of multiline font-lock
@ 2019-09-14 17:07 Michael Heerdegen
  2019-09-15 12:28 ` Stefan Monnier
  2019-09-18 21:13 ` Adam Porter
  0 siblings, 2 replies; 10+ messages in thread
From: Michael Heerdegen @ 2019-09-14 17:07 UTC (permalink / raw)
  To: Emacs Development

Hello,

I want to provide a hi-lock-like feature for el-search patterns:
on-the-fly highlighting of any expressions matching a certain el-search
pattern.  Elisp expressions can be multiline, of course.  After reading what
the manual says about multiline font lock I'm not sure if I can use
font-lock for that.

My use case is a bit different from the existing cases because I don't
need the multiline font-lock to implement a major mode.  So ideally I
don't want to mess with buffer local font-lock variables (like
`font-lock-extend-region-functions').

I noticed that it seems to be allowed in a font-lock function
(lambda (end) ...) to look backwards, and attach the font-lock-multiline
property to text that extends to text before the font-lock search
start.  Is this correct?

I also noticed that highlighting of strings already works with something
called syntactical matching or so, so what I need seems to be already
existing but it also seems that there are no Lisp functions to reuse
this stuff.

Before I reinvent the wheel or invest unnecessary amounts of time: is it
possible to base el-search-hi-lock on font-lock?  How would I ideally
approach this?

TIA,

Michael.




* Re: Limits of multiline font-lock
  2019-09-14 17:07 Limits of multiline font-lock Michael Heerdegen
@ 2019-09-15 12:28 ` Stefan Monnier
  2019-09-15 23:13   ` Michael Heerdegen
  2019-09-18 21:13 ` Adam Porter
  1 sibling, 1 reply; 10+ messages in thread
From: Stefan Monnier @ 2019-09-15 12:28 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: Emacs Development

> I want to provide a hi-lock-like feature for el-search patterns:
> on-the-fly highlighting of any expressions matching a certain el-search
> pattern.  Elisp expressions can be multiline, of course.  After reading what
> the manual says about multiline font lock I'm not sure if I can use
> font-lock for that.

Before deciding which tools to use, I think you need to figure
out how you want to solve the fundamental difficulty:

- we're inside an arbitrary buffer with your new el-search-hi-lock
  functionality enabled for a particular pattern.
- let's assume for now that currently there's no match anywhere.
- then the user makes an edit at line N which makes the pattern match
  on lines M to M' (M < N and M' > N).

How do you plan on finding this match?

- You can scan the whole buffer from point-min, but that might be too
  costly (tho maybe it's OK if you do it from an idle timer).
- Or you can add limits like "we'll only look for it between N-5 and
  N+5" and we don't care if we don't notice new "hard to find" matches.
- Or you can rely on the earlier scans having noted that the text between M and N
  matches the beginning of the pattern and hence only need to check whether,
  after the edit, the new text starting at N matches "the rest of the
  pattern".
- ...


        Stefan





* Re: Limits of multiline font-lock
  2019-09-15 12:28 ` Stefan Monnier
@ 2019-09-15 23:13   ` Michael Heerdegen
  2019-09-16 19:00     ` Stefan Monnier
  0 siblings, 1 reply; 10+ messages in thread
From: Michael Heerdegen @ 2019-09-15 23:13 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Emacs Development

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> Before deciding which tools to use, I think you need to figure
> out how you want to solve the fundamental difficulty:
>
> - we're inside an arbitrary buffer with your new el-search-hi-lock
>   functionality enabled for a particular pattern.

Must be an emacs-lisp buffer, of course.

> - let's assume for now that currently there's no match anywhere.
> - then the user makes an edit at line N which makes the pattern match
>   on lines M to M' (M < N and M' > N).
>
> How do you plan on finding this match?

Matches are always limited by bounds of the current top-level
expression.  A complete re-search from the beg-of-defun of window-start
up to window-end after a change is sufficient and doable in acceptable
time (in your concrete scenario, I could even restrict the search to all
parent sexps the edited text is in - most of the time these will never
be more than 20 or so...these can be tested very quickly).
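
The lower bound of such a re-search can be sketched as follows.  This is
only a minimal sketch, not el-search code: the name
`my-el-hl-toplevel-start' is hypothetical, and it assumes the buffer's
syntax table lets `syntax-ppss' see the enclosing parentheses.

```elisp
;; Hypothetical helper (not part of el-search): find where a re-search
;; has to start so that it covers the whole top-level form around POS.
(defun my-el-hl-toplevel-start (pos)
  "Return the start of the top-level form containing POS, or POS if none."
  ;; `syntax-ppss' moves point to POS, hence the `save-excursion'.
  (save-excursion
    ;; Element 9 of the parser state lists the positions of all open
    ;; parens enclosing POS, outermost first.
    (let ((open-parens (nth 9 (syntax-ppss pos))))
      (if open-parens (car open-parens) pos))))
```

For a given window, the region to search would then run from
(my-el-hl-toplevel-start (window-start)) to (window-end nil t).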

I already have a prototype (not based on font-lock), and it starts
refontification only after a (tiny) idle time, and the search function
is interruptible (via throw-on-input).  When interrupted, the old
visible state is restored.
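
A minimal sketch of that idle-time, interruptible scheme, assuming the
standard `while-no-input' macro (which is built on throw-on-input) and
`run-with-idle-timer'.  All `my-el-hl-' names and the 0.2 second default
delay are hypothetical and not taken from the prototype.

```elisp
(defvar my-el-hl-timer nil
  "Idle timer for the pending re-search (hypothetical name).")

(defun my-el-hl-schedule (fn &optional delay)
  "Run FN once after Emacs has been idle for DELAY seconds (default 0.2).
A timer scheduled earlier and not yet run is cancelled first."
  (when (timerp my-el-hl-timer)
    (cancel-timer my-el-hl-timer))
  (setq my-el-hl-timer (run-with-idle-timer (or delay 0.2) nil fn)))

(defun my-el-hl-search-interruptibly (search-fn)
  "Call SEARCH-FN with no arguments, giving up as soon as input arrives.
Return a one-element list (VALUE) if SEARCH-FN finished, or nil if it
was interrupted, so that an empty result differs from an aborted one."
  ;; `while-no-input' returns t when input arrives and nil on quit;
  ;; wrapping the value in a list keeps those two cases recognizable.
  (let ((result (while-no-input (list (funcall search-fn)))))
    (and (consp result) result)))
```

When the second function returns nil, the caller would simply leave the
existing highlighting untouched, which corresponds to restoring the old
visible state.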

This works quite nicely and feels quite natural unless the search
pattern is very costly (then I currently emit a warning - the pattern
could also remove itself from the list or turn the minor mode off).

The tricky rest is stuff that font-lock already does well.  My
current implementation has problems when different (maybe overlapping)
parts of a buffer are visible in different windows, and there is a
certain risk of infinite retriggering of hi-locking, etc.

Thanks,

Michael.




* Re: Limits of multiline font-lock
  2019-09-15 23:13   ` Michael Heerdegen
@ 2019-09-16 19:00     ` Stefan Monnier
  2019-09-18  3:23       ` Michael Heerdegen
  0 siblings, 1 reply; 10+ messages in thread
From: Stefan Monnier @ 2019-09-16 19:00 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: Emacs Development

> Matches are always limited by bounds of the current top-level
> expression.  A complete re-search from the beg-of-defun of window-start
> up to window-end after a change is sufficient and doable in acceptable
> time (in your concrete scenario, I could even restrict the search to all
> parent sexps the edited text is in - most of the time these will never
> be more than 20 or so...these can be tested very quickly).

Good.

> I already have a prototype (not based on font-lock), and it starts
> refontification only after a (tiny) idle time, and the search function
> is interruptible (via throw-on-input).  When interrupted, the old
> visible state is restored.
>
> This works quite nicely and feels quite natural unless the search
> pattern is very costly (then I currently emit a warning - the pattern
> could also remove itself from the list or turn the minor mode off).

This makes it sound like you don't want to do it "synchronously" like
font-lock, but rather asynchronously (from a timer).  I'd tend to agree
(tho arguably, font-lock should also be done asynchronously ;-).

> The tricky rest is stuff that font-lock already does well.  My
> current implementation has problems when different (maybe overlapping)
> parts of a buffer are visible in different windows, and there is a
> certain risk of infinite retriggering of hi-locking, etc.

I think if you use jit-lock-register to be told about areas that need to
be (re)searched, keep them in a side-data-structure (or as
text-properties), and then in the timer you simply process this
side-data-structure, you should naturally avoid infinite retriggering of
hi-locking (assuming you're using either overlays for the actual
highlight, or you're using with-silent-modifications to avoid
needlessly retriggering jit-lock).
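
The scheme above could look roughly like this.  It is a sketch under
stated assumptions, not a finished design: the `my-el-hl-' names, the
`my-el-hl' overlay property and the MATCHER argument are hypothetical,
and the pending regions are kept in a plain buffer-local list.

```elisp
(defvar-local my-el-hl-pending nil
  "Regions (BEG . END) reported by jit-lock and not yet searched.")

(defun my-el-hl-note-region (beg end)
  "Jit-lock function: remember BEG..END for later, highlight nothing now."
  (push (cons beg end) my-el-hl-pending)
  nil)

(defun my-el-hl-process-pending (matcher)
  "Highlight matches in all pending regions of the current buffer.
MATCHER is called with BEG and END and must return a list of
\(MBEG . MEND) match bounds."
  (let ((regions my-el-hl-pending))
    (setq my-el-hl-pending nil)
    (dolist (region regions)
      ;; Recorded positions may be stale after edits, so clamp them.
      (let ((beg (max (car region) (point-min)))
            (end (min (cdr region) (point-max))))
        (when (< beg end)
          ;; Overlays leave the buffer text and its text properties
          ;; alone, so adding them does not retrigger jit-lock.
          (remove-overlays beg end 'my-el-hl t)
          (dolist (match (funcall matcher beg end))
            (let ((ov (make-overlay (car match) (cdr match))))
              (overlay-put ov 'my-el-hl t)
              (overlay-put ov 'face 'highlight))))))))
```

A minor mode would call (jit-lock-register #'my-el-hl-note-region) when
enabled and `jit-lock-unregister' when disabled, and the timer function
would call `my-el-hl-process-pending' in the affected buffer.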


        Stefan





* Re: Limits of multiline font-lock
  2019-09-16 19:00     ` Stefan Monnier
@ 2019-09-18  3:23       ` Michael Heerdegen
  0 siblings, 0 replies; 10+ messages in thread
From: Michael Heerdegen @ 2019-09-18  3:23 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Emacs Development

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> I think if you use jit-lock-register to be told about areas that need to
> be (re)searched, keep them in a side-data-structure (or as
> text-properties), and then in the timer you simply process this
> side-data-structure, you should naturally avoid infinite retriggering of
> hi-locking (assuming you're using either overlays for the actual
> highlight, or you're using with-silent-modifications to avoid
> needlessly retriggering jit-lock).

I think I will do exactly that!  For some reason I forgot about
jit-lock-register.

Thanks,

Michael.




* Re: Limits of multiline font-lock
  2019-09-14 17:07 Limits of multiline font-lock Michael Heerdegen
  2019-09-15 12:28 ` Stefan Monnier
@ 2019-09-18 21:13 ` Adam Porter
  2019-09-19  2:05   ` Michael Heerdegen
  1 sibling, 1 reply; 10+ messages in thread
From: Adam Porter @ 2019-09-18 21:13 UTC (permalink / raw)
  To: emacs-devel

Michael Heerdegen <michael_heerdegen@web.de> writes:

> Hello,
>
> I want to provide a hi-lock-like feature for el-search patterns:
> on-the-fly highlighting of any expressions matching a certain el-search
> pattern.  Elisp expressions can be multiline, of course.  After reading what
> the manual says about multiline font lock I'm not sure if I can use
> font-lock for that.
>
> My use case is a bit different from the existing cases because I don't
> need the multiline font-lock to implement a major mode.  So ideally I
> don't want to mess with buffer local font-lock variables (like
> `font-lock-extend-region-functions').
>
> I noticed that it seems to be allowed in a font-lock function
> (lambda (end) ...) to look backwards, and attach the font-lock-multiline
> property to text that extends to text before the font-lock search
> start.  Is this correct?
>
> I also noticed that highlighting of strings already works with something
> called syntactical matching or so, so what I need seems to be already
> existing but it also seems that there are no Lisp functions to reuse
> this stuff.
>
> Before I reinvent the wheel or invest unnecessary amounts of time: is it
> possible to base el-search-hi-lock on font-lock?  How would I ideally
> approach this?
>
> TIA,
>
> Michael.

Hi Michael,

You might be interested in this package I published recently.  It
implements depth-based syntax highlighting for Lisp and some other
languages.

https://github.com/alphapapa/prism.el

I had to deal with similar issues about multiline font-locking.  After
reading the manual section about it a few times, I managed to come up
with a solution that works fairly well, although I'm sure it's quite
primitive: I add a function to font-lock-extend-region-functions which
extends the font-lock region forward and backwards before the matching
function is called.  I don't know if it's the optimal way to do it--the
manual mentioned that there are a few ways--but it seems to work.
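
For illustration, a region-extending function of this kind might look as
follows.  This is a minimal sketch and not prism's actual code: the
`my-fl-' names are hypothetical, and extending to whole top-level forms
is just one possible policy.

```elisp
;; Bound dynamically by font-lock while it runs the hook below.
(defvar font-lock-beg)
(defvar font-lock-end)

(defun my-fl-toplevel-bounds (beg end)
  "Return (BEG2 . END2): BEG..END grown to cover whole top-level forms."
  (save-excursion
    ;; Element 9 of the parser state holds the enclosing open parens,
    ;; outermost first.
    (let* ((beg-open (car (nth 9 (syntax-ppss beg))))
           (end-open (car (nth 9 (syntax-ppss end))))
           ;; Position just after the matching close paren, or nil if
           ;; the form is unbalanced.
           (end-close (and end-open
                           (ignore-errors (scan-sexps end-open 1)))))
      (cons (or beg-open beg)
            (if end-close (max end end-close) end)))))

(defun my-fl-extend-region ()
  "Extend the font-lock region to whole top-level forms.
Return non-nil if the region changed, as the hook requires."
  (let ((bounds (my-fl-toplevel-bounds font-lock-beg font-lock-end)))
    (prog1 (or (< (car bounds) font-lock-beg)
               (> (cdr bounds) font-lock-end))
      (setq font-lock-beg (car bounds)
            font-lock-end (cdr bounds)))))
```

It would be installed buffer-locally with
(add-hook 'font-lock-extend-region-functions #'my-fl-extend-region nil t).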

However, I have discovered a performance issue in the case of sexps that
span large portions of the buffer (e.g. in my init files, I have some
large use-package forms that contain many functions and span hundreds of
lines).  If I could solve that, it would be great, but it works fine for
most code.

Please let me know if you have any suggestions.  Sometimes font-locking
feels like an arcane art.  :)

Thanks,
Adam





* Re: Limits of multiline font-lock
  2019-09-18 21:13 ` Adam Porter
@ 2019-09-19  2:05   ` Michael Heerdegen
  2019-09-19  2:36     ` Adam Porter
  2023-10-07  7:30     ` Adam Porter
  0 siblings, 2 replies; 10+ messages in thread
From: Michael Heerdegen @ 2019-09-19  2:05 UTC (permalink / raw)
  To: Adam Porter; +Cc: emacs-devel

Adam Porter <adam@alphapapa.net> writes:

> You might be interested in this package I published recently.  It
> implements depth-based syntax highlighting for Lisp and some other
> languages.
>
> https://github.com/alphapapa/prism.el

Nice.  Could it go to GNU ELPA?

> I had to deal with similar issues about multiline font-locking.  After
> reading the manual section about it a few times, I managed to come up
> with a solution that works fairly well, although I'm sure it's quite
> primitive: I add a function to font-lock-extend-region-functions which
> extends the font-lock region forward and backwards before the matching
> function is called.  I don't know if it's the optimal way to do it--the
> manual mentioned that there are a few ways--but it seems to work.
>
> However, I have discovered a performance issue in the case of sexps that
> span large portions of the buffer (e.g. in my init files, I have some
> large use-package forms that contain many functions and span hundreds of
> lines).  If I could solve that, it would be great, but it works fine for
> most code.

If you use `font-lock-extend-region-functions', all of font-lock uses
the extended region, right?  I guess basing your functionality on
jit-lock-register could be better.  If finding the beginning-of-defun
and identifying the levels is what causes the main cost, it wouldn't
help much, however.

My use case is a bit simpler since I only have to deal with Lisp.  What
modes does prism support btw?  What are reasons why some languages are
not supported?

> Please let me know if you have any suggestions.  Sometimes font-locking
> feels like an arcane art.  :)

Ok, I've not come that far yet ;-)


Michael.




* Re: Limits of multiline font-lock
  2019-09-19  2:05   ` Michael Heerdegen
@ 2019-09-19  2:36     ` Adam Porter
  2023-10-07  7:30     ` Adam Porter
  1 sibling, 0 replies; 10+ messages in thread
From: Adam Porter @ 2019-09-19  2:36 UTC (permalink / raw)
  To: emacs-devel

Michael Heerdegen <michael_heerdegen@web.de> writes:

> Adam Porter <adam@alphapapa.net> writes:
>
>> You might be interested in this package I published recently.  It
>> implements depth-based syntax highlighting for Lisp and some other
>> languages.
>>
>> https://github.com/alphapapa/prism.el
>
> Nice.  Could it go to GNU ELPA?

Well, I have done the copyright assignment (CA), but none of my packages
are in ELPA yet, just in my GitHub repos and MELPA.  I'm not necessarily
opposed to putting Prism in ELPA someday, but it probably needs to be
more mature before considering that.

> If you use `font-lock-extend-region-functions', all of font-lock uses
> the extended region, right?

Ah, that hadn't occurred to me!  Thanks!

> I guess basing your functionality on jit-lock-register could be
> better.

Thanks, I'll have to look into that.

> If finding the beginning-of-defun and identifying the levels
> is what causes the main cost, it wouldn't help much, however.

I think the problem, when it happens, is that too much code is being
re-fontified when the buffer changes, which I think is due to the region
extension.  It's only noticeable on very large sexps.  It occurred to me
that, for Lisp especially, extending the region might not be necessary,
because syntax-ppss determines the correct depth regardless.  I have a
WIP branch that does some refactoring along those lines, but it needs
more work.
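
As an illustration of that point (a sketch, not the code in prism; the
name `my-depth-at' is hypothetical), the nesting depth at any position
is available from the parser state without extending any region:

```elisp
(defun my-depth-at (pos)
  "Return the paren nesting depth at POS in the current buffer."
  ;; `syntax-ppss' moves point to POS; element 0 of its result is the
  ;; depth in parens, parsed from the start of the buffer (with caching).
  (save-excursion
    (car (syntax-ppss pos))))
```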

> My use case is a bit simpler since I only have to deal with Lisp.  What
> modes does prism support btw?  What are reasons why some languages are
> not supported?

Well, the documentation covers most of that.  There's a mode for Lisp
and C-style languages, and a prism-whitespace-mode for
significant-whitespace languages like Python and Shell (in which depth
is determined by both indentation and paren-type characters).  See the
screenshots for examples.  :)

I haven't determined that any particular languages are not supported.
The modes can be activated in any buffer.  Whether prism-mode or
prism-whitespace-mode work effectively is, AFAIK, determined by the
major mode's syntax tables.  I'd be glad if more users could report
whether it works with other languages that I don't use.

>> Please let me know if you have any suggestions.  Sometimes font-locking
>> feels like an arcane art.  :)
>
> Ok, I've not come that far yet ;-)

You've already helped me!  :)





* Re: Limits of multiline font-lock
  2019-09-19  2:05   ` Michael Heerdegen
  2019-09-19  2:36     ` Adam Porter
@ 2023-10-07  7:30     ` Adam Porter
  2023-10-14  4:06       ` Michael Heerdegen
  1 sibling, 1 reply; 10+ messages in thread
From: Adam Porter @ 2023-10-07  7:30 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: emacs-devel

Hi Michael,

Please forgive the "blast from the past"; going through some old mail 
from the list I saw this message of yours that I missed.

On 9/18/19 21:05, Michael Heerdegen wrote:
> Adam Porter <adam@alphapapa.net> writes:
> 
>> You might be interested in this package I published recently.  It
>> implements depth-based syntax highlighting for Lisp and some other
>> languages.
>>
>> https://github.com/alphapapa/prism.el
> 
> Nice.  Could it go to GNU ELPA?

Yes, I think I will submit it to GNU ELPA one of these days.  It's more 
mature now than it was then.

>> However, I have discovered a performance issue in the case of sexps that
>> span large portions of the buffer (e.g. in my init files, I have some
>> large use-package forms that contain many functions and span hundreds of
>> lines).  If I could solve that, it would be great, but it works fine for
>> most code.
> 
> If you use `font-lock-extend-region-functions', all of font-lock uses
> the extended region, right?  I guess basing your functionality on
> jit-lock-register could be better.  If finding the beginning-of-defun
> and identifying the levels is what causes the main cost, it wouldn't
> help much, however.

IIRC that performance issue turned out to be a bug elsewhere in the 
code; once solved, the issue with large forms spanning many lines was no 
longer a problem.

> My use case is a bit simpler since I only have to deal with Lisp.  What
> modes does prism support btw?  What are reasons why some languages are
> not supported?

Prism has two modes, one for whitespace-significant languages and one 
for all others.  It seems to generally work well, especially since some 
recent bug fixes.

The liability, to the extent that there is one, is that syntax tables 
can affect how delimiters and comments are detected, and some major 
modes may not use them in a way that makes such detection possible, e.g. 
using Emacs regexps' syntax types and syntax-ppss parsing.

Now that treesitter is in Emacs, I'm guessing that it might be helpful 
as a backend for some languages, so I may look into that in the future.




* Re: Limits of multiline font-lock
  2023-10-07  7:30     ` Adam Porter
@ 2023-10-14  4:06       ` Michael Heerdegen
  0 siblings, 0 replies; 10+ messages in thread
From: Michael Heerdegen @ 2023-10-14  4:06 UTC (permalink / raw)
  To: Adam Porter; +Cc: emacs-devel

Adam Porter <adam@alphapapa.net> writes:

> Please forgive the "blast from the past"; going through some old mail
> from the list I saw this message of yours that I missed.

Thank you for letting me know.

> Yes, I think I will submit it to GNU ELPA one of these days.  It's
> more mature now than it was then.

It's there.  Installed.

> IIRC that performance issue turned out to be a bug elsewhere in the
> code; once solved, the issue with large forms spanning many lines was
> no longer a problem.

Ok, that's good.

> Now that treesitter is in Emacs, I'm guessing that it might be helpful
> as a backend for some languages, so I may look into that in the
> future.

I've never tried that treesitter stuff; I'm mostly using Elisp, so I guess
it's not of much value for me.

Anyway, thanks for your reply, I'll definitely try out the new GNU ELPA
version.


Regards,

Michael.




