* Font lock question
From: Joost Kremers @ 2021-03-17 9:40 UTC (permalink / raw)
To: help-gnu-emacs
Hi list,
I'm trying to add font lock rules to a package of mine and I'm running into an
issue that I can't figure out how to solve. Schematically, I'm trying to match
instances of the following type:
[aBc]
Here, the square brackets are obligatory and so is the `B`. `a` and `c` are
optional. `B` can be recognised by its form, `a` and `c` can only be recognised
by the fact that they are in between the opening/closing bracket and `B`.
So far, so good. But the complication is that this pattern of `aBc` can be
repeated *within the brackets*, where each instance is separated by a semicolon:
[aBc; aBc; aBc]
There is no (theoretical) limit to the number of repetitions.
I have no trouble composing a regex that matches this entire thing. I'd add a
shy group that includes an optional semicolon and that is repeated at least
once:
\[\(?:\(?1:.*?\)B\(?2:.*?\);?\)+\]
However, this does not work for font-lock, it seems.
At this point, I'm unsure what to try next. Is there a way to deal with such
patterns?
TIA
Joost
--
Joost Kremers
Life has its moments
* Re: Font lock question
From: Stefan Monnier @ 2021-03-17 13:39 UTC (permalink / raw)
To: help-gnu-emacs
> I have no trouble composing a regex that matches this entire thing. I'd add a
> shy group that includes an optional semicolon and that is repeated at least
> once:
>
> \[\(?:\(?1:.*?\)B\(?2:.*?\);?\)+\]
>
> However, this does not work for font-lock, it seems.
Can you clarify what you mean by "does not work"?
Stefan
* Re: Font lock question
From: Joost Kremers @ 2021-03-17 13:46 UTC (permalink / raw)
To: help-gnu-emacs
On Wed, Mar 17 2021, Stefan Monnier wrote:
>> I have no trouble composing a regex that matches this entire thing. I'd add a
>> shy group that includes an optional semicolon and that is repeated at least
>> once:
>>
>> \[\(?:\(?1:.*?\)B\(?2:.*?\);?\)+\]
>>
>> However, this does not work for font-lock, it seems.
>
> Can you clarify what you mean by "does not work"?
Of course. :-) What I mean is that in my test case, e.g., with the following
text in the buffer:
[aBc; aBc; aBc]
only the final `aBc` has the right font-lock faces applied to it. The first two
occurrences of `aBc` have no special face.
I'm testing with the following:
```
(font-lock-add-keywords nil
  '(("\\[\\(?:\\(?1:[[:alnum:]]*?\\)\\(?2:B\\)\\(?3:[[:alnum:]]*?\\)[;[:blank:]]*?\\)+\\]"
     (1 font-lock-warning-face)
     (2 font-lock-keyword-face)
     (3 font-lock-warning-face))))
```
This correctly fontifies [aBc] but not the sequence above, [aBc; aBc; aBc].
TIA
--
Joost Kremers
Life has its moments
* Re: Font lock question
From: Stefan Monnier @ 2021-03-17 15:33 UTC (permalink / raw)
To: help-gnu-emacs
>>> I have no trouble composing a regex that matches this entire thing. I'd add a
>>> shy group that includes an optional semicolon and that is repeated at least
>>> once:
>>>
>>> \[\(?:\(?1:.*?\)B\(?2:.*?\);?\)+\]
>>>
>>> However, this does not work for font-lock, it seems.
>>
>> Can you clarify what you mean by "does not work"?
>
> Of course. :-) What I mean is that in my test case, e.g., with the following
> text in the buffer:
>
> [aBc; aBc; aBc]
>
> only the final `aBc` has the right font-lock faces applied to it. The first two
> occurrences of `aBc` have no special face.
Ah I see: Emacs's regular expression engine is like that of POSIX and
most other languages: it only keeps track of the last repetition of
a given subgroup in the match-data it returns.
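The point can be checked outside font-lock with a small sketch. The function name `my/last-repetition-demo` and the sample string are made up for illustration; the regexp is the one from the test case quoted below.

```elisp
;; Illustrative only: the sample string uses three distinguishable
;; repetitions so it is visible which one ends up in the match data.
(defun my/last-repetition-demo ()
  "Return the text of groups 1-3 after matching a repeated shy group."
  (let ((case-fold-search nil)          ; keep the literal B case-sensitive
        (s "[aBc; dBe; xBy]")
        (re (concat "\\[\\(?:\\(?1:[[:alnum:]]*?\\)\\(?2:B\\)"
                    "\\(?3:[[:alnum:]]*?\\)[;[:blank:]]*?\\)+\\]")))
    (when (string-match re s)
      ;; Each group number has a single slot in the match data, so only
      ;; the last repetition (xBy) is still available here.
      (list (match-string 1 s) (match-string 2 s) (match-string 3 s)))))

(my/last-repetition-demo)               ; => ("x" "B" "y")
```

Font-lock applies faces from this same match data, which would explain why only the final `aBc` gets fontified.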
> I'm testing with the following:
>
> ```
> (font-lock-add-keywords nil
> '(("\\[\\(?:\\(?1:[[:alnum:]]*?\\)\\(?2:B\\)\\(?3:[[:alnum:]]*?\\)[;[:blank:]]*?\\)+\\]"
> (1 font-lock-warning-face)
> (2 font-lock-keyword-face)
> (3 font-lock-warning-face))))
Indeed that won't do what you want. You're going to have to either
manually loop through all the matches of aBc within the brackets, or
use font-lock-keywords's `MATCH-ANCHORED` to do that for you.
But be aware that these approaches still won't work well if the [...]
thing can span multiple lines.
Stefan
* Re: Font lock question
From: Joost Kremers @ 2021-03-17 16:09 UTC (permalink / raw)
To: Stefan Monnier; +Cc: help-gnu-emacs
On Wed, Mar 17 2021, Stefan Monnier wrote:
> Indeed that won't do what you want. You're going to have to either
> manually loop through all the matches of aBc within the brackets, or
> use font-lock-keywords's `MATCH-ANCHORED` to do that for you.
Yes, I was afraid of that. I haven't quite figured out how that works... :D I'll
need to find some time to dive into font-lock a bit more.
> But be aware that these approaches still won't work well if the [...]
> thing can span multiple lines.
Hmm, I'm afraid that is indeed possible. Perhaps I'll keep it simple and only
font-lock the `B` part, since it can't contain any spaces and is fairly easy to
do.
Thanks for your reply.
--
Joost Kremers
Life has its moments
* Re: Font lock question
From: Stefan Monnier @ 2021-03-17 16:46 UTC (permalink / raw)
To: Joost Kremers; +Cc: help-gnu-emacs
>> But be aware that these approaches still won't work well if the [...]
>> thing can span multiple lines.
> Hmm, I'm afraid that is indeed possible. Perhaps I'll keep it simple and only
> font-lock the `B` part, since it can't contain any spaces and is fairly easy to
> do.
Can there be [...] without any `aBc`s in them and which need to be
font-locked differently?
If not (and if it shouldn't span too many lines), then I think
a MATCH-ANCHORED that just matches on the opening "[" and then uses the
PRE-MATCH-FORM to return the position of the matching "]" should work
acceptably, as long as you use `font-lock-multiline`.
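A rough sketch of what this could look like, under some assumptions: the names `my/bracket-limit`, `my/abc-keywords` and `my/abc-font-lock-setup` are hypothetical, the two-line lookahead is an arbitrary choice, and the inner regexp is simply the schematic `aBc` pattern from earlier in the thread. It has not been tried against real Markdown and will also fire on other bracketed constructs.

```elisp
(defun my/bracket-limit ()
  "Return the position just after the \"]\" closing the current bracket.
Point is assumed to be right after the opening \"[\".  If no \"]\"
is found on this line or the next two, return the end of the line."
  (save-excursion
    (if (search-forward "]" (line-end-position 3) t)
        (point)
      (line-end-position))))

(defvar my/abc-keywords
  '(("\\["                              ; MATCHER: the anchor
     ("\\([[:alnum:]]*?\\)\\(B\\)\\([[:alnum:]]*\\)" ; anchored matcher
      (my/bracket-limit)                ; PRE-MATCH-FORM: search limit
      nil                               ; POST-MATCH-FORM
      (1 font-lock-warning-face)
      (2 font-lock-keyword-face)
      (3 font-lock-warning-face))))
  "Sketch of an anchored rule that fontifies every aBc inside [...].")

(defun my/abc-font-lock-setup ()
  "Install `my/abc-keywords' in the current buffer."
  ;; Needed so that a limit beyond the end of the line is honoured.
  (setq-local font-lock-multiline t)
  (font-lock-add-keywords nil my/abc-keywords)
  (font-lock-flush))
```

The anchored matcher is searched for repeatedly up to the limit returned by the PRE-MATCH-FORM, so each `aBc` gets its own match data and therefore its own faces.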
Stefan
* Re: Font lock question
From: Joost Kremers @ 2021-03-18 19:41 UTC (permalink / raw)
To: help-gnu-emacs
On Wed, Mar 17 2021, Stefan Monnier wrote:
>>> But be aware that these approaches still won't work well if the [...]
>>> thing can span multiple lines.
>> Hmm, I'm afraid that is indeed possible. Perhaps I'll keep it simple and only
>> font-lock the `B` part, since it can't contain any spaces and is fairly easy to
>> do.
>
> Can there be [...] without any `aBc`s in them and which need to be
> font-locked differently?
There are. What I'm really trying to achieve is fontifying Pandoc-style
citations in Markdown files. There are several other Markdown elements that use
brackets, so Harald's suggestion to use the entire `[aBc` as an anchor
is probably less error-prone.
> If not (and if it shouldn't span too many lines), then I think
> a MATCH-ANCHORED that just matches on the opening "[" and then uses the
> PRE-MATCH-FORM to return the position of the matching "]" should work
> acceptably, as long as you use `font-lock-multiline`.
Even if they may occasionally span two lines, I assume citations will rarely
span more than that, so font-lock-multiline may be an option (especially given
that `markdown-mode.el` uses it as well).
--
Joost Kremers
Life has its moments
* Re: Font lock question
From: Harald Jörg @ 2021-03-17 16:31 UTC (permalink / raw)
To: Joost Kremers; +Cc: help-gnu-emacs
Joost Kremers <joostkremers@fastmail.fm> writes:
> On Wed, Mar 17 2021, Stefan Monnier wrote:
>> [...]
>> Can you clarify what you mean by "does not work"?
>
> Of course. :-) What I mean is that in my test case, e.g., with the following
> text in the buffer:
>
> [aBc; aBc; aBc]
>
> only the final `aBc` has the right font-lock faces applied to it. The first two
> occurrences of `aBc` have no special face.
>
> I'm testing with the following:
>
> ```
> (font-lock-add-keywords nil
> '(("\\[\\(?:\\(?1:[[:alnum:]]*?\\)\\(?2:B\\)\\(?3:[[:alnum:]]*?\\)[;[:blank:]]*?\\)+\\]"
> (1 font-lock-warning-face)
> (2 font-lock-keyword-face)
> (3 font-lock-warning-face))))
> ```
I had a similar case recently, and also failed to solve it with one
regular expression: After all, the match (including the closing "]")
succeeds only once, so there's no more than one capture group for 1, 2,
and 3 each.
My solution was to combine a rule for [aBc] with a rule of type
(MATCHER . ANCHORED-HIGHLIGHTER): The anchor is the opening "[" and the
first "aBc", and the anchored matcher matches a semicolon and, as a
group, another "aBc".
I just saw that Stefan said this would work, so I'll give it a try:
```
(font-lock-add-keywords
 nil
 '(("\\[\\(?1:[[:alnum:]]*?\\)\\(?2:B\\)\\(?3:[[:alnum:]]*\\)[];[[:blank:]]*?"
    (1 font-lock-warning-face)
    (2 font-lock-keyword-face)
    (3 font-lock-warning-face))
   ("[[:alnum:]]*B[[:alnum:]]*"
    (";[[:blank:]]*\\(?1:[[:alnum:]]*\\)\\(?2:B\\)\\(?3:[[:alnum:]]*\\)"
     nil nil
     (1 font-lock-warning-face)
     (2 font-lock-keyword-face)
     (3 font-lock-warning-face)))))
```
--
Cheers,
haj
* Re: Font lock question
From: Joost Kremers @ 2021-03-18 19:53 UTC (permalink / raw)
To: help-gnu-emacs
On Wed, Mar 17 2021, Harald Jörg wrote:
> My solution was to combine a rule for [aBc] with a rule of type
> (MATCHER . ANCHORED-HIGHLIGHTER): The anchor is the opening "[" and the
> first "aBc", and the anchored matcher matches a semicolon and, as a
> group, another "aBc".
Ah, OK. I think I understand that. :-)
> I just saw that Stefan said this would work, so I'll give it a try:
> ```
> (font-lock-add-keywords
> nil
> '(("\\[\\(?1:[[:alnum:]]*?\\)\\(?2:B\\)\\(?3:[[:alnum:]]*\\)[];[[:blank:]]*?"
> (1 font-lock-warning-face)
> (2 font-lock-keyword-face)
> (3 font-lock-warning-face))
> ("[[:alnum:]]*B[[:alnum:]]*"
> (";[[:blank:]]*\\(?1:[[:alnum:]]*\\)\\(?2:B\\)\\(?3:[[:alnum:]]*\\)"
> nil nil
> (1 font-lock-warning-face)
> (2 font-lock-keyword-face)
> (3 font-lock-warning-face)))))
> ```
Thanks, that actually seems to work! I'll see if I can adapt it to my actual
use-case.
--
Joost Kremers
Life has its moments
* Font lock question
From: Shaun Johnson @ 2008-05-28 10:34 UTC (permalink / raw)
To: GNU Emacs Help
Hi,
I can define a mode to render colour names on an appropriate background like so:
(define-derived-mode sj-colour-words-mode nil "Colour Words"
  "Colour words appropriately."
  (setq font-lock-defaults
        '((("blue" 0 '(face (fixed-pitch (:background "blue"))))
           ("red" 0 '(face (fixed-pitch (:background "red"))))
           ("green" 0 '(face (fixed-pitch (:background "green")))))
          t t)))
These elements of font-lock-keywords correspond to the form described as
(MATCHER . SUBEXP-HIGHLIGHTER) in section '23.6.2 Search Based Fontification' in edition
2.9 of the Elisp manual for Emacs 22.2. However, I have totally failed to make the form
described as (MATCHER . FACESPEC), where FACESPEC is a list of the form
(face FACE PROP1 VAL1...), work.
I'm sure I'm just misunderstanding something and would be happy if someone could show me
a working example of this type of element.
Thanks in advance,
Shaun Johnson.
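One possible reading of the manual, offered as a sketch rather than a confirmed answer: in (MATCHER . FACESPEC), FACESPEC is an expression that font-lock evaluates, so it can be a variable whose value is the (face FACE PROP1 VAL1...) list. The names `sj-blue-spec` and `sj-colour-props-mode` and the `help-echo` text below are made up for illustration.

```elisp
;; Hypothetical variable: its VALUE is the (face FACE PROP1 VAL1...) list.
(defvar sj-blue-spec
  '(face (fixed-pitch (:background "blue"))
         help-echo "a colour name")
  "Face specification carrying an extra text property.")

(define-derived-mode sj-colour-props-mode nil "Colour Props"
  "Colour the word \"blue\" and attach a tooltip to it."
  ;; Ask font-lock to remove `help-echo' again when it refontifies.
  (setq-local font-lock-extra-managed-props '(help-echo))
  (setq font-lock-defaults
        '((("blue" . sj-blue-spec))     ; (MATCHER . FACESPEC)
          t t)))
```

Writing the list inline in the dotted form, as in `("blue" . '(face ...))`, may not be treated the same way, which could be why the attempts described above failed; going through a variable sidesteps that question.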