Richard Stallman wrote:
> believe that I found a solution that does the right thing in most cases
> and will send it to you in the next days.
>
> Could you describe in words what it does?

Attached find a file called `lisp-font-lock-regexp.el' which contains all changes I propose. You may try to load it, adjust the face definitions to your requirements, and see whether it works. Syntax highlighting and decoration for lisp-font-lock-keywords-2 must be activated. Eventually someone would have to decide on appropriate names and defaults for the faces.

I have set regexp highlighting to the minimum level 1. If this were incorporated in font-lock.el, the standard level should be 0 - which means no regexp highlighting and thus no obtrusiveness. Emacs would behave as it did before the introduction of regexp highlighting a couple of weeks ago. Level 1 does regexp highlighting as introduced recently, with some minor bug fixes. Levels 2 and 3 should do something that was proposed in font-lock.el but commented out due to problems with an "unbreakable endless loop". Level 2 does this for regexp groups on a single line only. Level 3 should handle regexp groups spanning several lines as well. By no means should the default level be 3, as will become evident from the remarks below.

The variable `lisp-font-lock-regexp' can be used to set the default level. Individual buffer settings can be made with the command `lisp-font-lock-regexp'.

Levels 2 and 3 use the syntax-table property to remove parenthesis syntax from unescaped parentheses and escaped brackets within regexp groups. I added syntax-table to `font-lock-extra-managed-props' since I don't want font-lock to perform the extra syntactic fontification pass. This idea is non-standard and could be defeated by anyone who removed syntax-table from that list - so far no one seems to use syntax-table properties in elisp-mode.
With that property paren-matching/blinking and forward/backward-sexp should work "as intended" within parenthetical groups. You may have noticed my simple-minded posting on emacs-pretest-bug about forward-sexp not being able to handle unescaped semicolons within strings. I resolved the problem by setting the syntax-table property of `;' to punctuation within regexp groups. For a similar reason I reset the escape syntax property of single backslashes preceding parentheses and brackets.

I do not treat special characters "as ordinary ones if they are in contexts where their special meanings make no sense". Hence, subexpressions like

  \\(\\[[^]]*]\\)*  in `reftex-extract-bib-entries-from-thebibliography'
  \\(\\[[^\\]]*\\]\\)?  in `reftex-all-used-citation-keys'
  \\`\\(\\\\[sS]?.\\|\\[\\^?]?[^]]*]\\|[^\\]\\)  in `gnus-score-regexp-bad-p'
  \\(\[[0-9]+\] \\)*  in `gud-jdb-marker-filter'

do contain mismatches.

With level 3 highlighting I'm using the font-lock-multiline property. Apparently this property is used by `smerge.el' too. Consequently, I cannot simply reset the variable `font-lock-multiline' to nil when I switch to a lower level. I believe that this variable - and the variable `parse-sexp-lookup-properties' as well - should be handled in a way similar to hooks or `buffer-invisibility-spec'. Anyone who wants to set one of these variables should add their name to a corresponding list, and remove that name again when they no longer need the variable set. Routines checking the value of the variable would not be affected by this convention.

Likely font-lock-multiline, syntax-table and `lisp-font-lock-regexp' prefixed properties should be added to `yank-excluded-properties' too.

I've been experimenting a bit with level 3 highlighting. With a 200MHz PC the results are negative: fontifying a buffer is moderately slow, and modifying text is barely tolerable. With a 1GHz PC I did not encounter substantial difficulties, with one exception - fontifying `cperl-init-faces' took a couple of seconds.

I tried to look a bit closer at what's going on. When I scrolled down through `cperl.el' and looked at what font-lock is doing, I found that the range from position 168761 to 172839 gets fontified no less than _seven_ times in sequence: Apparently `xdisp.c' - encountering an unfontified object at a position START - asks `jit-lock-function' to fontify from position START. jit-lock-function now calls `jit-lock-fontify-now' to fontify from START to (+ START jit-lock-chunk-size). The latter sets the fontified property for this region to t. `font-lock-default-fontify-region' detects that there is a font-lock-multiline pattern and fontifies the entire region from beginning to end of the pattern - the 168761 to 172839 region above - but does not set the fontified property for this region. I simply inserted `(put-text-property beg end 'fontified t)' in the text of `font-lock-default-fontify-region' right before it calls `font-lock-unfontify-region' and the problem disappeared.

When I change some text within a font-lock-multiline pattern of `cperl-init-faces', font-lock refontifies the entire area twice, which can take a couple of seconds. What happens here? The first refontification is triggered by redisplay, which encounters an unfontified thing it should display (the thing was unfontified by `jit-lock-after-change' previously). The second refontification is eventually triggered by `jit-lock-context-fontify', which unfontifies everything from `jit-lock-context-unfontify-pos' until point-max. However, the second refontification is useless because font-lock-default-fontify-region already took care of the font-lock-multiline pattern. Moreover, the second fontification usually occurs right after the first has finished, _before_ I am able to enter the next character. I could resolve this by having font-lock-default-fontify-region fontify a region iff it has not fontified exactly that region already since the last modification of the buffer.

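The "iff it has not fontified exactly that region already" check could be sketched roughly as below. This is only an illustration and is not taken from the attached file: the `sketch-' names are hypothetical, and it assumes an Emacs that provides `defvar-local' and `buffer-chars-modified-tick' (a counter that advances on text changes but not on text property changes).

```elisp
;; Hypothetical helpers; nothing here is part of font-lock.el or jit-lock.el.
(defvar-local sketch-last-fontified nil
  "List (BEG END TICK) describing the most recent fontification, or nil.")

(defun sketch-region-already-fontified-p (beg end)
  "Return non-nil if exactly BEG..END was fontified since the last text change."
  (equal sketch-last-fontified
         (list beg end (buffer-chars-modified-tick))))

(defun sketch-note-region-fontified (beg end)
  "Record that BEG..END was fontified for the current buffer contents."
  (setq sketch-last-fontified
        (list beg end (buffer-chars-modified-tick))))
```

A fontification routine would call `sketch-region-already-fontified-p' on the (already extended) multiline region before doing any work, and `sketch-note-region-fontified' afterwards. Adding faces does not invalidate the record, since only character changes advance the tick.
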
But font-lock-multiline patterns do not seem suited for handling this problem anyway. Patterns spanning more than a couple of lines - your mileage may vary - will delay redisplay because inserting one single character triggers refontification of the _entire_ pattern. It should be possible to resolve this problem by using the `jit-lock-defer-multiline' property. However, the latter is broken.

Suppose I used jit-lock-defer-multiline instead of font-lock-multiline for my pattern. Inserting a character now will not delay redisplay anymore, since font-lock-default-fontify-region does not cater for jit-lock-defer-multiline. Eventually, jit-lock-context-fontify will unfontify the relevant parts of my buffer from the start of the pattern to point-max, and everything should get fontified correctly. It does not, however, when the jit-lock-defer-multiline pattern starts _before_ `window-start': After jit-lock-context-fontify has unfontified the buffer, redisplay - for some reason I did not investigate - intercepts this by fontifying the _visible_ part of the buffer without caring about my pattern. Eventually, the invisible parts get refontified but the already fontified part doesn't because, as mentioned before, font-lock-default-fontify-region does not know about jit-lock-defer-multiline patterns. Hence, fontification appears incorrect.

I'm afraid there are no simple patches for this. Hence I provided the appropriate warnings that level 3 highlighting should be used with sufficient care.

The feature I propose could be quite useful for people who write regular expressions only occasionally, and I don't want to compromise it on account of the recent controversies about the font-lock-comment-delimiter and font-lock-negation-char-face faces. On the other hand, I don't want to give a pretext to anyone who plans to introduce yet another feature in the pre-release phase. Hence if you think that this should be delayed or cancelled, please tell me so.

I've also experimented with a patch of `show-paren-function' where I overlay the backslashes in `\\(...\\)' groups with the number of the respective group. Hence I don't have to literally step through such pairs when searching for the subexpressions referenced by match-string, match-beginning, ...
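
The overlay idea can be illustrated with a small stand-alone sketch. It is not the `show-paren-function' patch itself: the function name is hypothetical, it scans a region of Lisp source rather than hooking into paren matching, and it assumes that group delimiters are written with 2, 6, 10, ... backslashes before the parenthesis. It does not check that it is inside a string, and it ignores character classes and explicitly numbered groups.

```elisp
;; Hypothetical helper, not part of paren.el or font-lock.el.
(defun sketch-number-regexp-groups (beg end)
  "Label regexp group delimiters between BEG and END with their group number.
Return the list of overlays created, in buffer order."
  (let ((count 0) (open '()) (overlays '()))
    (save-excursion
      (goto-char beg)
      ;; A run of backslashes followed by an opening or closing paren.
      (while (re-search-forward "\\(\\\\+\\)\\([()]\\)" end t)
        ;; 4, 8, ... backslashes mean a literal backslash before a plain paren.
        (when (= (mod (- (match-end 1) (match-beginning 1)) 4) 2)
          (let ((num (if (equal (match-string 2) "(")
                         ;; Shy groups get no number; push nil for them
                         ;; so that the matching closer is skipped too.
                         (car (push (unless (looking-at-p "\\?:")
                                      (setq count (1+ count)))
                                    open))
                       (pop open))))
            (when num
              ;; Cover the two backslashes right before the paren.
              (let ((ov (make-overlay (- (match-end 1) 2) (match-end 1))))
                (overlay-put ov 'display (number-to-string num))
                (push ov overlays)))))))
    (nreverse overlays)))
```

Calling `(sketch-number-regexp-groups (point-min) (point-max))' in a buffer of Lisp source displays the group number in place of the backslashes of each numbered group; `delete-overlay' on the returned overlays removes the labels again.
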