unofficial mirror of emacs-devel@gnu.org 
* font-lock-syntactic-keywords: evaluating arbitrary elisp inside matchers?
From: immerrr again... @ 2012-09-21 20:48 UTC (permalink / raw)
  To: emacs-devel

Hi all

I'm hacking on lua-mode in my spare time, and one thing that has
bothered me a lot is Lua's long bracket constructs. For those who don't
know what I'm talking about, here's a short recap:

. A _long bracket of level N_ consists of two square brackets with N
   equals signs between them, N >= 0.

. An _opening long bracket_ has two _opening square brackets_, e.g. "[[",
   "[=[", "[===[".

. A _closing long bracket_ has two _closing square brackets_, e.g. "]]",
   "]=]", "]===]".

. A _long string_ starts with an opening long bracket of any level and
   ends at the first closing long bracket of the same level.

. A comment starts with a double hyphen "--" anywhere outside a string.
   If the text immediately after "--" is not an opening long bracket, the
   comment is a _short comment_, which runs until the end of the line.
   Otherwise, it is a _long comment_, which runs until the corresponding
   closing long bracket.
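
The bracket rules above can be sketched in a few lines of Emacs Lisp.
This is only an illustration that works on strings, not buffer text;
the `my-lua-...' names are hypothetical and not part of lua-mode.

```elisp
;; Hypothetical helpers illustrating the long bracket rules; not lua-mode code.

(defun my-lua-long-bracket-level (str)
  "Return the level N if STR starts with an opening long bracket, else nil."
  (when (string-match "\\`\\[\\(=*\\)\\[" str)
    (length (match-string 1 str))))

(defun my-lua-long-bracket-end (str)
  "Return the index just past the closing long bracket of the same level
as the opening long bracket at the start of STR, or nil if there is none."
  (let ((level (my-lua-long-bracket-level str)))
    (when level
      ;; The closer has exactly LEVEL equals signs between two "]".
      (let ((closer (concat "\\]" (make-string level ?=) "\\]")))
        ;; Start searching right after the opener (LEVEL + 2 characters).
        (when (string-match closer str (+ level 2))
          (match-end 0))))))

(my-lua-long-bracket-end "[[ string ]] code") ; => 12
```

Building the closer from the level is what makes a "]=]" inside a
level-3 string harmless: it simply does not match "]===]".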

Here are some characteristic situations for the rules above:

. code [[ string ]] code

. code -- comment till EOL

. code --[[ comment ]] code

. code -- [[ comment till EOL ]]
   because '--' is followed by space

. code ---[[ comment till EOL ]]
   because '--' is followed by '-'

. code [===[ string   ]=] string --[=[ string ]===] code

. code [===[ string ]===]  code  --[=[ comment ]=] code
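
The comment cases above boil down to whether an opening long bracket
follows "--" immediately. A minimal sketch, with a hypothetical
function name and operating on a string that starts at the comment:

```elisp
;; Hypothetical helper; classifies a comment by what follows the "--".

(defun my-lua-comment-kind (str)
  "Return `long' or `short' if STR starts with \"--\", else nil."
  (cond
   ;; "--" immediately followed by an opening long bracket of any level.
   ((string-match "\\`--\\[=*\\[" str) 'long)
   ;; Any other "--", including "-- [[" and "---[[".
   ((string-match "\\`--" str) 'short)))

(my-lua-comment-kind "--[[ comment ]] code")       ; => long
(my-lua-comment-kind "-- [[ comment till EOL ]]")  ; => short
(my-lua-comment-kind "---[[ comment till EOL ]]")  ; => short
```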

Obviously, Emacs character syntax flags are not enough to describe that.
Currently, I'm trying to use `font-lock-syntactic-keywords' with the
following rule to capture both long comments and long strings:

1.    `(,(rx
2.        (or (seq (or line-start (not (any "-")))
3.                 (group-n 1 "-") "-[" (group-n 5 (0+ "=")))
4.            (seq (group-n 3 "[")      (group-n 6 (0+ "="))))

5.        "[" (minimal-match (0+ anything)) "]"

6.        (or (seq (backref 5) (group-n 2 "]"))
7.            (seq (backref 6) (group-n 4 "]"))))

8.     (1 "!" nil t) (2 "!" nil t)
9.     (3 "|" nil t) (4 "|" nil t))

The construct is probably not obvious, so I'll elaborate a little bit.

Lines 2-4 match the opening bracket with an optional double dash and
equals signs. Line 2 makes sure that runs of three or more consecutive
dashes are left unmatched and are interpreted as a short comment. Note
also that the sequences of equals signs are captured in separate groups
for comments and strings (5 and 6); we'll get to them later.

Line 5 matches inner square brackets and the content of the literal.

Lines 6 and 7 match the closing equals signs and square bracket. Which
alternative matches follows from the fact that if an optional group
doesn't match, its backref won't match either. So, if it's a long
comment, the regexp will match groups 1 and 5 for the opening bracket
(even if 5 is empty) and 2 for the closing one, since backref 6 won't
match. If it's a long string, it's the other way around: groups 3, 4
and 6 will match.
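
The backref behaviour this relies on can be checked in isolation. The
toy regexps below are hypothetical and have nothing to do with Lua;
they only show that a backref to a group that took no part in the
match fails, while a backref to a group that matched the empty string
succeeds.

```elisp
(require 'rx)

;; Toy pattern: group 1 takes part in the match only via the "a" branch.
(defconst my-backref-demo-re
  (rx string-start
      (or (group-n 1 "a") "b")
      (backref 1))
  "Hypothetical demo regexp, unrelated to lua-mode.")

;; Toy pattern: group 1 can match the empty string.
(defconst my-backref-empty-re
  (rx string-start
      (group-n 1 (0+ "="))
      (backref 1)
      "x")
  "Hypothetical demo regexp, unrelated to lua-mode.")

(string-match my-backref-demo-re "aa")  ; => 0, group 1 matched "a"
(string-match my-backref-demo-re "bb")  ; => nil, group 1 never matched
(string-match my-backref-empty-re "x")  ; => 0, group 1 matched ""
```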

Kudos to those who made it to this point, here go the questions:

. Is it actually worthwhile to optimize propertizing this way, or would
   two separate rules perform just as well?

. The obvious simplification would be to match both brackets with single
   captures and choose the proper syntax flag programmatically depending
   on whether there's a leading double dash. The documentation says
   that the SYNTAX component of MATCH-HIGHLIGHT can be "an expression
   whose value is such a form"; can I leverage that here?
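
To make the second question concrete, here is a rough sketch of what
such an expression could look like for the opening bracket only. All
`my-lua-...' names are hypothetical. It assumes that font-lock
evaluates a SYNTAX form that is a list not starting with a number, and
it leaves out both the guard against a third leading dash and the
closing bracket.

```elisp
(require 'rx)

;; Hypothetical opener regexp with a single capture: group 1 is the first
;; character of the construct, either the first "-" or the first "[".
(defconst my-lua-long-open-re
  (rx (or (seq (group-n 1 "-") "-[")   ; long comment opener
          (group-n 1 "["))             ; long string opener
      (0+ "=") "[")
  "Sketch only; does not reject a third leading dash.")

(defun my-lua-long-bracket-syntax ()
  "Pick a syntax descriptor from the current match data.
Return generic comment syntax if group 1 is a dash, else generic string."
  (if (eq (char-after (match-beginning 1)) ?-)
      (string-to-syntax "!")
    (string-to-syntax "|")))

;; Hypothetical value for `font-lock-syntactic-keywords': SYNTAX is an
;; expression rather than a literal string.
(defvar my-lua-syntactic-keywords
  `((,my-lua-long-open-re (1 (my-lua-long-bracket-syntax)))))
```

Reusing group number 1 in both alternatives keeps the rule down to one
MATCH-HIGHLIGHT entry; the helper then inspects the matched character
instead of checking which group took part.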

--
Cheers, immerrr



