After writing my original email I thought about something a bit different, and I managed (with suggestions and help from Anders Lindgren) to write a convincing (to me :) proof of concept. The idea is to use a separate buffer to do the fontification. I've attached the code; after loading it, it's enough to run

    (font-lock-add-keywords
     nil '(("^ *>>> \\(.*\\)" (0 (indirect-font-lock-highlighter 1 'python-mode)))))

Stefan (and emacs-devel!), do you think I should add this to ELPA? Are there downsides I should be aware of?

Cheers,
Clément.

On 2016-10-15 11:19, Clément Pit--Claudel wrote:
> Hi emacs-devel,
>
> Some languages have a way to quote code in comments. Some examples:
>
> * Python
>
>     def example(foo, *bars):
>         """Foo some bars
>
>         >>> example(1,
>         ...         2,
>         ...         3)
>         3
>
>         >>> example(4, 8)
>         67
>         """
>
> * Coq
>
>     Definition example foo bars :=
>       (* [example foo bars] uses [foo] to foo some [bars]. For example:
>          <<
>          Compute (example 1 [2, 3]).
>          (* 3 *)
>          >> *)
>
> In Python, ‘>>>’ indicates a doctest (a small bit of example code). In Coq, ‘[…]’ and ‘<<…>>’ serve as markers (inside of comments) of single-line (resp. multi-line) code snippets. At the moment, Emacs doesn't highlight these snippets. I originally asked about this in http://emacs.stackexchange.com/questions/19998/code-blocks-in-font-lock-comments , but received no answers.
>
> There are multiple currently-available workarounds, but none of those I know of is satisfactory:
>
> * Duplicate all font-lock rules, creating anchored matchers that recognize code in comments. The duplication is very unpleasant, and it will require adding ‘prepend’ to a bunch of font-lock rules, which will break some of them.
>
> * Use a custom syntax-propertize-function to recognize these code snippets and escape out of strings. This has some potential, but it confuses existing tools.
> For example, in Python, one can do the following; it works fine for ‘>>>’ in comments, but in strings it seems to break eldoc, among others:
>
>     syntax-ppss()
>     python-util-forward-comment(1)
>     python-nav-end-of-defun()
>     python-info-current-defun()
>     (let ((current-defun (python-info-current-defun))) (if current-defun (progn (format "In: %s()" current-defun))))
>
>     (defconst litpy--doctest-re
>       "^#*\\s-*\\(>>>\\|\\.\\.\\.\\)\\s-*\\(.+\\)$"
>       "Regexp matching doctests.")
>
>     (defun litpy--syntax-propertize-function (start end)
>       "Mark doctests in START..END."
>       (goto-char start)
>       (while (re-search-forward litpy--doctest-re end t)
>         (let* ((old-syntax (save-excursion (syntax-ppss (match-beginning 1))))
>                (in-docstring-p (eq (nth 3 old-syntax) t))
>                (in-comment-p (eq (nth 4 old-syntax) t))
>                (closing-syntax (cond (in-docstring-p "|") (in-comment-p ">")))
>                (reopening-syntax (cond (in-docstring-p "|") (in-comment-p "<")))
>                (reopening-char (char-after (match-end 2)))
>                (no-reopen (eq (and reopening-char (char-syntax reopening-char))
>                               (cond (in-comment-p ?>)))))
>           (when closing-syntax
>             (put-text-property (1- (match-end 1)) (match-end 1)
>                                'syntax-table (string-to-syntax closing-syntax))
>             (when (and reopening-char (not no-reopen))
>               (put-text-property (match-end 2) (1+ (match-end 2))
>                                  'syntax-table (string-to-syntax reopening-syntax)))))))
>
> Maybe the second approach can be made to more-or-less work for Python, despite the issue above; I'm not entirely sure. The idea there is to detect chunks of code, and mark their starting and ending characters in a way that escapes from the surrounding comment or string.
>
> But this doesn't solve the problem for Coq, for example, because it confuses comment-forward and the like.
> Some Coq tools depend on Emacs to identify comments and skip over them when running a file. Code is sent bit by bit, so if ‘(* foo [some code here] bar *)’ is annotated with syntax properties to make Emacs think that it should be understood as ‘(* foo *) some code here (* bar *)’, then Proof General (a Coq IDE based on Emacs) won't realize that “some code here” is part of a comment, and things will break.
>
> I'm not sure what the right approach is. I guess there are two approaches:
>
> * Mark embedded code in comments as actual code using syntax-propertize-function, and add a way for tools to detect this "code but not really code" situation. Pros: things like company, eldoc, prettify-symbols-mode, etc. will work in embedded code comments without having to opt them in. Cons: some things will break, and will need to be fixed (comment-forward, Proof General, Elpy, indentation functions…).
>
> * Add new "code block starter"/"code-block-ender" syntax classes? Then font-lock would know that it has to highlight these. Pros: few things would break. Cons: Tools would have to be opted-in (company-mode, eldoc, prettify-symbols-mode, …).
>
> Am I missing another obvious solution? Has this topic been discussed before?
>
> Cheers,
> Clément.
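
For readers without the attachment, here is a minimal sketch of the separate-buffer idea mentioned at the top of this message. It is not the attached code: the three `my-` functions are hypothetical names made up for illustration, and the sketch assumes Emacs 25 or later (for `font-lock-ensure`). The snippet is copied into a temporary buffer, fontified there by the target major mode, and the resulting `face` properties are copied back onto the original text.

```elisp
(defun my-fontify-string-as (text mode)
  "Return a copy of TEXT carrying the `face' properties that MODE assigns.
Hypothetical helper: fontification happens in a temporary buffer."
  (with-temp-buffer
    (insert text)
    ;; Set up MODE without running user hooks, then fontify everything.
    (delay-mode-hooks (funcall mode))
    (font-lock-ensure)
    (buffer-string)))

(defun my-copy-faces (fontified start)
  "Copy `face' properties from string FONTIFIED to the current buffer.
The first character of FONTIFIED corresponds to buffer position START."
  (let ((pos 0)
        (len (length fontified)))
    (while (< pos len)
      (let ((next (next-single-property-change pos 'face fontified len))
            (face (get-text-property pos 'face fontified)))
        ;; Only overwrite where the other mode actually assigned a face.
        (when face
          (put-text-property (+ start pos) (+ start next) 'face face))
        (setq pos next)))))

(defun my-embedded-code-highlighter (group mode)
  "Fontify match GROUP as MODE would; always return nil.
Meant to be called from the face form of a font-lock keyword."
  (let ((start (match-beginning group))
        (end (match-end group)))
    (when (and start (< start end))
      ;; Fontifying in another buffer clobbers the match data.
      (save-match-data
        (my-copy-faces
         (my-fontify-string-as (buffer-substring-no-properties start end)
                               mode)
         start))))
  nil)
```

Under these assumptions it would be hooked up the same way as the form at the top of this message, with `my-embedded-code-highlighter` in place of `indirect-font-lock-highlighter`. Starting a major mode for every match is wasteful; a real implementation would presumably reuse one buffer per mode.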