From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Stefan Monnier <monnier@iro.umontreal.ca>
Newsgroups: gmane.emacs.devel
Subject: Re: Last steps for pretesting (font-lock-extend-region-function)
Date: Mon, 24 Apr 2006 17:06:24 -0400
Message-ID: <873bg2g1fw.fsf-monnier+emacs@gnu.org>
References: <Pine.LNX.3.96.1060424160818.515B-100000@acm.acm>
NNTP-Posting-Host: main.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Trace: sea.gmane.org 1145912814 11820 80.91.229.2 (24 Apr 2006 21:06:54 GMT)
X-Complaints-To: usenet@sea.gmane.org
NNTP-Posting-Date: Mon, 24 Apr 2006 21:06:54 +0000 (UTC)
Cc: Richard Stallman <rms@gnu.org>, emacs-devel@gnu.org
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Mon Apr 24 23:06:50 2006
Return-path: <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Envelope-to: ged-emacs-devel@m.gmane.org
Original-Received: from lists.gnu.org ([199.232.76.165])
	by ciao.gmane.org with esmtp (Exim 4.43)
	id 1FY8Gc-0003L1-LA
	for ged-emacs-devel@m.gmane.org; Mon, 24 Apr 2006 23:06:47 +0200
Original-Received: from localhost ([127.0.0.1] helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43)
	id 1FY8Gc-0000xc-60
	for ged-emacs-devel@m.gmane.org; Mon, 24 Apr 2006 17:06:46 -0400
Original-Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
	id 1FY8GR-0000xV-9D
	for emacs-devel@gnu.org; Mon, 24 Apr 2006 17:06:35 -0400
Original-Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
	id 1FY8GQ-0000xJ-Dz
	for emacs-devel@gnu.org; Mon, 24 Apr 2006 17:06:34 -0400
Original-Received: from [199.232.76.173] (helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1FY8GQ-0000xA-66
	for emacs-devel@gnu.org; Mon, 24 Apr 2006 17:06:34 -0400
Original-Received: from [209.226.175.93] (helo=tomts36-srv.bellnexxia.net)
	by monty-python.gnu.org with esmtp (Exim 4.52)
	id 1FY8Ic-0005RE-5l; Mon, 24 Apr 2006 17:08:50 -0400
Original-Received: from alfajor ([70.55.144.118]) by tomts36-srv.bellnexxia.net
	(InterMail vM.5.01.06.13 201-253-122-130-113-20050324) with ESMTP
	id <20060424210624.TMXU13653.tomts36-srv.bellnexxia.net@alfajor>;
	Mon, 24 Apr 2006 17:06:24 -0400
Original-Received: by alfajor (Postfix, from userid 1000)
	id 9FE42D7855; Mon, 24 Apr 2006 17:06:24 -0400 (EDT)
Original-To: Alan Mackenzie <acm@muc.de>
In-Reply-To: <Pine.LNX.3.96.1060424160818.515B-100000@acm.acm> (Alan
	Mackenzie's message of "Mon, 24 Apr 2006 19:28:44 +0000 (GMT)")
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.0.50 (gnu/linux)
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/pipermail/emacs-devel>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=subscribe>
Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Xref: news.gmane.org gmane.emacs.devel:53355
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/53355>

>>> What is your objection to f-l-extend-region-f in a-c-f?  Is it because it
>>> might gobble too much processor time?

>> That and the fact that it's not necessary, and that to use it (as seen
>> in your example) you need a good bit more work/code than doing it "the
>> other way" (with code solely run from font-lock-default-fontify-region).

> It will need an order of magnitude less work for f-l-extend-region-f than
> for f-l-multiline properties.

Are you talking about CPU-work or programmer work?
I was talking about programmer work, and I've shown you the code for
font-lock-multiline: it's quite a bit shorter and more straightforward than
your hack using b-c-f and a-c-f: just match the multiline element, place
a font-lock-multiline property on it, and you're set.

> At least, it will for those hackers who aren't totally familiar with the
> internals of font-lock-keywords, etc.

You don't have to be familiar with the internals.  At least not more than
necessary to write the proper foo-before-change-function plus
foo-font-lock-lock-extend-region-after-change.

> f-l-e-r-f only needs to be applied at the extremes of the region.  f-l-m
> needs to be applied throughout the entire region.  I doubt very much
> whether f-l-m would use less processing power than f-l-e-r-f.

As I said, the important aspect is not how much time it takes each time it's
executed, but how many times it's executed.  In an after-change-function, it
risks being executed enough times to beomce visible to the user.

> Also, f-l-extend-region-function (called from f-l-a-c-f/j-l-a-c) will work
> EVERY time, because of its directness and simplicity.

Reality check?  Here's your code:

   (defvar c-awk-old-EOLL 0)
   (make-variable-buffer-local 'c-awk-old-EOLL)
   ;; End of logical line following the region which is about to be changed.
   ;; Set in c-awk-before-change and used in c-awk-font-lock-extend-region.
   
   (defun c-awk-before-change (beg end)
   ;; This function is called exclusively from the before-change-functions hook.
   ;; It does two things: Finds the end of the (logical) line on which END lies,
   ;; and clears c-awk-NL-prop text properties from this point onwards.
     (save-restriction
       (save-excursion
         (setq c-awk-old-EOLL (c-awk-end-of-logical-line end))
         (c-save-buffer-state nil
          (c-awk-clear-NL-props end (point-max))))))
   (add-hook 'before-change-functions c-awk-before-change nil t)
   
   (defun c-awk-end-of-change-region (beg end old-len)
     ;; Find the end of the region which needs to be font-locked after a change.
     ;; This is the end of the logical line on which the change happened, either
     ;; as it was before the change, or as it is now, whichever is later.
     ;; N.B. point is left undefined.
     (max (+ (- c-awk-old-EOLL old-len) (- end beg))
          (c-awk-end-of-logical-line end)))

Here's my alternative implementation:

   (defconst c-awk-font-lock-keywords
      '(...
        (".*\\\\\n.*"
         (0 (progn (put-text-property (match-beginning 0) (match-end 0)
                                      'font-lock-multiline t)
                   nil)))
        ...))

Now which one's more direct and simple, really?

> Using only f-l-m will fail in some circumstances.

Please backup such claims.

> In the most general situation, a regexp in f-l-keywords will be
> inadequate - a defun will be needed.  This will be at least as
> complicated as the one for f-l-extend-region-f.

So you mean in the most general case, it'll be no worse than your code.
Here, we agree ;-)

> Writing a f-l-extend-region-f does not require any detailed knowledge of
> font-lock-keywords - thus the two problems of determining the region to
> fontify and of specifying how to fontify it are kept separate, hence
> simplified.  Abusing f-l-keywords to set exotic text properties (as
> against its primary purpose, to set faces) requires contorted thinking.
> To write a f-l-e-r-f, one need only read a single page of the elisp
> manual.

And then one needs to think hard about how to actually write the code, what
are the relevant odd cases one has to worry about, where to store the info
you need from b-c-f to a-c-f, ... so much fun, indeed.

Before/after-change-functions are great hammers, but you know what they
say about hammers.

> It's me that's going to have to implement this mechanism for CC Mode (all
> modes other than AWK Mode).  It's going to be difficult enough as it is,
> without the added complications of the difficult to understand the structure
> font-lock-keywords.

Just take a deep breath, a couple of steps back, and try to look at the
example use of font-lock-multiline above as if it were very simple.

>>> It needs to be in j-l-fontify-now, so that that function knows what
>>> bytes to apply the 'fontified property to.

>> This is not needed for correctness.  It's only a matter of avoiding
>> redundant work.  This need wouldn't be new (it has existed since
>> Emacs-21 becaue of font-lock-multiline) and doesn't need to be addressed
>> right away.
> It might cause errors,

Again, please backup such claims.

> and it will certainly increase run time.

In some cases it will, yes.  Those cases tend to suffer from other
performance problems anyway, tho (i.e. they typically involve large regions
that need to be rehighlighted atomically, so even if you fix this
jit-lock-fontify-now performance misfeature the underlying performance
issue will still bite you and you'll end up having to use a different
solution for which the jit-lock-fontify-now problem won't apply either).
At least this was my experience with smerge-mode and it didn't seem
particularly specific to my situation.

> It isn't much effort to be correct, here.  Why not do it right?  (Yes, I'm
> prepared to do this patch, too.)

It's enough changes that I don't think it should go into Emacs-22.  I prefer
keeping the same performance misfeature that we already had in Emacs-21 and
release Emacs-22 a few days earlier.

But if you want to fix it, here's how a good fix should work IMNSHO:
- every function that get added to jit-lock-functions (such as
  font-lock-fontify-region) should be changed to return the region
  it actually jit'd.
- jit-lock-fontify-now should then take the intersection of the returned
  regions to determine where to set the `fontified' property.

>> What about the sample solution I provided: it's all manipulated from
>> font-lock-keywords (i.e. from font-lock-default-fontify-region)?

> It won't work in every case.  An after-change function (at the very
> least) will need to be added to your solution to make it robust.

Again, I see no evidence of this.  Back up your claim.

> Yes, absolutely.  There are so many special cases, font-lock-keywords has
> a complicated structure, it would be very easy to put the property on
> ".*\\\\$" rather than ".*\\\\\n", for example.

The only really important part where the font-lock-multiline property should
be added is the text that covers all the "backslash newlines" of
a multiline line.  Whether it includes the last ".*$" or the leading "^.*"
doesn't really matter.
It's not nearly as brittle as you think.

>>> 5. I suspect you think that f-l-extend-region-f is likely to consume
>>> more processor time than direct manipulation of the f-l-m property.

>> The problem is not what is done but when.  after-change-functions get run
>> more often than font-lock-fontify-region.

> No it doesn't.  font-lock-fontify-region gets run at every change (unless
> jit-lock-defer-time is non-nil - it's nil by default).

No, it's run at every redisplay.  For self-insert-command, it's usually the
same as "every change", but for many other commands it can be quite
different (or even for self-insert-command, if you trigger an abbrev
expansion or if you trigger auto-fill, ...).

"at every redisplay" basically means "at human pace", whereas "after every
change" means "at CPU pace".

> I say to you again - your solution is not robust.  I don't think it's
> been tried at all (correct me if I'm wrong).  A variant form of
> f-l-extend-region-f (the advice on the f-l a-c-functions, as implemented
> for AWK Mode) has been in use since June 2003, and so far no bugs related
> to it have been reported by users.

Without using a hook in font-lock-default-fontify-region you can't claim
it's robust: I can easily cook up a case where it fails.

> We are both aware of a refinement which is needed, namely calling some
> sort of f-l-extend-region-f from f-l-d-f-r.

It's not a refinement.  It's a requirement for correctness.


        Stefan