From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Steve Yegge <stevey@google.com>
Newsgroups: gmane.emacs.devel
Subject: Re: "Font-lock is limited to text matching" is a myth
Date: Tue, 11 Aug 2009 18:58:03 -0700
Message-ID: <9c768dc60908111858n5acf6b9fs43f7fef5c07ad272@mail.gmail.com>
References: <7b501d5c0908091634ndfba631vd9db6502db301097@mail.gmail.com>
	<200908101335.24002.danc@merrillprint.com>
	<e01d8a50908101104i5081852bh6ecc7d900d87d19e@mail.gmail.com>
	<87my67s8mr.fsf@randomsample.de>
	<e01d8a50908101351l1af03242o84513de67eaf46b2@mail.gmail.com>
	<1249942011.29022.15.camel@projectile.siege-engine.com>
	<e01d8a50908101519k75883081h1f8332b7807b7f49@mail.gmail.com>
	<1249955428.29022.186.camel@projectile.siege-engine.com>
	<9c768dc60908102347v57bdf38ara9fe2179f68c07e4@mail.gmail.com>
	<jwvtz0e5qfk.fsf-monnier+emacs@gnu.org>
NNTP-Posting-Host: lo.gmane.org
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary=0016e659f2a6470e6d0470e82681
X-Trace: ger.gmane.org 1250042361 11126 80.91.229.12 (12 Aug 2009 01:59:21 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Wed, 12 Aug 2009 01:59:21 +0000 (UTC)
Cc: Daniel Colascione <danc@merrillpress.com>,
	David Engster <deng@randomsample.de>,
	Daniel Colascione <danc@merrillprint.com>,
	Lennart Borgman <lennart.borgman@gmail.com>,
	Deniz Dogan <deniz.a.m.dogan@gmail.com>, emacs-devel@gnu.org,
	Leo <sdl.web@gmail.com>, Miles Bader <miles@gnu.org>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Wed Aug 12 03:59:13 2009
Return-path: <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Envelope-to: ged-emacs-devel@m.gmane.org
Original-Received: from lists.gnu.org ([199.232.76.165])
	by lo.gmane.org with esmtp (Exim 4.50)
	id 1Mb37M-00018C-KX
	for ged-emacs-devel@m.gmane.org; Wed, 12 Aug 2009 03:59:09 +0200
Original-Received: from localhost ([127.0.0.1]:46844 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43)
	id 1Mb37K-0001cl-RC
	for ged-emacs-devel@m.gmane.org; Tue, 11 Aug 2009 21:59:06 -0400
Original-Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
	id 1Mb36f-0001MF-Ld
	for emacs-devel@gnu.org; Tue, 11 Aug 2009 21:58:25 -0400
Original-Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
	id 1Mb36a-0001Jp-NN
	for emacs-devel@gnu.org; Tue, 11 Aug 2009 21:58:25 -0400
Original-Received: from [199.232.76.173] (port=58962 helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1Mb36Z-0001Jd-MH
	for emacs-devel@gnu.org; Tue, 11 Aug 2009 21:58:19 -0400
Original-Received: from smtp-out.google.com ([216.239.33.17]:28655)
	by monty-python.gnu.org with esmtps
	(TLS-1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.60)
	(envelope-from <stevey@google.com>)
	id 1Mb36Q-00015O-HS; Tue, 11 Aug 2009 21:58:11 -0400
Original-Received: from zps19.corp.google.com (zps19.corp.google.com [172.25.146.19])
	by smtp-out.google.com with ESMTP id n7C1w7ir022782;
	Wed, 12 Aug 2009 02:58:08 +0100
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=google.com; s=beta;
	t=1250042288; bh=BdrdLrQi/4AtxACI4YkdUIzc3pY=;
	h=DomainKey-Signature:MIME-Version:In-Reply-To:References:Date:
	Message-ID:Subject:From:To:Cc:Content-Type:X-System-Of-Record; b=y
	DKYfVydbdRqQaVmEZupxdTaM+4+fhWhkf7b2oiIziwY0bwK4SqHlLMz5wcmRDXDwnRQ
	48JuAO2LPTujARd29w==
DomainKey-Signature: a=rsa-sha1; s=beta; d=google.com; c=nofws; q=dns;
	h=mime-version:in-reply-to:references:date:message-id:subject:from:to:
	cc:content-type:x-system-of-record;
	b=Zg4PtenkzAjiS6MdxvePFDF/eoQwHlpC3izp7zaFLmQNmgRMA6rYFVycfFE6HCtLl
	nqXWBwv+ojzXWAGRN/1ig==
Original-Received: from gxk24 (gxk24.prod.google.com [10.202.11.24])
	by zps19.corp.google.com with ESMTP id n7C1w3TW023627;
	Tue, 11 Aug 2009 18:58:04 -0700
Original-Received: by gxk24 with SMTP id 24so5290566gxk.1
	for <multiple recipients>; Tue, 11 Aug 2009 18:58:03 -0700 (PDT)
Original-Received: by 10.90.181.8 with SMTP id d8mr294363agf.75.1250042283756; Tue, 11 
	Aug 2009 18:58:03 -0700 (PDT)
In-Reply-To: <jwvtz0e5qfk.fsf-monnier+emacs@gnu.org>
X-System-Of-Record: true
X-detected-operating-system: by monty-python.gnu.org: GNU/Linux 2.6 (newer, 3)
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/pipermail/emacs-devel>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=subscribe>
Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Xref: news.gmane.org gmane.emacs.devel:114118
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/114118>

--0016e659f2a6470e6d0470e82681
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

On Tue, Aug 11, 2009 at 9:04 AM, Stefan Monnier <monnier@iro.umontreal.ca> wrote:

>
> Bringing espresso-mode and js2-mode closer together would be good
> (e.g. by merging them into a single mode with customization options
> allowing to choose between different ways to do highlighting, imenu,
> etc...).


I've now had a chance to take a quick look at espresso-mode, and it
looks like it's most likely a better choice for inclusion in Emacs, assuming
it works well.  I haven't had a chance to use it for actual work yet, but I
will at the earliest opportunity.

js2-mode was a fairly quick/dirty mode wrapped around a parser that was
written to feed an interpreter, not to feed an editing mode.  espresso-mode
is clearly designed from the ground up to be a good Emacs mode for JS.

I like the idea of making js2-mode a minor mode that can supply parse
errors and warnings asynchronously if enabled.  That's its core strength in
any case.  If Daniel's amenable I'd be happy to start working with him in
that direction.


> Especially on the indentation side since it seems they share some of
> their history.
>

I'm guessing that indentation needs a lot more work to make it
configurable enough to satisfy the needs of companies like mine, where
the coding style guides would specify indentation rules in pixels/points
if they could get away with it.

It would be nice if someone would dive in and enhance cc-engine to
handle JavaScript constructs.  I may give it another go at some point
if nobody else does.



> > Daniel's objections to js2-mode's non-interaction with font-lock
> > apply equally to the non-interaction with cc-engine's indentation
> > configuration system.  The indent configuration for JavaScript should
> > share as many settings as practical with cc-mode.
>
> I'm not too fond of cc-mode's indentation code and configuration,
> actually, so I don't mind if js2-mode doesn't share anything with it
> (tho I won't oppose a change in this respect either).
>

I'd like to hear more about the objections.  I realize it's horribly
complex, but I've looked at other configurable indenters and they always
wind up being too complex for most users as well -- gigantic manuals,
custom minilanguages, the works.


>
> >   3) indentation in "normal" Emacs modes also runs synchronously as
> >      the user types.  Waiting 500-800 msec or more for the parse to
> >      finish is (I think) not acceptable for indentation.  For small
> >      files the parse time is acceptable, but it would not be generally
> >      scalable.
>
> Agreed.  Do you happen to know what other IDEs do about it?
>

Yes.  Eclipse has a fast, inaccurate parser that runs inline as you type,
and a slow, accurate one that lags behind by a few hundred ms.  (They
handle name resolution in two tiers as well, except that the fast resolver
runs alongside the slow parser, and the slow resolver can take minutes to
hours.)

Eclipse uses the fast/inaccurate parser for both fontification and
indentation.  The slower parser is used when (for instance) you want to
reformat a block of code -- something like a cross between indent-region
and fill-region for source code.  I'm not an Eclipse user myself, so I'm
not familiar with all the ins and outs, but this is the basic approach they
take.


> > Va) Inadequate/insufficient style names
>
> [ Putting on my functional programmer hat here. ]
> All you're saying here is that your languages have too many concepts.
>

Yes, well, of course.  But they're not _my_ languages now, are they?


>
> [ Putting on my Emacs maintainer hat again. ]
> highlighting should be about helping the user understand his code:
> highlighting every character with a different color is the way to
> get there.


I'm not 100% sure I follow this statement.  Do you mean "not the way"?


> You may want to help him find structure (e.g. make
> function/method declaration stand out), you may want to help him not get
> confused (highlight strings and comments differently), you may want to
> attract his attention to weird things (undeclared variables, ...), but
> I highly doubt that highlighting function parameters differently from
> local variables will help her in any way.
>

We don't know this, and in fact cannot know it a priori, since there are new
languages appearing all the time.  And templates with mixed languages
complicate things further.

I think it's best not to be in the business of dictating or even advising
taste.  We should focus on making things flexible enough for people to
make the distinctions they wish to make.


>
> This said, the set of default faces deserves a rethink as well as some
> additions, yes.
>
> > languages, not the intersection.  There should, for instance, be a
> > font-lock-symbol-face for languages with distinguished symbols such
> > as Lisp, Scheme and Ruby.
>
> What for?


I meant for quoted symbols -- e.g. I color 'foo, :foo and #'foo differently
in my elisp code.  They all have colors from the same region of RGB space,
but they're different enough that I can tell keyword args from
function-quoted symbols with just a glance from far away.  I find this
helpful.  Others' MMV.


>
> > Vf) No font-lock interface for setting exact style runs
> [...]
> > The problem is that I need a way, in a given font-lock redisplay, to
> > say "highlight the region from X to Y with text properties {Z}".
>
> I'm not sure I understand the problem.  What's wrong with
> put-text-property?
>

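font-lock biffs my properties: when it refontifies a region it first
clears the `face' property there, including faces I set by hand.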


>
> > When I assert that it's not possible, I understand that it's
> > _theoretically_ possible.  Given a JavaScript file with 2500 style
> > runs, assuming I had that information available at font-lock time, I
> > could return a matcher that contains 2500 regular expressions, each
> > one of which is tailored to match one and exactly one region in the
> > buffer.
>
> Just place in font-lock-keywords a MATCHER that is a function whose code
> walks the list of your "runs" (checking which of your runs are within
> (point) and LIMIT) and uses add-text-properties on them; and finally
> returns nil.


This is different from what Daniel advised, and neither approach is very
well documented (if at all).  I will try them both and report back when I
can.
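
For concreteness, here is a rough sketch of how I read Stefan's
suggestion.  It is untested against js2-mode, and the names
`my-style-runs', `my-apply-style-runs' and `my-style-runs-setup' are
made up for illustration; the (START END FACE) run format is also an
assumption, not something js2-mode produces today.

```elisp
(require 'font-lock)

;; Assumption: some parser fills this with (START END FACE) triples.
(defvar-local my-style-runs nil
  "Hypothetical list of (START END FACE) style runs for this buffer.")

(defun my-apply-style-runs (limit)
  "Function MATCHER for `font-lock-keywords'.
Apply every run in `my-style-runs' that overlaps the span from point
to LIMIT, clipping each run to that span, then return nil."
  (let ((beg (point)))
    (dolist (run my-style-runs)
      (let ((start (max beg (nth 0 run)))
            (end (min limit (nth 1 run))))
        (when (< start end)
          (add-text-properties start end (list 'face (nth 2 run)))))))
  ;; nil means "no match", so font-lock applies no highlighting of its
  ;; own for this keyword and moves on to the next one.
  nil)

(defun my-style-runs-setup ()
  "Install `my-apply-style-runs' in the current buffer's keywords."
  (font-lock-add-keywords nil '((my-apply-style-runs)) 'append))
```

The properties are set while font-lock is fontifying the span, after
it has cleared the old faces, so they should survive until the next
refontification of that span.  Walking the whole list on every call is
the naive part; a sorted vector with a binary search would be the
obvious next step for a file with thousands of runs.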


> > Vg) Lack of differentiation between mode- and minor-mode styles
> [...]
> > As far as I can tell, the officially supported mechanism for
> > adding additional font-lock patterns is `font-lock-add-keywords'.
> > This either appends or prepends the keywords to the defaults.
>
> Yes, this sucks.  It should be replaced by a more declarative interface.
>

A simple workaround for now might be to keep pointers to the originals
and the extras in an alist that remembers who added which keywords.
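
Something along these lines, as a minimal sketch only.  The variable
and function names are hypothetical; the only real font-lock calls are
`font-lock-add-keywords' and `font-lock-remove-keywords'.

```elisp
(require 'font-lock)

(defvar-local my-font-lock-keyword-owners nil
  "Hypothetical alist of (OWNER . KEYWORDS) for the current buffer.
OWNER is a symbol such as a minor mode name; KEYWORDS is the list of
font-lock keywords that owner has added.")

(defun my-font-lock-add-owned-keywords (owner keywords)
  "Add KEYWORDS to the current buffer and record OWNER as their source."
  (let ((cell (assq owner my-font-lock-keyword-owners)))
    (if cell
        (setcdr cell (append (cdr cell) keywords))
      (push (cons owner (copy-sequence keywords))
            my-font-lock-keyword-owners)))
  (font-lock-add-keywords nil keywords 'append))

(defun my-font-lock-remove-owned-keywords (owner)
  "Remove every keyword recorded for OWNER from the current buffer."
  (let ((cell (assq owner my-font-lock-keyword-owners)))
    (when cell
      (font-lock-remove-keywords nil (cdr cell))
      (setq my-font-lock-keyword-owners
            (delq cell my-font-lock-keyword-owners)))))
```

A minor mode would call the first function when it turns on and the
second when it turns off, passing its own name as OWNER, so the major
mode's original keywords are never touched.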

-steve


>
>
>        Stefan
>


--0016e659f2a6470e6d0470e82681--