From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: "Eric M. Ludlam" <eric@siege-engine.com>
Newsgroups: gmane.emacs.devel
Subject: Re: "Font-lock is limited to text matching" is a myth
Date: Mon, 10 Aug 2009 21:50:28 -0400
Message-ID: <1249955428.29022.186.camel@projectile.siege-engine.com>
References: <7b501d5c0908091634ndfba631vd9db6502db301097@mail.gmail.com>
	<buofxc05p1l.fsf@dhlpc061.dev.necel.com>
	<aa6b5cbe0908092251u670fbd3bg2fc4c14857d32c17@mail.gmail.com>
	<200908101335.24002.danc@merrillprint.com>
	<e01d8a50908101104i5081852bh6ecc7d900d87d19e@mail.gmail.com>
	<87my67s8mr.fsf@randomsample.de>
	<e01d8a50908101351l1af03242o84513de67eaf46b2@mail.gmail.com>
	<1249942011.29022.15.camel@projectile.siege-engine.com>
	<e01d8a50908101519k75883081h1f8332b7807b7f49@mail.gmail.com>
Reply-To: eric@siege-engine.com
NNTP-Posting-Host: lo.gmane.org
Mime-Version: 1.0
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
X-Trace: ger.gmane.org 1249955412 11743 80.91.229.12 (11 Aug 2009 01:50:12 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Tue, 11 Aug 2009 01:50:12 +0000 (UTC)
Cc: Daniel Colascione <danc@merrillpress.com>,
	David Engster <deng@randomsample.de>,
	Daniel Colascione <danc@merrillprint.com>, emacs-devel@gnu.org,
	Steve Yegge <stevey@google.com>, Stefan Monnier <monnier@iro.umontreal.ca>,
	Deniz Dogan <deniz.a.m.dogan@gmail.com>, Leo <sdl.web@gmail.com>,
	Miles Bader <miles@gnu.org>
To: Lennart Borgman <lennart.borgman@gmail.com>
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Tue Aug 11 03:50:04 2009
Return-path: <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Envelope-to: ged-emacs-devel@m.gmane.org
Original-Received: from lists.gnu.org ([199.232.76.165])
	by lo.gmane.org with esmtp (Exim 4.50)
	id 1MagV1-0005L4-Vs
	for ged-emacs-devel@m.gmane.org; Tue, 11 Aug 2009 03:50:04 +0200
Original-Received: from localhost ([127.0.0.1]:60333 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43)
	id 1MagV0-0007Y8-Qr
	for ged-emacs-devel@m.gmane.org; Mon, 10 Aug 2009 21:50:02 -0400
Original-Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
	id 1MagUv-0007Y3-Bl
	for emacs-devel@gnu.org; Mon, 10 Aug 2009 21:49:57 -0400
Original-Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
	id 1MagUt-0007Xr-Uk
	for emacs-devel@gnu.org; Mon, 10 Aug 2009 21:49:56 -0400
Original-Received: from [199.232.76.173] (port=53381 helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1MagUt-0007Xo-Py
	for emacs-devel@gnu.org; Mon, 10 Aug 2009 21:49:55 -0400
Original-Received: from static-71-184-83-10.bstnma.fios.verizon.net
	([71.184.83.10]:36666 helo=projectile.siege-engine.com)
	by monty-python.gnu.org with esmtps
	(TLS-1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.60)
	(envelope-from <eric@siege-engine.com>)
	id 1MagUq-0005p4-Px; Mon, 10 Aug 2009 21:49:53 -0400
Original-Received: from projectile.siege-engine.com (localhost [127.0.0.1])
	by projectile.siege-engine.com (8.14.3/8.14.3/Debian-6) with ESMTP id
	n7B1obB1003940; Mon, 10 Aug 2009 21:50:39 -0400
Original-Received: (from zappo@localhost)
	by projectile.siege-engine.com (8.14.3/8.14.3/Submit) id n7B1oS4V003932;
	Mon, 10 Aug 2009 21:50:28 -0400
X-Authentication-Warning: projectile.siege-engine.com: zappo set sender to
	eric@siege-engine.com using -f
In-Reply-To: <e01d8a50908101519k75883081h1f8332b7807b7f49@mail.gmail.com>
X-Mailer: Evolution 2.26.1 
X-detected-operating-system: by monty-python.gnu.org: GNU/Linux 2.6 (newer, 3)
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/pipermail/emacs-devel>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=subscribe>
Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Xref: news.gmane.org gmane.emacs.devel:114015
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/114015>

On Tue, 2009-08-11 at 00:19 +0200, Lennart Borgman wrote:
> On Tue, Aug 11, 2009 at 12:06 AM, Eric M. Ludlam<eric@siege-engine.com> wrote:
> 
> Hi Eric,
> 
> >  The concept of using the Semantic parser/generator framework for
> > improving font-locking accuracy has come up many times.  No-one to my
> > knowledge has attempted to mix the two.
> 
> 
> Maybe that can easier be done if Semantic parser use
> font-lock/JIT-lock timers and marking to keep track of what need to be
> reparsed? (It is just a wild idea perhaps.)

I'm not certain how font-lock/jit-lock works.  Semantic works by
tracking edits (after-change-functions) and then, on its own timer, it
coalesces the changes into parsable units.  It then reparses those
units.
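
As a rough illustration of that track-then-coalesce idea, here is a
minimal sketch in Python rather than Emacs Lisp.  It is not Semantic's
actual code: the names Tag, ChangeTracker, after_change and on_idle are
hypothetical, and it assumes buffer positions do not shift between
edits, which a real implementation would have to handle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    """Hypothetical stand-in for a parsed unit (function, variable, type)."""
    name: str
    start: int  # inclusive buffer position
    end: int    # exclusive buffer position


class ChangeTracker:
    """Hypothetical tracker: record edits cheaply, decide what to reparse later."""

    def __init__(self, tags):
        self.tags = list(tags)
        self.pending = []  # raw (start, end) edit ranges, in arrival order

    def after_change(self, start, end):
        # Called on every edit; only records the range, does no parsing.
        self.pending.append((start, end))

    def coalesce(self):
        # Merge overlapping or touching edit ranges into sorted spans.
        merged = []
        for start, end in sorted(self.pending):
            if merged and start <= merged[-1][1]:
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        return merged

    def on_idle(self):
        # Called from a timer: return the tags touched by any pending edit.
        spans = self.coalesce()
        self.pending.clear()
        return [t for t in self.tags
                if any(s < t.end and e > t.start for s, e in spans)]


tags = [Tag("foo", 0, 10), Tag("bar", 10, 30), Tag("baz", 30, 50)]
tracker = ChangeTracker(tags)
tracker.after_change(12, 14)
tracker.after_change(13, 18)
tracker.after_change(35, 36)
dirty = tracker.on_idle()  # the units that would be handed to the parser
```

The point of the split is that after_change stays cheap enough to run
on every keystroke, while the more expensive work of mapping edits to
units and reparsing them only happens when the timer fires.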

Font lock can refontify based on fairly small subsections of a buffer,
such as a single code line, or a comment section.  Semantic's
subsections are the size of functions, variables, and datatypes (i.e.,
the tags it creates).

> >  The CONS are that everything in Semantic is set up to parse the entire
> > buffer in one pass, and to parse logical sub-sections only after a full
> > parse has been done.
> 
> 
> So you do a first pass with coarse parsing and then you look in the
> sub-sections for details? Is this strictly necessary? I guess you are
> looking for top level definitions in the first pass?
> 
> Could that pass have its own state and continue upon demand (when an
> item is not recognized) or is such a logic impossible?

It could, but I have not done so.  Tagging information is not generally
needed right away, so just waiting for the user to either ask for it, or
sit idle for a while works pretty well.  The overhead of such an
incremental parser isn't really needed.

> (I guess font-lock/JIT-lock could be improved to help with keeping
> track of what parts of the buffer that have been parsed/maybe
> fontified.)

The needs of the tagging parser and the font-lock parser are
different.  Font lock needs to colorize arbitrary blocks of text, while
a tagging parser needs to parse everything, but only needs the data
periodically.

Converting a tagging parser to a colorizing parser would be challenging
because of these different uses.

> >  I would imagine that the parsing engine in Semantic, if it is deemed
> > critical by the maintainers, will get faster if key components are
> > integrated into the C code.
> 
> Is that part stable?

Yes.  Not much is going on there.

> >  Lastly, as David Engster stated, CEDET has decoration tools that
> > decorate entire tags in some way, such as putting a line on top of
> > functions.  This is a separate decoration activity not related to font
> > lock, and something font lock would not be able to do reliably.
> 
> Why not if it asks the parser?

Font lock runs long before the parser bothers trying to catch up.  Font
lock would need hooks that run after the parser finishes.

While font lock and semantic share a need for a parsing infrastructure,
the where/when of the parsing is quite different.  It is possible to
conceptually mix and match the parsers vs the schedulers.  In practice,
the two tools have their own lengthy histories that will make that
challenging.  Before tackling such a project, it would be wise to take
multi-mode (or a similar tool) into account.

Eric