From: "Eric M. Ludlam"
Newsgroups: gmane.emacs.devel
Subject: Re: "Font-lock is limited to text matching" is a myth
Date: Mon, 10 Aug 2009 21:50:28 -0400
Message-ID: <1249955428.29022.186.camel@projectile.siege-engine.com>
References: <7b501d5c0908091634ndfba631vd9db6502db301097@mail.gmail.com> <200908101335.24002.danc@merrillprint.com> <87my67s8mr.fsf@randomsample.de> <1249942011.29022.15.camel@projectile.siege-engine.com>
Reply-To: eric@siege-engine.com
To: Lennart Borgman
Cc: Daniel Colascione, David Engster, Daniel Colascione, emacs-devel@gnu.org, Steve Yegge, Stefan Monnier, Deniz Dogan, Leo, Miles Bader

On Tue, 2009-08-11 at 00:19 +0200, Lennart Borgman wrote:
> On Tue, Aug 11, 2009 at 12:06 AM, Eric M. Ludlam wrote:
>
> Hi Eric,
>
> > The concept of using the Semantic parser/generator framework for
> > improving font-locking accuracy has come up many times. No-one to my
> > knowledge has attempted to mix the two.
>
>
> Maybe that can easier be done if Semantic parser use
> font-lock/JIT-lock timers and marking to keep track of what need to be
> reparsed? (It is just a wild idea perhaps.)

I'm not certain of how the font/jit lock works. Semantic works by
tracking edits (after-change-functions) and then, on its own timer, it
coalesces the changes into parsable units. It then reparses those
units.
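
To make the "track edits, then coalesce on a timer" idea concrete, here is a minimal sketch in Python. It is not Semantic's actual code: the names `Tag`, `ChangeTracker`, `after_change`, and `coalesce` are hypothetical, and it assumes tags are simple non-overlapping character ranges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    """A parsed unit (function, variable, datatype) covering [start, end)."""
    name: str
    start: int
    end: int


class ChangeTracker:
    """Hypothetical tracker: record edits cheaply, coalesce them later."""

    def __init__(self, tags):
        self.tags = list(tags)
        self.pending = []  # raw (start, end) edit ranges, not yet processed

    def after_change(self, start, end):
        # Called on every edit, so it only records the range.
        self.pending.append((start, end))

    def coalesce(self):
        # Called from a timer: map all pending edits to the tags they touch.
        # Each dirty tag is returned once, however many edits landed in it.
        # Edits that fall outside every tag are ignored in this sketch.
        dirty = [
            tag for tag in self.tags
            if any(s < tag.end and e > tag.start for s, e in self.pending)
        ]
        self.pending.clear()
        return dirty


tracker = ChangeTracker([Tag("foo", 0, 40), Tag("bar", 50, 90), Tag("baz", 100, 140)])
tracker.after_change(5, 6)    # two edits inside "foo"
tracker.after_change(12, 20)
tracker.after_change(60, 61)  # one edit inside "bar"
for tag in tracker.coalesce():
    print("reparse", tag.name)
```

The point of the split is that recording an edit stays cheap, while the more expensive work of deciding what to reparse happens once per idle period rather than once per keystroke.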
Font lock can refontify based on fairly small subsections of a buffer,
such as a single code line, or a comment section. Semantic's
subsections are the size of functions, variables, and datatypes (i.e.,
the tags it creates).

> > The CONS are that everything in Semantic is set up to parse the entire
> > buffer in one pass, and to parse logical sub-sections only after a full
> > parse has been done.
>
>
> So you do a first pass with coarse parsing and then you look in the
> sub-sections for details? Is this strictly necessary? I guess you are
> looking for top level definitions in the first pass?
>
> Could that pass have its own state and continue upon demand (when an
> item is not recognized) or is such a logic impossible?

It could, but I have not done so. Tagging information is not generally
needed right away, so just waiting for the user to either ask for it, or
sit idle for a while works pretty well. The overhead of such an
incremental parser isn't really needed.

> (I guess font-lock/JIT-lock could be improved to help with keeping
> track of what parts of the buffer that have been parsed/maybe
> fontified.)

The needs of the tagging parser and the font-lock parser are
different. Font lock needs to colorize arbitrary blocks of text, and a
tagging parser needs to parse everything, but only needs the data
periodically. Converting a tagging parser to a colorizing parser would
be challenging because of these different uses.

> > I would imagine that the parsing engine in Semantic, if it is deemed
> > critical by the maintainers, will get faster if key components are
> > integrated into the C code.
>
> Is that part stable?

Yes. Not much is going on there.

> > Lastly, as David Engster stated, CEDET has decoration tools that
> > decorate entire tags in some way, such as putting a line on top of
> > functions. This is a separate decoration activity not related to font
> > lock, and something font lock would not be able to do reliably.
>
> Why not if it asks the parser?

Font lock runs long before the parser bothers trying to catch up. Font
lock would need hooks that run after the parser finishes.

While font lock and Semantic share a need for a parsing infrastructure,
the where/when of the parsing is quite different. It is possible to
conceptually mix and match the parsers vs the schedulers. In practice,
the two tools have their own lengthy histories that will make that
challenging.

Before tackling such a project, it would be wise to take multi-mode (or
a similar tool) into account.

Eric