From: Stephen Leake
Newsgroups: gmane.emacs.devel
Subject: Re: syntax-propertize-function vs indentation lexer
Date: Sat, 01 Jun 2013 01:19:24 -0400
Message-ID: <85obbq76hv.fsf@member.fsf.org>
References: <85mwrdbypv.fsf@member.fsf.org> <85bo7sbzhh.fsf@member.fsf.org> <85k3mf8uf0.fsf@member.fsf.org>
To: emacs-devel@gnu.org
In-reply-to: (Stefan Monnier's message of "Fri, 31 May 2013 09:23:37 -0400")

Stefan Monnier writes:

>>> Sounds expensive. How does it cope with large buffers?
>
>> Not clear yet - I'm still getting the Ada grammar right.
>
>> The parser is actually generalized LALR, which spawns parallel parsers
>> for grammar conflicts and ambiguities. So it can be very slow when the
>> grammar has too many conflicts or is ambiguous - running 64 parsers in
>> parallel is a lot slower than running 1 :). But it works well when the
>> conflict can be resolved in a few tokens, and is much easier than
>> reconstructing the grammar to eliminate the conflict.
>
> Aha! So the parser is a separate executable written in some other
> language than Elisp, I guess? In that case parsing speed should not be
> a serious concern (except for the possible explosion of parallel
> parsers).

No, it's in elisp. See
http://stephe-leake.org/emacs/ada-mode/emacs-ada-mode.html for the code.

> Although, using it for indentation makes speed a real concern:
> in many cases one does "edit+reindent", so if you put a "full reparse"
> between the two, it needs to be about as fast as instantaneous.

That's how my current SMIE-based parser works, and it's "fast enough".
I'm working on replacing it with an LALR parser, because the resulting
code is much cleaner.
>> Such a file would never be accepted in any project I have worked on, in
>> any source language, unless it was generated from some other source.
>
> I agree that 1MB is very unusual, but emacs/src/xdisp.c is pretty damn
> close to 1MB. And I've seen several times files of several hundred
> KB.

Ok.

>> I'm also using the cached data for navigation (moving from 'if' to
>> 'then' to 'elsif' to 'end if' etc); that is logically independent of
>> indentation (but not of the parser, of course).
>
> Then navigation should also call syntax-propertize (indeed smie's sexp
> navigation also calls syntax-propertize for the same reason).

Yes, I think this is the best solution.

>> Since the parser is asynchronous from the indentation, it would have to
>> go in the parser (actually lexer) code. wisi-forward-token would be a
>> logical place. But what would be the right guess for 'end'? The first
>> step in wisi-forward-token is forward-comment, which can skip quite large
>> portions of the buffer.
>
> I have the same problem in SMIE navigation, indeed. For backward
> navigation, that's not a problem, but for forward navigation, I don't
> have a good answer. Luckily, SMIE mostly cares about backward
> navigation since that's what needed for indentation, but currently
> forward navigation can bump into parse bugs for failure of calling
> syntax-propertize on the text being considered.

In my case, putting the call to syntax-propertize in wisi-parse-buffer,
not in wisi-forward-token, solves the problem; wisi-parse-buffer always
parses the whole buffer :). This could easily be generalized to take an
'end' arg.

>> How does syntax.el take care of this? The only function on
>> after-change-functions by default is jit-lock-after-change. And that's
>> only there if font-lock is on.
>
> It's added to before-change-functions.

Doh!

Thanks,

-- 
-- Stephe
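
A minimal sketch of the arrangement discussed above, in which
syntax-propertize is called once in the parse entry point rather than in
the token lexer, with the entry point taking an optional 'end' arg. The
names sketch-forward-token and sketch-parse-region are hypothetical and
are not the actual wisi functions; the narrowing is an assumption added
here so that forward-comment cannot carry the lexer past 'end' into text
that has not been propertized.

```elisp
;; Hypothetical illustration only; not the wisi implementation.

(defun sketch-forward-token ()
  "Skip whitespace and comments, then return the next token as a string.
Return nil when the end of the accessible region is reached."
  (forward-comment (point-max))
  (unless (eobp)
    (let ((start (point)))
      ;; A token is a run of word/symbol characters, or else one
      ;; single character of any other syntax class.
      (when (zerop (skip-syntax-forward "w_"))
        (forward-char 1))
      (buffer-substring-no-properties start (point)))))

(defun sketch-parse-region (&optional end)
  "Return the list of tokens between `point-min' and END.
END defaults to `point-max', i.e. the whole buffer."
  (let ((end (or end (point-max)))
        (tokens nil))
    ;; Apply syntax properties once, up front, for the whole range
    ;; the lexer is allowed to look at.
    (syntax-propertize end)
    (save-excursion
      (save-restriction
        ;; Assumption: narrowing keeps `forward-comment' from
        ;; skipping beyond END.
        (narrow-to-region (point-min) end)
        (goto-char (point-min))
        (let (token)
          (while (setq token (sketch-forward-token))
            (push token tokens)))))
    (nreverse tokens)))
```

For example, in a buffer in emacs-lisp-mode containing "foo ; a comment"
on one line and "bar baz" on the next, (sketch-parse-region) should
return the three tokens "foo", "bar" and "baz", and (sketch-parse-region 4)
should return only "foo".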