From: Alan Mackenzie
Newsgroups: gmane.emacs.devel
Subject: Syntax-propertize and CC-mode [Was: Further CC-mode changes]
Date: Sat, 13 Sep 2014 23:08:12 +0000
Message-ID: <20140913230812.GA3660@acm.acm>
References: <536FEA43.5090402@dancol.org> <20140516175226.GB3267@acm.acm> <537653A0.2070109@dancol.org> <20140518213331.GB2577@acm.acm> <20140912235948.GA4045@acm.acm> <20140913151055.GB3431@acm.acm>
To: Stefan Monnier
Cc: emacs-devel@gnu.org

Hello again, Stefan.

I've separated the discussion on syntax-propertize from the rest,
because things are threatening to get out of hand.

On Sat, Sep 13, 2014 at 03:24:03PM -0400, Stefan Monnier wrote:
> >> You know why: `syntax-propertize' is used by all major modes except
> >> cc-mode derived ones. And it's more efficient, because it's more lazy.

> > Lazy? Does C-M-f, for example, trigger syntax-propertize in any way?

> Not yet, no.

OK.

I'm not yet convinced that syntax-propertize works well in languages
like C++ and Java, where template/generic delimiters < and > are marked
with category properties. (The category values are symbols which carry
syntax-table properties.) Because applying these properties is
expensive at runtime, they are tended to with loving care in CC Mode.
When a buffer change is made, a before-change function
c-before-change-check-<>-operators removes the category properties from
exactly those pairs of <>s which need it; in particular, the removal
region is LIMITED TO THE CURRENT "STATEMENT". (A template/generic
construct cannot include braces or semicolons.)
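
For readers unfamiliar with the mechanism, here is a minimal sketch of
how a category property can carry a syntax-table value, and how it can
be removed from a limited region only. The symbols `demo-lt-as-paren'
and `demo-gt-as-paren' and both helper functions are hypothetical names
invented for illustration; they are not CC Mode's identifiers, and CC
Mode's real code does considerably more.

```elisp
;; Hypothetical symbols: each one's `syntax-table' property supplies the
;; syntax for any character whose `category' text property is that symbol.
(put 'demo-lt-as-paren 'syntax-table '(4 . ?>))  ; open paren, matches >
(put 'demo-gt-as-paren 'syntax-table '(5 . ?<))  ; close paren, matches <

(defun demo-mark-angle-pair (open close)
  "Give the < at OPEN and the > at CLOSE paren syntax via `category'."
  (with-silent-modifications
    (put-text-property open (1+ open) 'category 'demo-lt-as-paren)
    (put-text-property close (1+ close) 'category 'demo-gt-as-paren)))

(defun demo-unmark-region (beg end)
  "Remove the `category' property between BEG and END only."
  (with-silent-modifications
    (remove-text-properties beg end '(category nil))))
```

With `parse-sexp-lookup-properties' non-nil, sexp motion can then treat
a marked pair as brackets, and `demo-unmark-region' leaves properties
outside BEG..END untouched.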

The properties get applied, when necessary, as a side effect of the
scanning which happens during font-locking. It is a long-standing bug
that they aren't applied when font-locking is disabled.

> >> Also its correctness argument is much simpler since it doesn't rely on
> >> interactions between before-change-functions, after-change-functions
> >> (and font-lock), contrary to your code.

> > The argument could run that because syntax-propertize doesn't take
> > account of these subtleties, its actions aren't always correct. But ...
> > syntax-propertize isn't documented in the Elisp manual. It's unclear
> > when it gets called (in relation to after-change-functions, for example)
> > and even _if_ it always gets called (e.g. when font-lock isn't enabled).

> That's because it's irrelevant: if *you* need the properties to be
> applied, then call (syntax-propertize POS). If they've already been
> applied, then this will return immediately.

That's useful to know. I didn't know that (and couldn't have known it)
before, since it is undocumented. How is the lower limit determined
(matching the upper limit POS)? At a guess, this is going to be
anywhere down to BOB. This isn't very lazy. Why doesn't
syntax-propertize take TWO parameters, BEG and END?

> Since font-lock's syntactic fontification needs those properties to be
> applied, it does call syntax-propertize.

I didn't know this either. How will this interact with C++/Java Modes'
use of category properties to set the syntax? I don't have good
feelings about this. Looking at the source, syntax-propertize is
unoptimised. It will always erase the syntax-table properties from the
last buffer change onwards, in contrast to C++ Mode, which only erases
them locally as needed. There is surely an opportunity for better
customisation here.

> Similarly, indent-according-to-mode calls it since it's usually a good
> idea. And yes, C-M-f should too, tho no one has been sufficiently
> bothered by it to implement it yet.
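
The contract described in the quoted text (call `syntax-propertize' up
to the position you care about before relying on the properties) might
be sketched as follows for a sexp-motion command.
`demo-forward-sexp-propertized' is a hypothetical name, and the
`fboundp' guard is only an assumption about how code shared with XEmacs
or older Emacsen could be arranged. How far to propertize is also a
guess: this sketch simply goes to the end of the accessible region,
which is not lazy.

```elisp
(defun demo-forward-sexp-propertized (&optional arg)
  "Move over ARG sexps after making sure syntax properties are applied.
Hypothetical helper, not part of Emacs or CC Mode."
  (interactive "p")
  ;; `syntax-propertize' exists only in Emacs 24 and later; elsewhere
  ;; fall through and rely on whatever the mode has already applied.
  (when (fboundp 'syntax-propertize)
    (syntax-propertize (point-max)))
  (let ((parse-sexp-lookup-properties t))
    (forward-sexp (or arg 1))))
```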

> > Adapting code to use it is thus a _massive_ amount of work - it involves
> > reading the source code for syntax-propertize and friends to work out
> > what it does.

> No you don't: the contract is just what I stated above and you don't
> need to know more than that.

That is not true. I need to know that syntax-propertize would nullify
C++/Java Modes' careful optimisations. I need to know whether or not
syntax-propertize properly handles category properties. I suspect it
doesn't. That involves reading the source code, for lack of
documentation.

> > I suspect that for a mode which needs syntax-table text properties
> > early in its after-change-functions, syntax-propertize wouldn't work.

> Suspect all you want. But I assure you it works just fine.

> >> > It's a lot of work to change,

> >> It's work I did, pro bono. Just enjoy it.

> > No, you haven't. What's the equivalent of `syntax-propertize' in XEmacs,
> > or an old GNU Emacs?

> Did you look at my patch? It only uses syntax-propertize
> when available. When not, it keeps using your old code. And the bulk of
> the code is shared between the two cases.

I looked at the patch, yes, but didn't catch every last detail.
There's the philosophical question of why to add this extra code when
it isn't replacing old code, but merely causing bloat.

I now think syntax-propertize contains a severe pessimisation: at a
buffer change, all syntax-table properties are effectively erased
between the change point and EOB. This wasn't done in the existing AWK
Mode code. Perhaps it doesn't matter too much, since AWK buffers tend
not to be very big.

> > Right. So I'm now going to have three alternative code flows in place of
> > the one I currently have: (i) One flow for when font lock's enabled; (ii)
> > one for when FL is disabled, which explicitly calls syntax-propertize
> > from after-change-functions; (iii) one for Emacsen lacking
> > syntax-propertize. I've already got (iii). I think you're asking me to
> > add (i) and (ii). This is an increase in complexity and a lot of work.

> There are many different ways to handle backward compatibility. You can
> do it as above (where my position would be to simply keep the current
> code for the old Emacsen that don't have syntax-propertize, so while
> there are more different ways to work than you'd like the old ways have
> been tested and there's no reason they should stop working, so you can
> largely ignore them).
> Or you can provide your own implementation of syntax-propertize.
> It's actually simple. If you prefer this route, I can do
> that as well.

syntax-propertize needs a way of customizing its before-change function
to avoid removing ST properties needlessly from large chunks of buffer.
How about `syntax-depropertize-function', which if non-nil would be run
in place of a bit of syntax-ppss-flush-cache? Or something like that?

[ .... ]

> syntax-propertize uses a before-change-function to "flush the cache", of
> course. After that you just call it when you need it. E.g. If
> font-lock itself doesn't call it, then we'll need to arrange to run it
> before font-lock does its thing.

[ .... ]

> Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).