From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.org!.POSTED.blaine.gmane.org!not-for-mail
From: Stefan Monnier
Newsgroups: gmane.emacs.devel,gmane.emacs.cc-mode.general
Subject: Re: A possible way for CC Mode to resolve its sluggishness
Date: Sun, 28 Apr 2019 21:46:25 -0400
Message-ID:
References: <20190426193056.GC4720@ACM> <20190427135725.GB4822@ACM>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: blaine.gmane.org; posting-host="blaine.gmane.org:195.159.176.226"; logging-data="85788"; mail-complaints-to="usenet@blaine.gmane.org"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux)
Cc: cc-mode-help@lists.sourceforge.net, emacs-devel@gnu.org
To: Alan Mackenzie
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Mon Apr 29 03:46:41 2019
Return-path:
Envelope-to: ged-emacs-devel@m.gmane.org
Original-Received: from lists.gnu.org ([209.51.188.17]) by blaine.gmane.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:256) (Exim 4.89) (envelope-from ) id 1hKvNQ-000M9l-8X for ged-emacs-devel@m.gmane.org; Mon, 29 Apr 2019 03:46:40 +0200
Original-Received: from localhost ([127.0.0.1]:50593 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hKvNP-0000Be-3c for ged-emacs-devel@m.gmane.org; Sun, 28 Apr 2019 21:46:39 -0400
Original-Received: from eggs.gnu.org ([209.51.188.92]:42446) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hKvNG-0008T5-Rz for emacs-devel@gnu.org; Sun, 28 Apr 2019 21:46:32 -0400
Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hKvNF-0007c9-BE for emacs-devel@gnu.org; Sun, 28 Apr 2019 21:46:30 -0400
Original-Received: from pruche.dit.umontreal.ca ([132.204.246.22]:52273) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hKvNF-0007bZ-6h for emacs-devel@gnu.org; Sun, 28 Apr 2019 21:46:29 -0400
Original-Received: from ceviche.home (lechon.iro.umontreal.ca [132.204.27.242]) by pruche.dit.umontreal.ca (8.14.7/8.14.1)
        with ESMTP id x3T1kPkc007430; Sun, 28 Apr 2019 21:46:26 -0400
Original-Received: by ceviche.home (Postfix, from userid 20848) id 4BF526619A; Sun, 28 Apr 2019 21:46:25 -0400 (EDT)
In-Reply-To: <20190427135725.GB4822@ACM> (Alan Mackenzie's message of "Sat, 27 Apr 2019 13:57:25 +0000")
X-NAI-Spam-Flag: NO
X-NAI-Spam-Threshold: 5
X-NAI-Spam-Score: 0
X-NAI-Spam-Rules: 2 Rules triggered EDT_SA_DN_PASS=0, RV6534=0
X-NAI-Spam-Version: 2.3.0.9418 : core <6534> : inlines <7062> : streams <1820024> : uri <2838289>
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 132.204.246.22
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id: "Emacs development discussions."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Original-Sender: "Emacs-devel"
Xref: news.gmane.org gmane.emacs.devel:236020 gmane.emacs.cc-mode.general:7615
Archived-At:

>> Part of the problem is that CC-mode is very eager in its management of
>> syntax information: the `syntax-table` text-properties are always kept
>> up-to-date over the whole buffer right after every single change.
> That is not part of the problem.  That is part of the challenge.

Keeping everything always up-to-date, whether we use it or not, is not
in itself of any benefit to the end-user.  So it's a self-imposed
challenge.  I think there are enough challenges without having to add
this one to the lot.

>> Modes using syntax-propertize work more lazily:
>> before-change-functions only marks that some change occurred at
>> position POS and the syntax-table properties after that position are
>> only updated afterward on-demand.
>
> Yes, but it is somewhat unclear whether, how, and when modes using
> syntax-propertize can update syntax-table properties on positions
> _before_ a change.

When syntax-propertize-function is called on BEG..END it's not expected
to touch anything outside of the BEG..END region (I think it's perfectly
safe to touch things after END, but it's definitely risky to touch
things before BEG).  When a change at buffer position POS requires
updating syntax-table (or other) text-properties at some earlier
position, this can be indicated to syntax-propertize via
syntax-propertize-extend-region-functions.

>> Maybe another part of the problem is that CC-mode tries to do more than
>> most other major modes: e.g. the highlighting of unclosed strings.
>> For plain single-line strings this can be fairly cheap, but for
>> multiline strings, keeping this information constantly up-to-date over
>> the whole buffer can be costly.
>
> CC Mode is successful in this regard.  The highlighting with
> warning-face of unclosed string openers is a useful feature which other
> modes could emulate.

I don't think "successful" is an appropriate description (e.g. I don't
know what a failure would be).  What I do know is that this highlighting
imposes significant additional work (e.g. because it more often requires
updating text-properties before the place where text was modified).
It is also an incomplete feature: it only does that for strings.

Other major modes provide a more complete implementation of that feature
via flymake/lsp without imposing that extra work on
before/after-change-functions.

It also means that those modes don't try to give a definite answer to
questions of what to do in the face of invalid code (e.g. non-terminated
strings or comments, mismatched parens, ...) and limit themselves to
trying to avoid doing something "obviously" wrong in those cases.

I know you think this is a bad design decision, but it's a design
decision that is largely unavoidable (e.g.
C-mode has to do the same when faced with some uses of CPP which render
the "un-preprocessed" code unparsable) and Emacs usually gets a lot of
benefits from it in terms of simplicity and performance.

> But that's a digression from the topic of this thread.

Indeed.

>> Not sure whether you intend this to be just a change to CC-mode (it does
>> sound like it can all be implemented in Elisp) or you intend for some
>> change at the C level.
> At the Lisp level.  I hadn't even considered any C enhancements.

Good.  Then I think it's worth a try.

>> But to tell you the truth, other than CC-mode, I'm having a hard time
>> imagining which other major mode will want to use such a thing.
>> Performance of syntax-propertize is not stellar but doesn't seem
>> problematic, and it is not too hard to use (its functioning is not
>> exactly the same as what a real lexer would do, but you can make use of
>> the language spec more or less straightforwardly), ....
> Again, can syntax-propertize work on positions _before_ a buffer change?

Yes.  There has not been much need for it, tho, so support for it is
fairly primitive.  E.g. most modes that have such needs don't use
syntax-propertize-extend-region-functions but rely on
jit-lock-multiline, which makes this kind of "update an earlier part of
the buffer" happen at a later time, and hence causes the text-properties
to be left invalid for a while.  Delaying updates this way is OK for
font-lock highlighting, but is wrong for syntax-propertize, so we should
change those modes to use syntax-propertize-extend-region-functions
where relevant.

>> Maybe if we want to speed things up, we should consider a new parsing
>> engine (instead of parse-partial-sexp and syntax-tables) based maybe on
>> a DFA for the tokenizer and GLR parser on top.  That might arguably be
>> more generally useful and easier to use (in the sense that one can more
>> or less follow the language spec when implementing the major mode).
> That would be a lot of design and a lot of work, and sounds like
> something from the distant rather than medium future.

Agreed.

> The indentation and font-lock routines would have to be rewritten for
> each mode using it.

It would probably be a good idea to make use of parsing info for those,
but not indispensable.


        Stefan
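
As a rough illustration of the syntax-propertize-extend-region-functions
mechanism mentioned above, here is a minimal sketch of a function that
moves the start of the region to be propertized back to an earlier
position.  The scenario (backslash-continued lines) and the names
`demo-extend-region-to-continued-line' and `demo-setup' are made up for
this example; they are not taken from CC Mode or any existing mode.  The
sketch relies only on the hook's documented contract: each function is
called with START and END and returns nil, or a cons (NEW-START . NEW-END).

```elisp
;; Hypothetical example: names and scenario are assumptions, not real
;; CC Mode code.

(defun demo-extend-region-to-continued-line (start end)
  "Move START back over backslash-continued lines.
Return (NEW-START . END) if START was moved backwards, else nil,
as expected of `syntax-propertize-extend-region-functions'."
  (save-excursion
    (goto-char start)
    ;; Go to the beginning of the line containing START.
    (forward-line 0)
    ;; While the previous line ends in a backslash, the current line
    ;; is a continuation of it, so keep moving up.
    (while (and (> (point) (point-min))
                (eq (char-before (1- (point))) ?\\))
      (forward-line -1))
    (when (< (point) start)
      (cons (point) end))))

(defun demo-setup ()
  "Install the region extender buffer-locally (e.g. from a mode hook)."
  (add-hook 'syntax-propertize-extend-region-functions
            #'demo-extend-region-to-continued-line nil t))
```

With something like this installed, a change in the middle of a
continued line causes syntax-propertize-function to be called with BEG
at the start of the whole logical line, so it never has to touch text
before BEG itself.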