From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!.POSTED!not-for-mail From: Alan Mackenzie Newsgroups: gmane.emacs.devel Subject: Re: Bug #25608 and the comment-cache branch Date: Tue, 14 Feb 2017 21:14:23 +0000 Message-ID: <20170214211423.GA3090@acm> References: <20170202215154.GB2505@acm> <83h94bvhzw.fsf@gnu.org> <20170203172952.GC2250@acm> <0a40d539-b7bc-2655-5429-6280022106ee@yandex.ru> <20170204102410.GA2047@acm> <8f9e68fc-4314-625d-b4bf-796c71c91798@yandex.ru> <20170206192423.GB3568@acm> <4f0fabf3-be9c-7492-379b-59dc93e72b4f@yandex.ru> <20170207192119.GA2490@acm> <424e6409-029c-d15d-421c-4fb90594329c@yandex.ru> NNTP-Posting-Host: blaine.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Trace: blaine.gmane.org 1487106961 24373 195.159.176.226 (14 Feb 2017 21:16:01 GMT) X-Complaints-To: usenet@blaine.gmane.org NNTP-Posting-Date: Tue, 14 Feb 2017 21:16:01 +0000 (UTC) User-Agent: Mutt/1.7.2 (2016-11-26) Cc: Eli Zaretskii , emacs-devel@gnu.org To: Dmitry Gutov Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Tue Feb 14 22:15:57 2017 Return-path: Envelope-to: ged-emacs-devel@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by blaine.gmane.org with esmtp (Exim 4.84_2) (envelope-from ) id 1cdkRz-0005pN-MT for ged-emacs-devel@m.gmane.org; Tue, 14 Feb 2017 22:15:51 +0100 Original-Received: from localhost ([::1]:37195 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cdkS5-0007Mj-Ab for ged-emacs-devel@m.gmane.org; Tue, 14 Feb 2017 16:15:57 -0500 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:43822) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cdkR1-0007Ju-FA for emacs-devel@gnu.org; Tue, 14 Feb 2017 16:14:53 -0500 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cdkQx-0003yF-Dl for emacs-devel@gnu.org; Tue, 14 Feb 2017 16:14:51 -0500 Original-Received: from ocolin.muc.de ([193.149.48.4]:27450 helo=mail.muc.de) by eggs.gnu.org with smtp (Exim 4.71) (envelope-from ) id 1cdkQx-0003xY-4U for emacs-devel@gnu.org; Tue, 14 Feb 2017 16:14:47 -0500 Original-Received: (qmail 64388 invoked by uid 3782); 14 Feb 2017 21:14:44 -0000 Original-Received: from acm.muc.de (p548C6D80.dip0.t-ipconnect.de [84.140.109.128]) by colin.muc.de (tmda-ofmipd) with ESMTP; Tue, 14 Feb 2017 22:14:43 +0100 Original-Received: (qmail 4281 invoked by uid 1000); 14 Feb 2017 21:14:23 -0000 Content-Disposition: inline In-Reply-To: <424e6409-029c-d15d-421c-4fb90594329c@yandex.ru> X-Delivery-Agent: TMDA/1.1.12 (Macallan) X-Primary-Address: acm@muc.de X-detected-operating-system: by eggs.gnu.org: FreeBSD 9.x [fuzzy] X-Received-From: 193.149.48.4 X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Original-Sender: "Emacs-devel" Xref: news.gmane.org gmane.emacs.devel:212387 Archived-At: Hello, Dmitry. On Tue, Feb 14, 2017 at 17:28:58 +0200, Dmitry Gutov wrote: > Hi Alan, > On 07.02.2017 21:21, Alan Mackenzie wrote: > >> How come the "alternative patch" works well, then? > > Well, aside from the fact that it doesn't (IMAO), it is only consulted > > relatively rarely, in certain cases of back_coment where the backward > > scanning hits something it doesn't want to handle. > What is "it"? Respectively, the "alternative patch", the syntax-ppss cache mechanism, and the backward scanning, in the three uses in that sentence. Sorry it wasn't clear. > I would imagine that to be sure that point is not e.g. inside a > string, the patch would have to consult the cache (or call syntax-ppss) > at least once per forward-comment call. Indeed. > From there, I don't really see a real need for backward comment > scanning. So if you rewrote some code to use forward scanning, that > approach should be applicable on top of the AP as well. The back_comment function needs to use backward scanning unless it has a robust enough and fast enough cache giving it the literality at any point. [ .... ] > More importantly, I want to keep as much logic in Lisp as feasible, > which is the currently recommended approach anyway. I sometimes think you might be trying to keep more in Lisp than is feasible. > Problems like this could be solved in different ways without avoiding > that goal. We can provide new faster primitives if manipulating some > data structure in Lisp is not enough (but we need benchmarks first, and > so far, speed is not a problem). We can also add new hooks if > before-change-functions is not up to snuff. In other words, implementing new logic in C. One thing which cannot be done in lisp (without new facilities in C) is invalidating the cache when syntax-table text properties are applied and removed (which is always done when the change hooks are inhibited). You can do it directly in C, you can write new facilities in C to allow it to be done in lisp, or you can pretend it doesn't need doing. comment-cache takes the first approach, syntax-ppss takes the last at the moment. > >> Tracking the used syntax table is also a problem which we need to solve > >> for syntax-ppss. A good design could handle it and narrowing together. > > You should now be able to see why I dislike syntax-ppss so much. As > > well as being incompatible with narrowing (which should be sort of > > fixable), there is an essential lack of cache invalidating (which would > > only be fixable by a radically different design). > Why wouldn't it be fixable with a moderate change in design? The problem > you are referring to (which is almost entirely theoretical at this > point, in the absence of bug reports) .... Here I disagree with you and Stephan profoundly - A flaw is a flaw whether or not it has yet provoked a bug report. And just because something hasn't yet had a bug report on it doesn't mean it's OK. If we see a way non-rigorous coding _can_ lead to a bug, then that is a bug and needs fixing, particularly if it's in a primitive. > .... is caused by syntax-ppss usage inside with-silent-modifications. Yes. > > It's differently complicated. master's back_comment, which attempts to > > scan comments backwards is more complicated than comment-cache's > > back_comment (including its cacheing logic). > Ideally we'd have the best of both worlds, of course. Like mentioned > above, I see no hard need for backward scanning anymore. But for reasonable execution speeds you either need backward scanning of comments, or comment-cache (or something like it). > >> Yes, it is worse. You have more code to debug. And comment-cache adds > >> quite a bit of code. > > How have you quantified "quite a bit"? > 771 insertions(+), 402 deletions? Admittedly, this is not a lot by C > standards. I don't think it is, either. A good deal of that is the wholesale replacement of back_comment with the simpler new version. > > There is nothing to indicate you've even looked at comment-cache. All > I've looked at it now. Since it's implemented in C, I have little > ability to judge the quality of the code, or the low-level nuances. > And yet, I've managed to provide coherent comments, haven't I? You have, yes. Thanks. [ .... ] > >> So the "speed up forward-comment" patch would still come out to 20 lines. > > Well, if you get a decent bug fix involving, say a 700 line patch which > > includes those 20 lines, I suppose you could still call it a 20 line > > patch, somehow. > Even if that takes 700 lines, in the end it will be 700 + 20 lines > versus 700 + 370 lines that comment-cache takes. I think it is the result that counts, not the number of changed lines. > >>> It would also likely be much slower. > >> I wouldn't be so sure. A syntax table comparison, for instance, would be > >> pretty cheap compared to what syntax-ppss does already. > > Full syntax-table comparisons are slow, even when written in C. > Really? How do you quantify that? In c++-mode, > (benchmark 1000 '(equal (syntax-table) (syntax-table))) > outputs "Elapsed time: 0.004712s". Which is an order of magnitude less > than (benchmark 1000 '(syntax-ppss)) outputs, in an empty buffer with a > warmed-up cache. > > I tried > > it back in December. CC Mode regularly switches syntax-tables. My > > usual time-scroll function on xdisp.c ran at about half the speed when a > > comparison was done at every set-syntax-table. The results had to be > > cached, after which it ran at normal speed again. > That doesn't tell me a lot, unfortunately. Maybe it was a design > problem, e.g. invalidating cache eagerly and too often, instead of doing > it lazily like syntax-ppss does. It was a case of seeing if two distinct syntax tables were "the same" from the point of view of literals. In other words, they could parse parentheses, whitespace and so on however they liked, but comments and strings had to be parsed identically by both tables for them to count "the same". This is an instance where syntax-ppss's ambitions count against it - on any set-syntax-table syntax-ppss's caches should really be cleared, strictly speaking. But they're less effective as caches if this is done. Perhaps the major mode should give its input. > Although CC Mode would have to change syntax tables a lot, for it to > even show up on the radar. It does. For example, there's a `c-make-no-parens-syntax-table' in which parens/braces/brackets are not paren characters, used for parsing template/generic delimiters in C++ and Java. > It's possible that your "compare syntax tables" routine does a lot, of > course. But if we really need that kind of fuzzy comparison, we can > implement that function in C and export for using in Lisp. I think that's what I did. -- Alan Mackenzie (Nuremberg, Germany).