From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Alan Mackenzie <acm@muc.de>
Newsgroups: gmane.emacs.devel
Subject: Re: [Emacs-diffs] comment-cache 223d16f 2/3: Apply `comment-depth'
	text properties when calling `back_comment'.
Date: Mon, 14 Mar 2016 17:29:40 +0000
Message-ID: <20160314172940.GG1894@acm.fritz.box>
References: <20160309174816.GE3948@acm.fritz.box> <56E0805F.3050804@gmx.at>
	<20160312170839.GE2572@acm.fritz.box>
	<e0287d91-d5a4-eb2e-2bdb-1373fd6a865d@yandex.ru>
	<20160312215839.GC10781@acm.fritz.box>
	<c1692a21-eb08-cba1-7160-151df1318b57@yandex.ru>
	<20160313175922.GE1871@acm.fritz.box>
	<0ce1b5a5-6892-47ad-03d4-d4c2ba2bea54@yandex.ru>
	<20160314122330.GC1894@acm.fritz.box>
	<dc6bb324-9d45-a3b1-d843-23d25dda8596@yandex.ru>
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Trace: ger.gmane.org 1457976458 27804 80.91.229.3 (14 Mar 2016 17:27:38 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Mon, 14 Mar 2016 17:27:38 +0000 (UTC)
Cc: emacs-devel@gnu.org
To: Dmitry Gutov <dgutov@yandex.ru>
Content-Disposition: inline
In-Reply-To: <dc6bb324-9d45-a3b1-d843-23d25dda8596@yandex.ru>
User-Agent: Mutt/1.5.24 (2015-08-30)
Xref: news.gmane.org gmane.emacs.devel:201720
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/201720>

Hello, Dmitry.

On Mon, Mar 14, 2016 at 06:15:13PM +0200, Dmitry Gutov wrote:
> On 03/14/2016 02:23 PM, Alan Mackenzie wrote:

> >> Any performance numbers on using this approach in C/l mode?

> > [ By "this approach" I'm taking it you mean scanning forward from safe
> > positions.]

> No, using syntax-ppss's cache. Including the syntax-ppss-last variable.

CC Mode doesn't use syntax-ppss.  It would be too much work to put it in,
particularly as that function's future is unclear.

> IIRC, Stefan posted a patch, which was like 10 lines long.

Sorry, patch for what?  I've lost the context.

> Could you please try it already, so we can move on to discussing the 
> actual performance problems of syntax-ppss, instead of theoretical ones?

It would be a lot of work.  Perhaps you might like to undertake it.

> > It's too slow to use as a replacement for back_comment, in the sense I
> > would like to do - in place of scanning a comment backward character by
> > character, I want to use a cache (calculated by forward scanning) to
> > determine the beginning of a comment.  Having to scan forward lots of
> > characters (as opposed to a few) is out of the question for this
> > application.

> The approach X is out of the question, because my approach Y is usually 
> N times faster!

> Why is it out of the question? Can I say "premature optimization"?

You could, but it would sound a bit silly.  back_comment currently works
by scanning backward a character at a time.  Each character will take
about the same time to scan (give or take a factor of, perhaps, 2) as a
character being scanned forward in parse-partial-sexp.

Compare scanning backwards over a 100 character comment using the current
back_comment with scanning forwards up to 20,000 characters to get a
parse state on the comment's end position.  CC Mode does a fair bit of
scanning backwards over comments, one after another.  Even
syntax-ppss-last won't help much here.

But if you want to do the experiment, be my guest.
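
The comparison above can be put into rough numbers.  What follows is only
a toy cost model in Python, not Emacs code: the function names and the
unit "cost per character" are assumptions made for illustration, and the
only figures taken from this message are the 100 character comment, the
20,000 character forward scan, and the factor of 2 uncertainty.

```python
def backward_cost(comment_len: int, per_char: float = 1.0) -> float:
    """Cost of stepping backward over a comment, one character at a time."""
    return comment_len * per_char


def forward_cost(distance: int, per_char: float = 1.0) -> float:
    """Cost of scanning forward over `distance` characters to get a parse state."""
    return distance * per_char


COMMENT_LEN = 100       # comment length used in the message
MAX_FORWARD = 20_000    # worst-case forward scan used in the message

# Same per-character cost in both directions.
ratio_equal = forward_cost(MAX_FORWARD) / backward_cost(COMMENT_LEN)

# Backward scanning assumed twice as expensive per character.
ratio_pessimistic = forward_cost(MAX_FORWARD) / backward_cost(COMMENT_LEN, per_char=2.0)

print(ratio_equal, ratio_pessimistic)
```

Under these assumptions the worst-case forward scan costs 100 to 200
times as much as the backward scan, even with the factor of 2 going
against the backward scan.  A real measurement would depend on where the
nearest cached position actually lies.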

> >> So I have to wonder why the "get out of a comment" feature is used in
> >> C/l mode so much that it becomes a bottleneck, and you get significant
> >> improvement in performance by dropping the caching logic to C. That is,
> >> of course, not a nice thing to ask considering the overall complexity of
> >> CC Mode, but still.

> >> I don't see anything comparable to 10 second waiting described in
> >> http://debbugs.gnu.org/cgi/bugreport.cgi?bug=22884, when doing a
> >> comparable operation in a 5000-line Ruby file.

> > Here's a quick summary of how that happened:  on L1661 of config.h, C
> > Mode had to scan back to the beginning of a statement.  That involved
> > going back over ~1600 lines of comments and preprocessor constructs.

> I went back to the revision a589e9a and rebuilt Emacs. I see the problem 
> described in 22884. However, the line 1661 already is a beginning of a 
> statement. It contains:

> #define _DARWIN_USE_64_BIT_INODE 1

> Are we talking about the same file?

Possibly.  config.h is different between different runs of ./configure.
In the file Paul was complaining about, there was a comment opener at the
start of the line.  I think it was actually the line you've cited, but
commented out.

> > When it got to the critical comment and tried to go back over it,
> > (forward-comment -1) said nil, because of that paren in column 0.  That
> > paren in column 0, although in a comment, was deemed to be the start of
> > a defun.  C Mode was then trying to parse "code" over a region with a
> > 1600 line gap in the middle.  Hence the 10 second delay in seeing the
> > character echoed.  Paul has purged our code of parens in column 0.  But
> > it would be nice not to have the restriction.
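
The column 0 problem described in the quoted paragraph can be shown with
a toy model.  This is a Python sketch, not the code in Emacs or CC Mode:
`naive_defun_start`, `in_block_comment` and `SAMPLE` are hypothetical
names, and the comment scanner only knows about /* ... */ comments.  It
shows how a heuristic which trusts any "(" in column 0 can land inside a
comment.

```python
def naive_defun_start(text: str, pos: int) -> int:
    """Return the offset of the nearest '(' in column 0 at or before the
    line containing pos, or 0 if there is none.  Comments are ignored,
    which is exactly the weakness being illustrated."""
    line_start = text.rfind("\n", 0, pos) + 1
    while True:
        if text.startswith("(", line_start):
            return line_start
        if line_start == 0:
            return 0
        # Move to the start of the previous line.
        line_start = text.rfind("\n", 0, line_start - 1) + 1


def in_block_comment(text: str, pos: int) -> bool:
    """True if pos lies inside a /* ... */ comment.  Found by scanning
    forward from the start; strings and // comments are not handled."""
    i, inside = 0, False
    while i < pos:
        if not inside and text.startswith("/*", i):
            inside, i = True, i + 2
        elif inside and text.startswith("*/", i):
            inside, i = False, i + 2
        else:
            i += 1
    return inside


SAMPLE = (
    "int a;\n"
    "/* A comment whose second line starts with a paren:\n"
    "(see the manual) */\n"
    "int b;\n"
)

pos = SAMPLE.index("int b")
start = naive_defun_start(SAMPLE, pos)
print(SAMPLE[start], in_block_comment(SAMPLE, start))
```

Here the heuristic reports the paren inside the comment as the "defun
start", whereas the forward scan knows that position is in a comment.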

> (parse-partial-sexp 1 (point)) on line 1661 takes ~0.002s here.

> (parse-partial-sexp 1 (point-max)) takes just a little above that.

> How do these timings translate into whole seconds of waiting after 
> pressing '/'?

They don't.  It was CC Mode's indentation engine's scanning, not the raw
parse-partial-sexp scanning.

> > My point was that it is so simple that it _could_ be written in C, and
> > that without any great difficulties.

> "It was so simple that I made it more complex"? I mentioned C as a 
> drawback, and it is.

Lots of Emacs is written in C.  In this particular case, it is a good
idea to avoid the possible complications and pitfalls of calling Lisp
from C when it is not necessary, particularly as there is no intricate
code here which would make Lisp the preferred language.

> > Parts of it can only be written in
> > C (the bits that ensure the cache is marked stale when certain
> > syntax-table text properties are set/cleared when
> > `inhibit-modification-hooks' is bound to non-nil).

> These would have to be carefully considered, but if they make sense, 
> they would have to be ported to syntax-ppss too somehow.

That wouldn't be a bad idea, once the future of that function becomes
clear.

-- 
Alan Mackenzie (Nuremberg, Germany).