From: Alan Mackenzie
Newsgroups: gmane.emacs.devel
Subject: Re: Idea for syntax-ppss. Is it new? Could it be any good?
Date: Sun, 27 Jul 2008 19:20:45 +0000
Message-ID: <20080727192045.GB1598@muc.de>
References: <20080726214429.GB3623@muc.de> <20080727145058.GA1598@muc.de>
To: Stefan Monnier
Cc: emacs-devel@gnu.org

Hi, Stefan,

On Sun, Jul 27, 2008 at 11:51:36AM -0400, Stefan Monnier wrote:
> >> Isn't that what syntax-ppss does?

> > It caches the state for several positions, but I don't think they're at
> > regular positions.

> C-h v syntax-ppss-max-span
> It's not exactly perfectly regular, but I don't think the difference
> matters.

I was looking at my 3 MB buffer, and it seemed they were at wildly
irregular positions.  But I think I was seeing something which wasn't
really there.

> > parse-partial-sexp is blindingly fast.  To scan an entire 3 MB C
> > buffer on my elderly 1.2 GHz Athlon takes 0.27s.  That is why I
> > suspect that the lisping in syntax-ppss might need severe
> > optimisation.  But again, it's only a hunch.

> When I wrote syntax-ppss, my main goal was to never be significantly
> slower than parse-partial-sexp.  Even if it's not as fast as it could
> be if written in C (which is pretty much obviously true), that's not a
> reason to recode it in C.

Surely the goal should be to be significantly faster most of the time.
Presumably it achieves this in practice.
The reason to recode in C would be to make it fast enough, or to couple
it up to things which couldn't be done in Lisp.  But probably neither of
these things is needed.

> > What I think really needs doing is to make this function
> > bulletproof: It should work on narrowed buffers,

> That can be done, tho it needs extra info in order to know how to
> interpret the fact that it's narrowed.

I don't understand.  Isn't the function defined as the equivalent of
(parse-partial-sexp (point-min) pos)?  You've said before that the
function is best not called when a buffer is narrowed.  Couldn't we just
redefine it as (parse-partial-sexp 1 pos)?  Then we could just put
(save-restriction (widen ..... )) into syntax-ppss.

[ .... ]

> I think this will result in too many cache flushes and will make the
> code too intrusive or too ad-hoc.  I'd rather have a
> syntax-ppss-syntax-table (and force parse-sexp-lookup-properties to t)
> if you want more reliable results.

Hey, syntax-ppss-syntax-table is a brilliant idea!  In its doc string
one could say "after setting this, clear the cache by calling ...
(syntax-ppss-flush-cache 1)".

> > Also, Lennart is asking for it to work nicely with multiple major modes.
> > Surely this would be a Good Thing.  Files containing several major modes
> > are commonplace (awk or sed embedded within a shell script, html
> > embedded within php, ....).

> Yes, that's a desirable extension.

> > At the moment, CC Mode applies a heuristic maximum size of strings and
> > comments, for performance reasons.  Checking for strings and comments is
> > done so frequently that the mode uses elaborate internal caches.  It
> > would be nice if this caching could move to the Emacs core.

> You can do it today.  Have you even tried to use syntax-ppss before
> asking for it to be improved?

No.  I think I've been scared by its vagueness (about narrowed regions)
more than anything.  It's defined in the elisp manual as equivalent to
(pps (point-min) pos) rather than (pps 1 pos).
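A minimal sketch of the "(pps 1 pos)" reading discussed above, assuming
one simply widens around the parse.  The name my-parse-state-widened is
hypothetical and not part of Emacs; the sketch does no caching, so it is
not a substitute for syntax-ppss itself.

```elisp
;; Hypothetical helper for illustration only; not part of Emacs.
(defun my-parse-state-widened (pos)
  "Return the `parse-partial-sexp' state at POS, ignoring any narrowing.
The parse always starts at buffer position 1, not at `point-min'."
  (save-excursion              ; parse-partial-sexp moves point to POS
    (save-restriction          ; restore the caller's narrowing on exit
      (widen)
      (parse-partial-sexp 1 pos))))
```

With the buffer narrowed to a region that starts inside a string, a
parse from (point-min) no longer sees the opening quote, whereas the
widened parse from position 1 does.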
The function also uses syntax-begin-function, which doesn't seem right,
and wouldn't work well in CC Mode; the only way s-b-f can give a
cast-iron result is by calling parse-partial-sexp, or syntax-ppss.  In
fact, if syntax-ppss were bulletproof, syntax-begin-function would be
redundant.

I don't think syntax-ppss is quite the right function for what I want to
do.  I need something like it, but not identical.  Maybe I should test
syntax-ppss by coding it up inside a macro which widens.  And I've been
less than convinced it's actually faster.  In fact, I'll go and do some
speed tests and report back.

> > Again, this isn't something which can be implemented in a weekend,
> > but I think it would be worthwhile for Emacs 24.

> Other than the multi-major-mode part, it all sounds like very
> minor changes.

Maybe.

> Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).
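
One way the speed test mentioned above might be set up, as a rough
sketch rather than a rigorous benchmark.  The function name
my-compare-ppss-speed and the repetition count of 10 are assumptions
made for illustration; benchmark-run, syntax-ppss and
syntax-ppss-flush-cache are the existing Emacs functions.

```elisp
(require 'benchmark)

;; Hypothetical helper for illustration only; not part of Emacs.
(defun my-compare-ppss-speed ()
  "Time whole-buffer scans in the current buffer, widened.
Return a list (PPS COLD WARM) of elapsed seconds for 10 repetitions:
PPS is `parse-partial-sexp' from the start of the buffer, COLD is
`syntax-ppss' with its cache flushed before every call, and WARM is
`syntax-ppss' with whatever cache the previous calls left behind."
  (save-excursion              ; both parsers may move point
    (save-restriction
      (widen)
      (list
       (car (benchmark-run 10
              (parse-partial-sexp (point-min) (point-max))))
       (car (benchmark-run 10
              (syntax-ppss-flush-cache (point-min))
              (syntax-ppss (point-max))))
       (car (benchmark-run 10
              (syntax-ppss (point-max))))))))
```

Evaluating (my-compare-ppss-speed) in a large C buffer gives three
figures to compare; the cold and warm numbers are kept separate because
the cache is the whole point of syntax-ppss.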