From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: Idea for syntax-ppss. Is it new? Could it be any good?
Date: Sun, 27 Jul 2008 11:51:36 -0400
References: <20080726214429.GB3623@muc.de> <20080727145058.GA1598@muc.de>
In-Reply-To: <20080727145058.GA1598@muc.de> (Alan Mackenzie's message of "Sun, 27 Jul 2008 14:50:58 +0000")
To: Alan Mackenzie
Cc: emacs-devel@gnu.org

>> Isn't that what syntax-ppss does?

> It caches the state for several positions, but I don't think they're at
> regular positions.

C-h v syntax-ppss-max-span

It's not exactly perfectly regular, but I don't think the difference
matters.

> I don't understand the detailed workings of the routine at the moment.
> I suspect that the slowness of all the lisp manipulation will outweigh
> the benefit of the caching, but I would confirm or refute that with
> the profiler before doing anything serious.

> partial-parse-sexp is blindingly fast. To scan an entire 3Mb C buffer
> on my elderly 1.2 GHz Athlon takes 0.27s. That is why I suspect that
> the lisping in syntax-ppss might need severe optimisation. But again,
> it's only a hunch.

When I wrote syntax-ppss, my main goal was to never be significantly
slower than parse-partial-sexp. Even if it's not as fast as it could be
if written in C (which is pretty much obviously true), that's not
a reason to recode it in C.
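
The speed comparison discussed above can be checked directly rather than guessed at. What follows is only a rough sketch, not anything from the original thread: the function name `my-compare-ppss-timings` is hypothetical, and it assumes the current buffer is widened and already has the relevant major mode's syntax table installed. It times one full `parse-partial-sexp` scan against one `syntax-ppss` call made with a cold cache.

```elisp
(require 'benchmark)

;; Hypothetical helper (not part of Emacs or of the original message).
(defun my-compare-ppss-timings ()
  "Time a full scan of the accessible buffer with both functions.
Return a plist of elapsed seconds for each.  The `syntax-ppss'
cache is flushed first, so the figure is for a cold cache."
  (syntax-ppss-flush-cache (point-min))
  (let ((raw (benchmark-run 1
               ;; `parse-partial-sexp' moves point, so protect it.
               (save-excursion
                 (parse-partial-sexp (point-min) (point-max)))))
        (cached (benchmark-run 1
                  ;; `syntax-ppss' also moves point when given a position.
                  (save-excursion
                    (syntax-ppss (point-max))))))
    ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-ELAPSED).
    (list :parse-partial-sexp (car raw)
          :syntax-ppss (car cached))))
```

Calling `(my-compare-ppss-timings)` in a large source buffer gives two numbers to compare; repeating the `syntax-ppss` call without flushing would show the warm-cache cost instead.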

> What I think really needs doing is to make this function bulletproof: It
> should work on narrowed buffers,

That can be done, tho it needs extra info in order to know how to
interpret the fact that it's narrowed.

> it should give reliable elements 2 and 6,

If you really care about them, then I recommend you fix it in
parse-partial-sexp.

> its cache should be cleared when functions like `modify-syntax-entry'
> are called or parse-sexp-lookup-properties is changed, and the cache
> should be bound to nil on `with-syntax-table'. I actually think it
> could be useful to maintain several parallel caches, each for a
> different syntax-table (or an equivalence class of syntax tables). And
> so on. Basically, I would like `(syntax-ppss)' to tell me with 100%
> reliability, no ifs, no buts, whether I am at top-level, in a comment,
> or in a string.

I think this will result in too many cache flushes and will make the
code too intrusive or too ad-hoc. I'd rather have
a syntax-ppss-syntax-table (and force parse-sexp-lookup-properties to t)
if you want more reliable results.

> Also, Lennart is asking for it to work nicely with multiple major modes.
> Surely this would be a Good Thing. Files containing several major modes
> are commonplace (awk or sed embedded within a shell script, html
> embedded within php, ....).

Yes, that's a desirable extension.

> At the moment, CC Mode applies a heuristic maximum size of strings and
> comments, for performance reasons. Checking for strings and comments is
> done so frequently that the mode uses elaborate internal caches. It
> would be nice if this cacheing could move to the Emacs core.

You can do it today. Have you even tried to use syntax-ppss before
asking for it to be improved?

> Again, this isn't something which can be implemented in a weekend, but I
> think it would be worthwhile for Emacs 24.

Other than the multi-major-mode part, it all sounds like very
minor changes.


        Stefan
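
For reference, the "top-level, in a comment, or in a string" question raised in the thread can already be asked of `syntax-ppss` as it stands. The sketch below is illustrative only: `my-syntactic-context` is a hypothetical name, and the answer is only as reliable as the buffer's current syntax table and the caveats discussed above (narrowing, changed syntax entries). It reads elements 0, 3 and 4 of the returned parser state, which hold the paren depth, the in-string flag and the in-comment flag.

```elisp
;; Hypothetical helper (not part of Emacs or of the original message).
(defun my-syntactic-context (&optional pos)
  "Classify POS (default: point) using `syntax-ppss'.
Return `string', `comment', `top-level', or `nested'."
  (save-excursion                       ; `syntax-ppss' moves point to POS
    (let ((state (syntax-ppss pos)))
      (cond ((nth 3 state) 'string)     ; non-nil inside a string
            ((nth 4 state) 'comment)    ; non-nil inside a comment
            ((zerop (nth 0 state)) 'top-level) ; paren depth 0
            (t 'nested)))))
```

A mode could call `(my-syntactic-context)` from its indentation or fontification code instead of keeping its own string and comment cache, subject to the reliability limits noted in the discussion.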