From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Alan Mackenzie Newsgroups: gmane.emacs.devel Subject: Re: Progressively slow pattern match Date: Wed, 17 May 2006 19:53:57 +0000 (GMT) Message-ID: References: <87ejysbg5d.fsf@neutrino.caeruleus.net> NNTP-Posting-Host: main.gmane.org Mime-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Trace: sea.gmane.org 1147896224 4633 80.91.229.2 (17 May 2006 20:03:44 GMT) X-Complaints-To: usenet@sea.gmane.org NNTP-Posting-Date: Wed, 17 May 2006 20:03:44 +0000 (UTC) Cc: emacs-devel@gnu.org Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Wed May 17 22:03:40 2006 Return-path: Envelope-to: ged-emacs-devel@m.gmane.org Original-Received: from lists.gnu.org ([199.232.76.165]) by ciao.gmane.org with esmtp (Exim 4.43) id 1FgSEu-0003K3-DB for ged-emacs-devel@m.gmane.org; Wed, 17 May 2006 22:03:24 +0200 Original-Received: from localhost ([127.0.0.1] helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1FgSEt-0003qZ-OG for ged-emacs-devel@m.gmane.org; Wed, 17 May 2006 16:03:23 -0400 Original-Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1FgSEh-0003p1-Gz for emacs-devel@gnu.org; Wed, 17 May 2006 16:03:11 -0400 Original-Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43) id 1FgSEf-0003n3-Ni for emacs-devel@gnu.org; Wed, 17 May 2006 16:03:11 -0400 Original-Received: from [199.232.76.173] (helo=monty-python.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1FgSEf-0003my-Kx for emacs-devel@gnu.org; Wed, 17 May 2006 16:03:09 -0400 Original-Received: from [193.149.49.134] (helo=acm.acm) by monty-python.gnu.org with esmtp (Exim 4.52) id 1FgSHa-0007lU-CX for emacs-devel@gnu.org; Wed, 17 May 2006 16:06:11 -0400 Original-Received: from localhost (root@localhost) by acm.acm (8.8.8/8.8.8) with SMTP id TAA00389; Wed, 17 May 2006 19:53:59 GMT X-Sender: root@acm.acm Original-To: Ralf Angeli In-Reply-To: <87ejysbg5d.fsf@neutrino.caeruleus.net> X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Xref: news.gmane.org gmane.emacs.devel:54647 Archived-At: 'n Abend, Ralf! On Wed, 17 May 2006, Ralf Angeli wrote: >In AUCTeX there is a regexp used with `looking-at' where pattern >matching seems to progressively get slower the longer a part of the >(possible) match gets. I reduced the regexp to a bare minimum for >testing and the code now looks something like this: >(looking-at "\\(%+\\)*foo") >The problem occurs if this is used against a line with only % >characters in it. The more of these characters there are the slower >it gets. I checked the time one call of `looking-at' takes with >(abs (- (float-time) (progn (looking-at "\\(%+\\)*foo") (float-time)))) >and got the following results (in seconds): >%%%%%%%%%% 0.0006 >%%%%%%%%%%%%%%% 0.0154 >%%%%%%%%%%%%%%%%%%%% 0.5132 >%%%%%%%%%%%%%%%%%%%%%%%%% 7.8058 >The regexp is used with `looking-at' for checking if there are LaTeX >macros which have to be treated specially during paragraph movement. >As paragraph movement is used quite extensively when a region is to be >filled, users might get the notion that they are experiencing a hang >if they have such line for visually separating parts in the file. >Is this a deficiency in Emacs? Is there a way matching can be sped up >with this or maybe another, equivalent regexp? It's a bad regexp. It's got a sort of ambiguity, in that you've got both a "+" and a "*" on the "%" you want to match. The number of ways %+* can divide a row of n %s is the number of ordered partions of n[*]. The Emacs regexp engine tries out ALL of these, I think. Each time it comes to the non-% at the end, it goes back to try a different subdivision, to see if that gives a longer match. Evidently, each % you add doubles the time the regexp engine takes, approximately (though I suspect that for your last timining, the jump was only 4%, not 5%). And yes, I've done this too. :-( The solution is to write either %+ or %*, but not both together. [*] As in, 4 = 3 + 1 = 2 + 1 + 1 = 1 + 2 + 1 = 1 + 3 = 2 + 2 = 1 + 1 + 2 = 1 + 1 + 1 + 1 >Ralf -- Alan.