From: Alexis
Newsgroups: gmane.emacs.devel
Subject: Re: Documentation on debugging regexp performance
Date: Thu, 21 Jan 2016 20:39:42 +1100
Message-ID: <87y4bj9spt.fsf@gmail.com>
References: <56A06CD6.2090707@gmail.com>
To: emacs-devel@gnu.org

Yuri Khan writes:

>> I'm running into a surprising regular expressions issue. I have
>> attached a file (~50k) in which
>> (re-search-forward " +[^:=]+ +:=?") seems to be extremely slow.
>> (I killed it after 30 seconds). Truncating the file to its first
>> 20 lines reduces the time for re-search-forward to about a
>> second, which is still extremely slow.
>
> I’m no expert on the Emacs regexp implementation, but this part
> is ambiguous: "[^:=]+ +". The engine will have to backtrack at
> least once because the first part will greedily slurp all
> spaces, then the second part will not match. You might want to
> add the space to the exclusion character class: "[^:= ]+ +".

More generally, I highly recommend Jeffrey Friedl's book "Mastering
Regular Expressions". It's not Emacs-specific, but it provides
in-depth explanations of why certain regexen are time- and/or
space-hungry.


Alexis.
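
For anyone who wants to measure the difference between the two regexps quoted above, here is a minimal sketch using `benchmark-run`, which ships with Emacs. The helper name `my-regexp-search-time` and the sample text are assumptions made up for illustration; the sample is not the file attached to the original report, and it contains no ":" or "=" so that both searches fail and the matcher has to try every starting position.

```elisp
(require 'benchmark)

(defun my-regexp-search-time (regexp text)
  "Return the seconds one `re-search-forward' for REGEXP takes on TEXT.
Hypothetical helper for illustration; it is not part of Emacs."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-TIME); keep ELAPSED.
    ;; The third argument t makes a failed search return nil, not signal.
    (car (benchmark-run 1 (re-search-forward regexp nil t)))))

;; Made-up sample: 100 short lines with no ":" or "=" anywhere.
(let ((text (apply #'concat (make-list 100 "  foo bar baz quux\n"))))
  ;; Regexp from the original report: [^:=]+ can also consume spaces.
  (message "original:  %.4fs"
           (my-regexp-search-time " +[^:=]+ +:=?" text))
  ;; Yuri's suggestion: the class excludes the space as well.
  (message "suggested: %.4fs"
           (my-regexp-search-time " +[^:= ]+ +:=?" text)))
```

Evaluating this in `*scratch*` prints both timings to `*Messages*`. The absolute numbers will depend on the Emacs version and the machine, and note that the two regexps are not equivalent: the suggested one only allows a single space-free token before the " +:" part.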