From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: regexp does not work as documented
Date: Sun, 11 May 2008 21:43:47 -0400
To: Thomas Lord
Cc: Chong Yidong, 192@emacsbugs.donarmstrong.com, emacs-devel@gnu.org,
 martin rudalics, David Koppelman, Bruno Haible
In-Reply-To: <482750F4.2050102@emf.net> (Thomas Lord's message of
 "Sun, 11 May 2008 13:03:00 -0700")

> Well, instead of using heuristics to decide where to re-scan from and
> to, you can cache a record of where the DFA scan arrived at for
> periodic positions in the buffer.  Then begin scanning from just
> before any modification for as far as it takes to arrive at a DFA
> state that is the same as last time, updating any highlighting in the
> region between those two points.

That's a very good point.  I'm not sure it's worth the trouble to store
the DFA state at various buffer positions and check whether it's EQ to
the cached one to stop the rescan, but at least we could match
multiline expressions one line at a time.
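
A minimal sketch of the checkpointing idea quoted above, written in
Python rather than Emacs Lisp for brevity.  Everything here is an
assumption made for illustration: the toy four-state DFA (which only
tracks whether we are inside a /* ... */ comment), the choice to cache
one state per line start, and the names step, run_line, scan_all and
rescan.  It is not font-lock code, and it only handles an edit that
changes one line in place.

```python
# Toy DFA states (hypothetical): are we inside a /* ... */ comment?
NORMAL, SLASH, COMMENT, STAR = range(4)

def step(state, ch):
    """Advance the toy comment-tracking DFA by one character."""
    if state == NORMAL:
        return SLASH if ch == "/" else NORMAL
    if state == SLASH:
        if ch == "*":
            return COMMENT
        return SLASH if ch == "/" else NORMAL
    if state == COMMENT:
        return STAR if ch == "*" else COMMENT
    # state == STAR: a '*' was just seen inside a comment
    if ch == "/":
        return NORMAL
    return STAR if ch == "*" else COMMENT

def run_line(state, line):
    """Run the DFA over one line, including its trailing newline."""
    for ch in line + "\n":
        state = step(state, ch)
    return state

def scan_all(lines):
    """Full scan: states[i] is the DFA state at the start of line i."""
    states = [NORMAL]
    for line in lines:
        states.append(run_line(states[-1], line))
    return states

def rescan(lines, states, changed):
    """Re-scan after line `changed` was edited in place.

    Starts from the cached state just before the edit and stops as soon
    as the state at a line boundary equals the cached one.  Updates
    `states` and returns the half-open range of lines that were
    re-scanned (and so would need re-highlighting).
    """
    i = changed
    while i < len(lines):
        new_end = run_line(states[i], lines[i])
        if new_end == states[i + 1]:
            return (changed, i + 1)
        states[i + 1] = new_end
        i += 1
    return (changed, len(lines))

lines = ["int a; /* start", "still comment", "end */ int b;", "int c;"]
states = scan_all(lines)

lines[1] = "still a comment"      # edit inside the comment
inside = rescan(lines, states, 1)

lines[0] = "int a; start"         # edit removes the comment opener
opener = rescan(lines, states, 0)
```

In this sketch an edit inside the comment re-scans only the edited
line, while removing the comment opener forces a re-scan up to the line
where the new state agrees with the cached one again.
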
In any case, it's indeed a non-trivial amount of work because it
probably requires rewriting not just font-lock but all the
foo-mode-font-lock-keywords as well (font-lock-keywords are order
dependent, so you can't apply rule number 3 after rule number 4).

> I don't mean to imply that this is a trivial thing to implement in
> Emacs but if you start getting up to building DFAs (very expensive in
> the worst case) and taking intersections (very expensive in the worst
> case) -- both also not all that simple to implement (nor obviously
> possible for Emacs' extended regexp language) -- then the effort may
> be comparable and (re-)visiting the option to adapt Rx to Emacs should
> be worth considering.

I have most of the DFA construction code written, but I may take you up
on that anyway.  BTW, regarding the "very expensive in the worst case",
how common is this worst case in real life?


        Stefan
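
For reference, the "very expensive in the worst case" remark can be
illustrated with the standard textbook example: the language of strings
over {a, b} whose n-th symbol from the end is "a".  The sketch below (in
Python; the names nfa_step and count_dfa_states are made up for this
illustration) runs the subset construction on an NFA with n + 1 states
and counts the reachable DFA states.  It only shows that the blow-up
exists; it says nothing about how often such patterns turn up in real
font-lock keywords.

```python
def nfa_step(states, ch, n):
    """One step of an NFA for "the n-th symbol from the end is 'a'".

    State 0 loops on every symbol and guesses, on 'a', that this is the
    n-th symbol from the end (moving to state 1).  States 1..n-1 advance
    on any symbol.  State n is accepting and has no outgoing edges.
    """
    nxt = set()
    for s in states:
        if s == 0:
            nxt.add(0)
            if ch == "a":
                nxt.add(1)
        elif s < n:
            nxt.add(s + 1)
    return frozenset(nxt)

def count_dfa_states(n):
    """Count DFA states reachable via the subset construction."""
    start = frozenset({0})
    seen = {start}
    todo = [start]
    while todo:
        cur = todo.pop()
        for ch in "ab":
            nxt = nfa_step(cur, ch, n)
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return len(seen)

for n in (1, 4, 10):
    print(n, count_dfa_states(n))
```

The count doubles with each increment of n (2^n DFA states from n + 1
NFA states), because the DFA has to remember which of the last n
symbols were "a".
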