From: Stefan Monnier
Newsgroups: gmane.emacs.bugs
Subject: bug#192: regexp does not work as documented
Date: Sun, 11 May 2008 21:43:47 -0400
To: Thomas Lord
Cc: Chong Yidong, 192@emacsbugs.donarmstrong.com, emacs-devel@gnu.org, David Koppelman, Bruno Haible
In-Reply-To: <482750F4.2050102@emf.net> (Thomas Lord's message of "Sun, 11 May 2008 13:03:00 -0700")

> Well, instead of using heuristics to decide where to re-scan from and
> to, you can cache a record of where the DFA scan arrived at for
> periodic positions in the buffer.  Then begin scanning from just
> before any modification for as far as it takes to arrive at a DFA
> state that is the same as last time, updating any highlighting in the
> region between those two points.

That's a very good point.  I'm not sure it's worth the trouble to store
the DFA state at various buffer positions and check whether it's EQ in
order to stop the rescan, but at least we could match multiline
expressions one line at a time.  In any case, it's indeed a non-trivial
amount of work because it probably requires rewriting not just
font-lock but all the foo-mode-font-lock-keywords as well
(font-lock-keywords are order-dependent, so you can't apply rule no. 3
after rule no. 4).

> I don't mean to imply that this is a trivial thing to implement in
> Emacs but if you start getting up to building DFAs (very expensive in
> the worst case) and taking intersections (very expensive in the worst
> case) -- both also not all that simple to implement (nor obviously
> possible for Emacs' extended regexp language) -- then the effort may
> be comparable and (re-)visiting the option to adapt Rx to Emacs should
> be worth considering.

I have most of the DFA construction code written, but I may take you up
on that anyway.
BTW, regarding the "very expensive in the worst case", how common is
this worst case in real life?


        Stefan
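
The rescan-until-the-state-matches idea discussed above can be illustrated with a minimal sketch.  It is written in Python rather than Emacs Lisp, and it assumes a toy hand-written DFA that only tracks C-style /* ... */ comments, with the state cached at the start of every line (the "one line at a time" variant).  The names StateCache, replace_line, scan_line and step are hypothetical; none of this is font-lock's actual API, and inserting or deleting whole lines is left out for brevity.

```python
# Toy DFA states for tracking C-style /* ... */ comments (an assumption
# made for this sketch; a real highlighter would have many more states).
CODE, SLASH, COMMENT, STAR = range(4)


def step(state, ch):
    """Advance the toy DFA by one character."""
    if state == CODE:
        return SLASH if ch == "/" else CODE
    if state == SLASH:
        if ch == "*":
            return COMMENT
        return SLASH if ch == "/" else CODE
    if state == COMMENT:
        return STAR if ch == "*" else COMMENT
    # state == STAR: we have just seen a '*' inside a comment.
    if ch == "/":
        return CODE
    return STAR if ch == "*" else COMMENT


def scan_line(state, line):
    """Run the DFA over one line plus its newline; return the end state."""
    for ch in line + "\n":
        state = step(state, ch)
    return state


class StateCache:
    """Hypothetical cache of the DFA state at the start of each line."""

    def __init__(self, lines):
        self.lines = list(lines)
        # entry[i] is the state before line i; entry[-1] is the final state.
        self.entry = [CODE]
        for line in self.lines:
            self.entry.append(scan_line(self.entry[-1], line))

    def replace_line(self, index, text):
        """Replace one line and rescan only as far as necessary.

        Returns (start, end): lines start..end-1 were rescanned and would
        need their highlighting refreshed.
        """
        self.lines[index] = text
        i = index
        while i < len(self.lines):
            new_state = scan_line(self.entry[i], self.lines[i])
            i += 1
            if new_state == self.entry[i]:
                break  # same state as last time: the rest is unchanged
            self.entry[i] = new_state
        return (index, i)


cache = StateCache(["int x;", "int y; /* note */", "int z;", "return 0;"])
# Opening a comment on line 0 changes the state entering line 1, but the
# "*/" on line 1 brings the scan back to the cached state for line 2.
rescanned = cache.replace_line(0, "int x; /* open")
print(rescanned)
```

In this sketch an edit that leaves the comment state untouched rescans only the edited line, while an edit that opens an unterminated comment forces a rescan up to the next point where the new scan agrees with the cached state.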