From: Stefan Monnier
Newsgroups: gmane.emacs.help
Subject: Re: will we ever have zero width assertions in regexps?
Date: Mon, 07 Feb 2011 15:30:25 -0500
To: help-gnu-emacs@gnu.org

>>> So you have a REx which is matched against a line, but you want (in
>>> addition to the usual effects of matching) to know whether it "wanted"
>>> the match to overflow into the following line?
>>
>>> If so, it looks like "reusing the continuation state" would not be a
>>> serious optimization - it would add just a small multiplicative
>>> constant to the "use only the hypothetical bit" scenario...
>>
>> We could probably make it work with just that extra bit, indeed.
>> But with the full intermediate state, we get to just "start the search
>> with last line's state" instead of having to "start the search from the
>> previous N lines since they all ended with the bit set", so
>> it will happily work with many-lines cases without having to reparse
>> those many lines N times.

> Hmm, I thought about a different scenario: if the bit is set, then one
> switches to a DIFFERENT REx designed for a multi-line case.  Otherwise
> why not just run it against the rest of the buffer, instead of
> one-line?

Because we don't want to match those regexps against the whole buffer
every time the buffer is modified (the buffer may be large).

Also it can be tricky to match only some of the font-lock regexps
against the whole buffer, since font-lock-keywords is normally defined
with the assumption that the regexps are applied in turn, that earlier
ones prevent subsequent ones from being applied and that we start with
a fresh un-highlighted buffer.  So in general, if we want to apply one
of the font-lock-keywords to the whole buffer, we have to do it for all
of them.

BTW, another reason to want a non-backtracking matcher can be seen in
the recent thread "Stack overflow in regexp matcher".


        Stefan
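
To make the "start the search with last line's state" idea concrete, here
is a minimal sketch in Python rather than Emacs Lisp.  It is not how
font-lock or the Emacs regexp engine works: the hand-written DFA for
C-style block comments, the state names and the helper functions
(step, scan_line, scan_all, rescan_from) are all assumptions made up for
illustration.  It also assumes that a single line was edited in place and
that the number of lines did not change.

```python
# Hypothetical DFA for C-style block comments.  States and names are
# illustrative only; they are not Emacs internals.
CODE, SLASH, COMMENT, STAR = range(4)


def step(state, ch):
    """Advance the DFA by one character."""
    if state == CODE:
        return SLASH if ch == "/" else CODE
    if state == SLASH:  # saw '/' outside a comment
        if ch == "*":
            return COMMENT
        return SLASH if ch == "/" else CODE
    if state == COMMENT:
        return STAR if ch == "*" else COMMENT
    # state == STAR: saw '*' inside a comment
    if ch == "/":
        return CODE
    return STAR if ch == "*" else COMMENT


def scan_line(state, line):
    """Run the DFA over one line plus its newline; return the end state."""
    for ch in line + "\n":
        state = step(state, ch)
    return state


def scan_all(lines):
    """Full scan: return the matcher state at the end of every line."""
    states, state = [], CODE
    for line in lines:
        state = scan_line(state, line)
        states.append(state)
    return states


def rescan_from(lines, states, changed):
    """Update `states` in place after lines[changed] was edited.

    Resumes from the state saved at the end of the previous line and
    stops as soon as a line ends in the same state as it did before.
    Returns the number of lines that had to be rescanned.
    """
    state = states[changed - 1] if changed > 0 else CODE
    scanned = 0
    for i in range(changed, len(lines)):
        state = scan_line(state, lines[i])
        scanned += 1
        if states[i] == state:
            break  # later lines cannot be affected
        states[i] = state
    return scanned
```

With one saved state per line, an edit in the middle of a long comment
only costs a rescan of the edited line, however many lines earlier the
comment started.  The matcher is also a plain loop with no recursion, so
its stack use does not grow with the length of the input.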