From: Stefan Monnier
Newsgroups: gmane.emacs.help
Subject: Re: will we ever have zero width assertions in regexps?
Date: Mon, 31 Jan 2011 11:08:29 -0500
To: help-gnu-emacs@gnu.org
List-Id: Users list for the GNU Emacs text editor

>> A typical case could look something like "foo *(.*?) *bar". when
>> matching "foo .... baZ".

> No, this is a polynomial-time problem.  My optimization does nothing
> for such cases.  And I do not think such a REx would provide any
> problem in real life - unless you have many hundreds of consecutive
> spaces.

Such problems tend to show up (in hard to fix ways that is) with regexps
that are built in pieces (e.g. by combining existing regexps like
comment-start-skip and paragraph-start or things like that).

And yes, these tend to work just fine in practice, which is why they end
up in real code, and then a couple years later someone complains that
Emacs freezes when he opens his funny file with some odd long line.

> (And unless Emacs' REx engine is particularly slow per OPCODE.)

Emacs's REx engine isn't particularly fast, I think, but I don't think
it's the problem.

> But I start to see the difference - it is in usage scenarios.

Probably.

> Many Perl REx matches are done "per-line", not "per-file".

That's one difference.  Another is that many regexps are used all the
time without the user explicitly asking for it, and on text which we
assume takes a particular shape, even though it may take a completely
different form (e.g. regexps used for the *compile* buffer).

> match-with-continuation.  An interesting idea.  I already implemented
> it for Perl (to support (??{}), but it is not exposed to the user.
> Would one want this in non-interactive situations?

I can't think of interactive uses, but I'd like to try and use it to let
font-lock find elements that span several lines, even when it works
one-line-at-a-time.


        Stefan
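
To make the "polynomial-time" remark about "foo *(.*?) *bar" concrete, here is a toy model in Python of how a naive backtracking matcher would explore that regexp on a subject like "foo", a run of spaces, then "baZ". It is only a sketch under stated assumptions: the function name count_attempts is made up for illustration, the match is anchored at the start of the text, the text is assumed to contain no newlines, and nothing here reflects the actual Emacs or Perl engines, which apply optimizations of their own.

```python
def count_attempts(text):
    """Toy backtracking search for the regexp  foo *(.*?) *bar  anchored at
    the start of TEXT (assumed to contain no newlines).

    Returns (attempts, matched), where ATTEMPTS counts how many times the
    final literal "bar" is tried.  Hypothetical helper, for illustration."""
    if not text.startswith("foo"):
        return 0, False
    start = len("foo")

    def space_run(pos):
        # Length of the run of spaces beginning at POS.
        end = pos
        while end < len(text) and text[end] == " ":
            end += 1
        return end - pos

    attempts = 0
    # First " *" is greedy: take the longest run first, then give spaces back.
    for k in range(space_run(start), -1, -1):
        group_start = start + k
        # "(.*?)" is lazy: try the shortest group first, then grow it.
        for m in range(len(text) - group_start + 1):
            group_end = group_start + m
            # Second " *" is greedy again.
            for j in range(space_run(group_end), -1, -1):
                attempts += 1
                if text.startswith("bar", group_end + j):
                    return attempts, True
    return attempts, False


if __name__ == "__main__":
    # Failing subjects: "foo", N spaces, then "baZ" instead of "bar".
    for n in (10, 100, 200):
        attempts, matched = count_attempts("foo" + " " * n + "baZ")
        print(n, attempts, matched)
```

In this model the number of attempts on a failing subject grows roughly with the cube of the number of consecutive spaces: polynomial, so harmless for a handful of spaces, but noticeable once an odd long line contains hundreds of them.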