From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: Patch for lookaround assertion in regexp
Date: Tue, 24 Jan 2012 12:34:58 -0500
To: Nikolai Weibull
Cc: emacs-devel@gnu.org, Colin Fraizer, t.matsuyama.pub@gmail.com

>>> As an alternative to PCRE, which, as has already been pointed out,
>>> doesn’t match any of these requirements, how about RE2?
>>> http://code.google.com/p/re2/
>>> It’s written in C++, which is a minus, but it should be simple enough
>>> to extend it with \c and \s.
>> That might work, indeed (tho someone still has to write the
>> corresponding code).
>> Note that it does not support lookaround assertions.
> True, but you can, as far as I know, not do so without (allowing for)
> exponential behavior.

Actually, no.  Contrary to backreferences (which are outside of the
mathematical notion of regular expressions, and can't be matched in
linear time), lookahead assertions are "normal".  So RE2 may get support
for lookahead assertions in the future (maybe for lookbehind as well,
tho that's more difficult).

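To make the linear-time claim concrete, here is a minimal sketch in
Python.  It is not how RE2 or Emacs does (or would do) it; the pattern
[0-9](?=a+b) and all function names are made up for illustration.  The
idea: one right-to-left pass with a small hand-written automaton records
every position where the lookahead a+b would succeed, and a second
left-to-right pass then only consults that table.

```python
import re


def lookahead_table(text):
    """Return ok, where ok[i] is True iff text[i:] starts with a match of a+b.

    The text is scanned right to left with a 3-state automaton for the
    reversed lookahead (a 'b' followed by one or more 'a'):
      state 0: nothing useful seen
      state 1: just consumed 'b'
      state 2: consumed 'b' and then at least one 'a' (lookahead holds here)
    """
    ok = [False] * (len(text) + 1)  # ok[len(text)] stays False
    state = 0
    for i in range(len(text) - 1, -1, -1):
        ch = text[i]
        if ch == "b":
            state = 1
        elif ch == "a" and state in (1, 2):
            state = 2
        else:
            state = 0
        ok[i] = state == 2
    return ok


def digits_before_a_plus_b(text):
    """Start offsets of matches of the hypothetical pattern [0-9](?=a+b)."""
    ok = lookahead_table(text)
    return [i for i, ch in enumerate(text) if ch in "0123456789" and ok[i + 1]]


def reference(text):
    """The same query answered by Python's backtracking engine, for comparison."""
    return [m.start() for m in re.finditer(r"[0-9](?=a+b)", text)]


if __name__ == "__main__":
    sample = "1ab 2b 3aab 4a"
    print(digits_before_a_plus_b(sample))
    print(reference(sample))
```

Each pass touches every character once, so the work grows linearly with
the text no matter how far ahead the assertion has to look.  A real
engine would have to derive the reverse automaton from an arbitrary
lookahead pattern rather than hard-code it as done here.
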
> I don’t want to detract from the merits of lookaround assertions (or
> start a discussion on the subject), but I’ve always found them to be a
> sign of improper use of (no longer) regular expressions.

I just pointed it out as supporting my argument that I'd rather not add
lookaround assertions, since doing so may make it more difficult to
change to a linear-time matcher later on.


        Stefan