From: Philipp Stephani
Newsgroups: gmane.emacs.devel
Subject: Re: Fixing ill-conditioned regular expressions. Proof of concept.
Date: Tue, 24 Feb 2015 06:20:05 +0000
References: <20150223181205.GA2861@acm.fritz.box> <54EB85AC.1030800@cs.ucla.edu>
To: Paul Eggert, Alan Mackenzie, emacs-devel@gnu.org

Paul Eggert <eggert@cs.ucla.edu> wrote on Mon Feb 23 2015 at 20:55:54:

> Would it be possible to fix the regular expression engine, so that
> programs don't have to worry about parsing and reformulating regexps so
> that they're "nice"?

See http://swtch.com/~rsc/regexp/regexp1.html for a nice introduction to RE engines. It might be worthwhile to investigate the performance characteristics of using such a Thompson NFA in Emacs for regexes without back references.
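
To make the Thompson NFA idea a bit more concrete, here is a minimal sketch in Python (not Emacs code, and not taken from any existing engine). It simulates a hand-built NFA for the pattern (a?){n}a{n}, the kind of pattern on which backtracking matchers can take exponential time. The function names build_nfa, eps_closure and nfa_match are made up for this illustration, and the NFA is constructed by hand rather than compiled from a regexp string. The point is only that the matcher tracks a set of live states per input character, so the work is bounded by (number of states) x (length of text).

```python
def build_nfa(n):
    """Hand-built NFA for the pattern (a?){n}a{n} (illustrative only).

    States are 0..2n; state 2n accepts. The first n states have both an
    'a' edge and an epsilon edge to the next state (the optional a?);
    the last n states have only an 'a' edge.
    """
    char_edges = {i: i + 1 for i in range(2 * n)}  # consume one 'a'
    eps_edges = {i: i + 1 for i in range(n)}       # skip an optional 'a'
    return char_edges, eps_edges, 2 * n


def eps_closure(states, eps_edges):
    """Add every state reachable through epsilon edges alone."""
    stack = list(states)
    closed = set(states)
    while stack:
        s = stack.pop()
        t = eps_edges.get(s)
        if t is not None and t not in closed:
            closed.add(t)
            stack.append(t)
    return closed


def nfa_match(n, text):
    """Return True if the whole of text matches (a?){n}a{n}."""
    char_edges, eps_edges, accept = build_nfa(n)
    current = eps_closure({0}, eps_edges)
    for ch in text:
        if ch != "a":
            return False  # the pattern only ever consumes 'a'
        # Advance every live state at once instead of backtracking.
        moved = {char_edges[s] for s in current if s in char_edges}
        current = eps_closure(moved, eps_edges)
        if not current:
            return False  # no live states left, so no match is possible
    return accept in current
```

For example, nfa_match(30, "a" * 30) does at most a few thousand set operations, whereas a backtracking matcher may explore on the order of 2^30 paths for the same input. Back references cannot be handled this way, which is why the suggestion above is limited to regexes without them.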