From mboxrd@z Thu Jan 1 00:00:00 1970
From: Alan Mackenzie
Newsgroups: gmane.emacs.devel
Subject: Re: Fixing ill-conditioned regular expressions. Proof of concept.
Date: Mon, 23 Feb 2015 22:42:45 +0000
Message-ID: <20150223224245.GC2861@acm.fritz.box>
References: <20150223181205.GA2861@acm.fritz.box>
 <54EB85AC.1030800@cs.ucla.edu>
 <20150223202114.GB2861@acm.fritz.box>
 <54EBA757.5030901@cs.ucla.edu>
In-Reply-To: <54EBA757.5030901@cs.ucla.edu>
To: Paul Eggert
Cc: emacs-devel@gnu.org

Hi, Paul.

On Mon, Feb 23, 2015 at 02:19:03PM -0800, Paul Eggert wrote:
> On 02/23/2015 12:21 PM, Alan Mackenzie wrote:
> > basically, I've got little idea about regexp engines.

> That's OK, if you prefer a source-to-source transformation then you can
> use that instead, but the point is that this should be done for all uses
> of the regexp code, not just for some of them.

Brilliant idea! Why not call fix-re from within
re-search-forward/backward, looking-at, ... With its cache, the extra
runtime will be negligible. After a bit of tidying up, debugging,
handling of \{..\}, proper testing, ....

> The Emacs regexp code isn't Perl-inspired, as far as I know. It's an
> old copy of the glibc code, with a lot of hacks. The glibc version
> mutated quite a bit when it added i18n support, and Emacs's version has
> mutated in different ways.

Ah, right.

-- 
Alan Mackenzie (Nuremberg, Germany).