From: tomas@tuxteam.de
To: Wojciech Meyer <wojciech.meyer@googlemail.com>
Cc: emacs-devel@gnu.org, Miles Bader <miles@gnu.org>
Subject: Re: Structural regular expressions
Date: Sat, 11 Sep 2010 10:33:18 +0200
Message-ID: <20100911083318.GA10266@tomas>
In-Reply-To: <87pqwkyann.fsf@gmail.com>
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/129956>

On Sat, Sep 11, 2010 at 08:58:04AM +0100, Wojciech Meyer wrote:
> Miles Bader <miles@gnu.org> writes:
>
> > "Drew Adams" <drew.adams@Oracle.Com> writes:
> >> That's the real point, I believe: the paper touts the use of regexps
> >> to divide text into chunks that match - chunks that are not
> >> necessarily lines, in order to then act on those chunks in some way.
> >
> > Not a good base, I think -- regexps are not really powerful enough to do
> > the job well.
>
> Yes regexp are quite limited.
> Maybe a simple PEG parser based on packrat, with a syntax sugar for
> defining one line set of rules?

While PEG is interesting in itself (and I think Emacs should have
something like that, just to test its strengths/weaknesses wrt regex),
I think Drew is right: A way, *any* way to define a "buffer subset",
maybe partitioned into "chunks" is useful here.

So at this level, I'd think concentrating on interface design (user &
programmer) makes most sense, abstracting from possible implementations
(regex, peg, font-lock, hand-built parser). The (possible)
implementations should (I think) just guide the design of the interfaces
(as examples).

In the ideal case, it should be possible to use whatever implementation
is most helpful (or combine them: union, intersection, symmetric
difference).

Just dreaming?

Regards
-- 
tomás
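
As a rough illustration of the last idea (combining buffer subsets that
come from different implementations), here is a minimal Python sketch.
It assumes a buffer subset can be modelled as a list of half-open
character ranges. The names Chunk, Chunker, regex_chunker and the three
combinators are hypothetical and do not correspond to any existing Emacs
API. Combining merges adjacent ranges, so a result is a plain set of
positions rather than a partition into chunks.

```python
import re
from typing import Callable, List, Tuple

Chunk = Tuple[int, int]                  # half-open character range [start, end)
Chunker = Callable[[str], List[Chunk]]   # hypothetical interface: text -> chunks


def regex_chunker(pattern: str) -> Chunker:
    """One possible implementation: every non-empty regex match is a chunk."""
    compiled = re.compile(pattern)

    def chunk(text: str) -> List[Chunk]:
        return [m.span() for m in compiled.finditer(text) if m.end() > m.start()]

    return chunk


def _combine(a: List[Chunk], b: List[Chunk],
             keep: Callable[[bool, bool], bool]) -> List[Chunk]:
    """Walk all range boundaries; keep segments where keep(in_a, in_b) holds."""
    points = sorted({p for s, e in a + b for p in (s, e)})
    out: List[Chunk] = []
    for lo, hi in zip(points, points[1:]):
        in_a = any(s <= lo < e for s, e in a)
        in_b = any(s <= lo < e for s, e in b)
        if keep(in_a, in_b):
            if out and out[-1][1] == lo:
                out[-1] = (out[-1][0], hi)   # merge with the adjacent segment
            else:
                out.append((lo, hi))
    return out


def union(a: List[Chunk], b: List[Chunk]) -> List[Chunk]:
    return _combine(a, b, lambda x, y: x or y)


def intersection(a: List[Chunk], b: List[Chunk]) -> List[Chunk]:
    return _combine(a, b, lambda x, y: x and y)


def symmetric_difference(a: List[Chunk], b: List[Chunk]) -> List[Chunk]:
    return _combine(a, b, lambda x, y: x != y)


text = "foo bar baz"
words = regex_chunker(r"\w+")(text)      # [(0, 3), (4, 7), (8, 11)]
b_words = regex_chunker(r"b\w*")(text)   # [(4, 7), (8, 11)]
only_non_b = symmetric_difference(words, b_words)   # [(0, 3)]
```

A PEG- or font-lock-based chunker would only need to return the same
kind of range list to be combined with the regex one, which is the
sense in which the interface is kept separate from the implementation.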