From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: tomas@tuxteam.de
Newsgroups: gmane.emacs.devel
Subject: Re: Structural regular expressions
Date: Sat, 11 Sep 2010 10:33:18 +0200
Message-ID: <20100911083318.GA10266@tomas>
References: <pvhphbi0wq0d.fsf@gmx.li> <jwvlj7c9ura.fsf-monnier+emacs@gnu.org>
	<46875.130.55.118.19.1284065220.squirrel@webmail.lanl.gov>
	<AANLkTimUS7zL77TGiWoEdS+=nuww=TSABKMZuSiYPaCc@mail.gmail.com>
	<E1Ou5lY-0006Jj-MB@fencepost.gnu.org>
	<AANLkTi=dv8n40x-rTtz@mail.gmail.com>
	<loom.20100910T221237-941@post.gmane.org>
	<5C7E009338A34E35BB58F0C877A8AD9E@us.oracle.com>
	<87iq2dt3w0.fsf@catnip.gol.com> <87pqwkyann.fsf@gmail.com>
NNTP-Posting-Host: lo.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; x-action=pgp-signed
Content-Transfer-Encoding: quoted-printable
X-Trace: dough.gmane.org 1284193868 18405 80.91.229.12 (11 Sep 2010 08:31:08 GMT)
X-Complaints-To: usenet@dough.gmane.org
NNTP-Posting-Date: Sat, 11 Sep 2010 08:31:08 +0000 (UTC)
Cc: emacs-devel@gnu.org, Miles Bader <miles@gnu.org>
To: Wojciech Meyer <wojciech.meyer@googlemail.com>
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Sat Sep 11 10:31:06 2010
Return-path: <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Envelope-to: ged-emacs-devel@m.gmane.org
Original-Received: from lists.gnu.org ([199.232.76.165])
	by lo.gmane.org with esmtp (Exim 4.69)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>)
	id 1OuLUH-0007r9-Pn
	for ged-emacs-devel@m.gmane.org; Sat, 11 Sep 2010 10:31:06 +0200
Original-Received: from localhost ([127.0.0.1]:34615 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43)
	id 1OuLUG-0008Rf-SL
	for ged-emacs-devel@m.gmane.org; Sat, 11 Sep 2010 04:31:04 -0400
Original-Received: from [140.186.70.92] (port=40571 helo=eggs.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1OuLU8-0008Q9-I9
	for emacs-devel@gnu.org; Sat, 11 Sep 2010 04:30:57 -0400
Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69)
	(envelope-from <tomas@tuxteam.de>) id 1OuLU7-00063K-DR
	for emacs-devel@gnu.org; Sat, 11 Sep 2010 04:30:56 -0400
Original-Received: from alextrapp1.equinoxe.de ([217.22.192.104]:54205
	helo=www.elogos.de) by eggs.gnu.org with esmtp (Exim 4.69)
	(envelope-from <tomas@tuxteam.de>)
	id 1OuLU7-000634-88; Sat, 11 Sep 2010 04:30:55 -0400
Original-Received: by www.elogos.de (Postfix, from userid 1000)
	id C384290061; Sat, 11 Sep 2010 10:33:18 +0200 (CEST)
Content-Disposition: inline
In-Reply-To: <87pqwkyann.fsf@gmail.com>
User-Agent: Mutt/1.5.15+20070412 (2007-04-11)
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 2)
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/archive/html/emacs-devel>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=subscribe>
Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Xref: news.gmane.org gmane.emacs.devel:129956
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/129956>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sat, Sep 11, 2010 at 08:58:04AM +0100, Wojciech Meyer wrote:
> Miles Bader <miles@gnu.org> writes:
>
> > "Drew Adams" <drew.adams@Oracle.Com> writes:
> >> That's the real point, I believe: the paper touts the use of regexps
> >> to divide text into chunks that match - chunks that are not
> >> necessarily lines, in order to then act on those chunks in some way.
> >
> > Not a good base, I think -- regexps are not really powerful enough to do
> > the job well.
>
> Yes regexp are quite limited.
> Maybe a simple PEG parser based on packrat, with a syntax sugar for
> defining one line set of rules?

While PEG is interesting in itself (and I think Emacs should have
something like that, just to test its strengths/weaknesses wrt regex), I
think Drew is right: a way, *any* way, to define a "buffer subset", maybe
partitioned into "chunks", is useful here. So at this level, I'd think
concentrating on interface design (user & programmer) makes most sense,
abstracting from possible implementations (regex, peg, font-lock,
hand-built parser).

The (possible) implementations should (I think) just guide the design of
the interfaces (as examples). In the ideal case, it should be possible
to use whatever implementation is most helpful (or combine them: union,
intersection, symmetric difference).
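
To make the interface idea a bit more concrete, here is a minimal sketch
(in Python for brevity, not Emacs Lisp). Everything in it is an
assumption for illustration: the names Chunk, Selector, regex_selector
and combine are hypothetical, and a "buffer subset" is modelled simply
as a list of half-open character ranges. Any back end (regex, PEG,
font-lock, hand-built parser) would only need to provide a function
from buffer text to such a list.

```python
import re
from typing import Callable, List, Set, Tuple

# A chunk is a half-open character range [start, end) in the buffer text.
Chunk = Tuple[int, int]
# A selector is anything that maps buffer text to a list of chunks.
Selector = Callable[[str], List[Chunk]]


def regex_selector(pattern: str) -> Selector:
    """Hypothetical back end: one chunk per non-empty regex match."""
    compiled = re.compile(pattern)

    def select(text: str) -> List[Chunk]:
        return [m.span() for m in compiled.finditer(text) if m.end() > m.start()]

    return select


def _positions(chunks: List[Chunk]) -> Set[int]:
    """Expand chunks into the set of character positions they cover."""
    return {i for start, end in chunks for i in range(start, end)}


def _to_chunks(positions: Set[int]) -> List[Chunk]:
    """Collapse positions into sorted, maximal runs.

    Simplification: adjacent chunks are merged, so the original chunk
    boundaries are not preserved across a combination.
    """
    chunks: List[Chunk] = []
    for pos in sorted(positions):
        if chunks and chunks[-1][1] == pos:
            chunks[-1] = (chunks[-1][0], pos + 1)
        else:
            chunks.append((pos, pos + 1))
    return chunks


def combine(a: Selector, b: Selector,
            op: Callable[[Set[int], Set[int]], Set[int]]) -> Selector:
    """Build a new selector by applying a set operation to two selectors."""
    return lambda text: _to_chunks(op(_positions(a(text)), _positions(b(text))))


def union(a: Selector, b: Selector) -> Selector:
    return combine(a, b, set.union)


def intersection(a: Selector, b: Selector) -> Selector:
    return combine(a, b, set.intersection)


def symmetric_difference(a: Selector, b: Selector) -> Selector:
    return combine(a, b, set.symmetric_difference)


if __name__ == "__main__":
    text = "foo(bar) baz(qux)"
    words = regex_selector(r"[a-z]+")
    parens = regex_selector(r"\([^)]*\)")
    for start, end in intersection(words, parens)(text):
        print(start, end, text[start:end])
```

The point of the sketch is only that callers see selectors and chunks,
never the mechanism behind them; a real design would probably want to
keep chunk boundaries and work on buffer positions or markers instead
of expanding every character.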

Just dreaming?

Regards
- -- tomás
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFMiz7OBcgs9XrR2kYRAjZvAJ9Hzc4Dk2Z4t3wohMQJX/8544MvIQCffrxr
WKNM0E3e/fJ3UF61J4Ez7c4=
=tDCG
-----END PGP SIGNATURE-----