From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: "Drew Adams" <drew.adams@oracle.com>
Newsgroups: gmane.emacs.devel
Subject: RE: Structural regular expressions
Date: Tue, 7 Sep 2010 17:00:20 -0700
Message-ID: <32DE0BE5095D4030972861C7974A01B9@us.oracle.com>
References: <loom.20100907T212314-566@post.gmane.org>
NNTP-Posting-Host: lo.gmane.org
Mime-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Trace: dough.gmane.org 1283904169 24112 80.91.229.12 (8 Sep 2010 00:02:49 GMT)
X-Complaints-To: usenet@dough.gmane.org
NNTP-Posting-Date: Wed, 8 Sep 2010 00:02:49 +0000 (UTC)
To: "'Tom'" <levelhalom@gmail.com>, <emacs-devel@gnu.org>
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Wed Sep 08 02:02:48 2010
Return-path: <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Envelope-to: ged-emacs-devel@m.gmane.org
Original-Received: from lists.gnu.org ([199.232.76.165])
	by lo.gmane.org with esmtp (Exim 4.69)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>)
	id 1Ot86z-000669-9k
	for ged-emacs-devel@m.gmane.org; Wed, 08 Sep 2010 02:02:45 +0200
Original-Received: from localhost ([127.0.0.1]:51956 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43)
	id 1Ot86t-0005ES-UV
	for ged-emacs-devel@m.gmane.org; Tue, 07 Sep 2010 20:01:55 -0400
Original-Received: from [140.186.70.92] (port=43222 helo=eggs.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1Ot86n-0005EK-4g
	for emacs-devel@gnu.org; Tue, 07 Sep 2010 20:01:50 -0400
Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69)
	(envelope-from <drew.adams@oracle.com>) id 1Ot86l-000811-Hl
	for emacs-devel@gnu.org; Tue, 07 Sep 2010 20:01:48 -0400
Original-Received: from rcsinet10.oracle.com ([148.87.113.121]:51212)
	by eggs.gnu.org with esmtp (Exim 4.69)
	(envelope-from <drew.adams@oracle.com>) id 1Ot86l-00080n-Bh
	for emacs-devel@gnu.org; Tue, 07 Sep 2010 20:01:47 -0400
Original-Received: from rcsinet13.oracle.com (rcsinet13.oracle.com [148.87.113.125])
	by rcsinet10.oracle.com (Switch-3.4.2/Switch-3.4.2) with ESMTP id
	o8801gfM028867
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK);
	Wed, 8 Sep 2010 00:01:43 GMT
Original-Received: from acsmt354.oracle.com (acsmt354.oracle.com [141.146.40.154])
	by rcsinet13.oracle.com (Switch-3.4.2/Switch-3.4.1) with ESMTP id
	o87JWs9x025628; Wed, 8 Sep 2010 00:01:41 GMT
Original-Received: from abhmt003.oracle.com by acsmt353.oracle.com
	with ESMTP id 585769811283904032; Tue, 07 Sep 2010 17:00:32 -0700
Original-Received: from dradamslap1 (/10.159.240.253)
	by default (Oracle Beehive Gateway v4.0)
	with ESMTP ; Tue, 07 Sep 2010 17:00:31 -0700
X-Mailer: Microsoft Office Outlook 11
Thread-Index: ActOwvYeocbC4gjwToG9hvqZEODgIAABDCFg
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.5931
In-Reply-To: <loom.20100907T212314-566@post.gmane.org>
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3)
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/archive/html/emacs-devel>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=subscribe>
Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Xref: news.gmane.org gmane.emacs.devel:129760
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/129760>

> This structural regex thing is interesting. You can perform operations
> (e.g. replace text) on all strings in the file, or everywhere except
> in strings and comments, etc. Here's the description of the feature
> on the E editor blog if someone wants to implement something like
> this for emacs: http://e-texteditor.com/blog/2010/beyond-vi
 
FWIW -

Not to pretend that this is exactly the same thing, but you can use Icicles to
do that.  A similar approach could be adopted by vanilla Emacs.

Icicles can use a text property to identify the parts of the buffer to search.
Those parts then act as completion candidates that you can match using a regexp
or other pattern (which you can change on the fly, to dynamically filter the
search hits).

Font-locking already provides such labeling-using-a-property, for free. It was
designed with another purpose in mind, of course, so the buffer parts identified
by font-lock might not always be those most pertinent for the job at hand.
It depends on just what "structures" you need - those provided by font-lock are
pretty basic.

Anyway, as an example, using the identification provided by font-lock, you can
use `C-c "' (`icicle-search-text-property') to search (e.g. using regexps) among
only the strings or only the comments, etc. of a buffer (or of multiple buffers
or files) - based on their different font-lock faces. (You cannot, however,
search among the complementary parts - e.g. the non-comments - without defining
a new Icicles search command.)

Font-lock faces can be used this way to do what you describe, provided the
"structural" parts of the buffer you are interested in are font-locked using
different faces.
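
To make the face-based idea concrete, here is a rough sketch in plain Emacs
Lisp. It is not Icicles code and not how Icicles implements its search; the
names `my-face-regions' and `my-search-in-regions' are made up for
illustration. It walks the `face' property, collects the regions that carry a
given face (or, optionally, the complementary regions), and then searches only
inside them.

```elisp
;; Sketch only: hypothetical helper names, not part of Emacs or Icicles.

(defun my-face-regions (face &optional complement)
  "Return a list of (BEG . END) regions whose `face' property includes FACE.
With non-nil COMPLEMENT, return instead the regions that do not include it.
Regions are split wherever the `face' property changes, so neighboring
complement regions are not merged."
  (let ((pos (point-min))
        (regions '()))
    (while (< pos (point-max))
      (let* ((val (get-text-property pos 'face))
             ;; With a LIMIT argument this returns LIMIT when nothing changes.
             (end (next-single-property-change pos 'face nil (point-max)))
             ;; The `face' value can be a single face or a list of faces.
             (hit (if (listp val) (memq face val) (eq face val))))
        (when (if complement (not hit) hit)
          (push (cons pos end) regions))
        (setq pos end)))
    (nreverse regions)))

(defun my-search-in-regions (regexp regions)
  "Return the non-empty matches of REGEXP found inside REGIONS, as strings."
  (let ((hits '()))
    (save-excursion
      (dolist (r regions)
        (goto-char (car r))
        (while (and (< (point) (cdr r))
                    (re-search-forward regexp (cdr r) t))
          (if (= (match-beginning 0) (match-end 0))
              ;; Skip empty matches so the loop always advances.
              (unless (eobp) (forward-char 1))
            (push (match-string-no-properties 0) hits)))))
    (nreverse hits)))
```

A possible use, assuming the buffer's major mode font-locks strings with
`font-lock-string-face' (as `emacs-lisp-mode' normally does) and that
`font-lock-ensure' is available (Emacs 25 or later):

```elisp
(with-temp-buffer
  (insert "(setq a \"foo bar\") ; foo again\n")
  (emacs-lisp-mode)
  (font-lock-ensure)
  ;; Search for "foo" only inside strings.
  (my-search-in-regions "foo" (my-face-regions 'font-lock-string-face)))
```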

This feature does not depend on font-lock, however.  The text property that is
used to divide the buffer into searchable parts need not be `face' - any
property will do.  So if you have a function that parses buffer parts (code
structures) in a more meaningful way (in some sense) than font-locking does, it
can add a text property with different values to identify the parts, and Icicles
search can exploit that labeling immediately.
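
A minimal sketch of such labeling, in plain Emacs Lisp. The property name
`my-part' and the two function names are hypothetical, and a regexp scan
stands in here for whatever "more meaningful" parser you might have. The
second function shows how any consumer can recover the labeled regions.

```elisp
;; Sketch only: `my-part', `my-label-parts' and `my-regions-with' are
;; made-up names, not part of Emacs or Icicles.

(defun my-label-parts (regexp value)
  "Put text property `my-part' with VALUE on each non-empty match of REGEXP.
Scans the whole accessible portion of the current buffer."
  (save-excursion
    (goto-char (point-min))
    ;; Avoid marking the buffer modified just because we add properties.
    (with-silent-modifications
      (while (and (not (eobp))
                  (re-search-forward regexp nil t))
        (if (= (match-beginning 0) (match-end 0))
            ;; Skip empty matches so the loop always advances.
            (unless (eobp) (forward-char 1))
          (put-text-property (match-beginning 0) (match-end 0)
                             'my-part value))))))

(defun my-regions-with (prop value)
  "Return a list of (BEG . END) regions where PROP is `eq' to VALUE.
Adjacent stretches with the same VALUE come back as a single region."
  (let ((pos (point-min))
        (regions '()))
    (while (setq pos (text-property-any pos (point-max) prop value))
      (let ((end (or (text-property-not-all pos (point-max) prop value)
                     (point-max))))
        (push (cons pos end) regions)
        (setq pos end)))
    (nreverse regions)))
```

For example, labeling double-quoted strings and then listing them:

```elisp
(with-temp-buffer
  (insert "x = \"one\" + \"two\"")
  (my-label-parts "\"[^\"\n]*\"" 'string)
  (mapcar (lambda (r) (buffer-substring-no-properties (car r) (cdr r)))
          (my-regions-with 'my-part 'string)))
;; => ("\"one\"" "\"two\"")
```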

And the property could be an overlay property instead of a text property.  And
you can replace matches while you search, on-demand.  And you could easily
define a specialized search command that allows other actions besides
replacement (e.g. a popup menu of alternative actions).

http://www.emacswiki.org/emacs/Icicles_-_Other_Search_Commands#toc2

In addition, it looks like the "structure" described in the blog post you cited
is in fact defined just by a set of regexp matches (but I'm no expert on reading
vi-ese):

Y/^\n/ V%A.*Pike<enter> \ V|^%T

It looks as though a few simple patterns do the trick to select the target
lines, for the example given.  If true, then for that simple kind of structure
definition you can just use ordinary Icicles search - no need for any fancy
(non-regexp) parsing or the application of a text property.  Ordinary Icicles
search (like the text-property search) lets you combine the filtering of any
number of input patterns (e.g. regexps).
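
To illustrate the combine-several-patterns idea outside of Icicles, here is a
small sketch in plain Emacs Lisp. The function names are made up, the sample
records are invented refer-style data, and I am assuming that the records are
separated by blank lines. Each record is a candidate, and a candidate is kept
only if it matches every pattern given.

```elisp
;; Sketch only: hypothetical names and invented sample data.
(require 'seq)

(defun my-records (text)
  "Split TEXT into records separated by one or more blank lines."
  (split-string text "\n\n+" t))

(defun my-filter-candidates (candidates &rest regexps)
  "Return the CANDIDATES (strings) that match every regexp in REGEXPS.
Matching follows the current value of `case-fold-search'."
  (seq-filter (lambda (cand)
                (seq-every-p (lambda (re) (string-match-p re cand))
                             regexps))
              candidates))

(my-filter-candidates
 (my-records "%A Pike\n%T First sample title\n\n%A Someone Else\n%T Second sample title\n")
 "^%A.*Pike" "^%T")
;; => ("%A Pike\n%T First sample title")
```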

And if you have a hairy pattern or set of patterns that you want to reuse,
instead of typing it interactively each time (as would seem to be the case for
the bibtex/refer references, though the blog touts the "effortlessness" of
typing such incantations), then you can define a command that incorporates that
info for the initial Icicles search parse.

`C-c =' (`icicle-imenu') does that, for instance: it just passes the hairy imenu
regexp to `icicle-search'.  Any additional, dynamic pattern you then type just
filters the imenu candidates (e.g. function definitions).
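
As a vanilla-Emacs stand-in for that pattern (this uses `occur', not
`icicle-search', and is not how `icicle-imenu' is written), here is a sketch
of a command that bakes in a reusable regexp. The variable and command names
are made up, and the regexp is only a simplified example, far less thorough
than the real imenu patterns.

```elisp
;; Sketch only: hypothetical names; `occur' stands in for Icicles search.

(defvar my-definition-regexp
  "^(\\(?:defun\\|defmacro\\|defvar\\)\\s-+\\S-+"
  "Simplified example pattern for the start of an Emacs Lisp definition.")

(defun my-occur-definitions ()
  "List the lines of the current buffer that match `my-definition-regexp'."
  (interactive)
  (occur my-definition-regexp))
```

After evaluating this, `M-x my-occur-definitions' in an Emacs Lisp buffer
shows the matching lines in an *Occur* buffer, with no need to retype the
pattern.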