From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: "Davis Herring" <herring@lanl.gov>
Newsgroups: gmane.emacs.devel
Subject: Re: Structural regular expressions
Date: Thu, 9 Sep 2010 13:47:00 -0700 (PDT)
Message-ID: <46875.130.55.118.19.1284065220.squirrel@webmail.lanl.gov>
References: <loom.20100907T212314-566@post.gmane.org>
 <AANLkTimYvE0aqrG-OQxuY6BTca7ngzrfQUa62mOxyV=+@mail.gmail.com>
 <loom.20100907T222143-475@post.gmane.org>
 <87sk1lt4uf.fsf@gmail.com>
 <jwvsk1kaav2.fsf-monnier+emacs@gnu.org>
 <pvhphbi0wq0d.fsf@gmx.li>
 <jwvlj7c9ura.fsf-monnier+emacs@gnu.org>
In-Reply-To: <jwvlj7c9ura.fsf-monnier+emacs@gnu.org>
Reply-To: herring@lanl.gov
Mime-Version: 1.0
Content-Type: text/plain;charset=iso-8859-1
Content-Transfer-Encoding: 8bit
To: "Stefan Monnier" <monnier@iro.umontreal.ca>
Cc: Lawrence Mitchell <wence@gmx.li>, emacs-devel@gnu.org
User-Agent: SquirrelMail/1.4.8-5.el5_4.10.lanl3
List-Id: "Emacs development discussions."
 <emacs-devel.gnu.org>
List-Archive: <http://lists.gnu.org/archive/html/emacs-devel>
List-Post: <mailto:emacs-devel@gnu.org>
Xref: news.gmane.org gmane.emacs.devel:129834
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/129834>

> Indeed, we could probably go a long way by simply extending our notion
> of region so as to allow it to be non-contiguous.
>
> Patches welcome,

This is no patch, but I had an idea for the interface for this:

Definition: simple region
The interval (possibly empty) between point and mark, exactly as it is
now.

Variable: region-list
A set of non-empty, disjoint intervals, always local to each buffer.
Each is a cons of two markers.  Typically each is highlighted in a
subtle fashion, even outside Transient Mark Mode.

Function: multi-region
Returns the union of the region list and the simple region (using
`point-marker' and/or `mark-marker' as needed).  (If the simple region
is empty and the region list is not, the simple region is ignored and
the return value equals `region-list'.)  This is the user-visible,
possibly-disconnected upgrade to the region concept.

User option: multi-region-separator (default: "\n")
String to insert between separate intervals of the multi-region when
they are concatenated.

(defun multi-region-string (&optional sep)
  "Return the contents of the multi-region.
Separate intervals with SEP (or `multi-region-separator' if omitted)."
  (mapconcat (lambda (c) (buffer-substring (car c) (cdr c)))
             (multi-region) (or sep multi-region-separator)))

Rule: (interactive "r") maps over the multi-region.
Perhaps with some way to disable it (a prefix command, or just a quick
way to suppress/restore the region list while leaving the simple region
alone), `call-interactively' would handle an interactive spec once
(including any prompting), then repeatedly call the function with the
start and end set to the start and end of each interval in the
multi-region in turn, in buffer order.

Rationale: This is a very intrusive change!  But it's often the right
thing (delete-region, upcase-region, ispell-region, translate-region,
underline-region, indent-region, count-lines-region,
expand-region-abbrevs, and probably eval-region) and is one of very few
ways of letting existing code apply in any sense to multi-regions.  (If
doing it by default is too much, a prefix "multify" command could be
provided instead, and all of this could be optional.)

Another spec ("R"?) could be added for commands like `narrow-to-region'
that should operate only on the simple region (or fail if the region
isn't simple?).  Yet another spec might pass all of the multi-region at
once so that commands like `kill-region' and `write-region' could use
`multi-region-string' or otherwise act on the intervals coherently.

Command: keep-region
Unions the current simple region into the region list (may coalesce
existing intervals).  Immediately afterwards, the simple region is
entirely redundant and has no effect (until point or mark moves).

Command: drop-region
Removes the current simple region from the region list (may split
existing intervals).  Immediately afterwards, the multi-region is no
different!

Command: drop-this-region
Removes the interval that contains point from the region list.

Command: drop-multi-region
Clears the region list (causing the multi-region to equal the simple
region).
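
As a rough illustration of the set operations behind keep-region,
drop-region and multi-region, here is a minimal sketch.  It works on
plain integer positions rather than markers so that it can be evaluated
without a buffer.  The mr-sketch- names are hypothetical and not part
of the proposal, and merging intervals that merely touch is an
assumption.

```elisp
;; Intervals are conses (START . END) of integers, END exclusive.
;; All mr-sketch-* names are hypothetical, for illustration only.

(defun mr-sketch-normalize (intervals)
  "Return INTERVALS sorted, without empty ones, overlaps merged.
Assumption: intervals that merely touch are merged as well."
  (let ((sorted (sort (copy-sequence intervals)
                      (lambda (a b) (< (car a) (car b)))))
        result)
    (dolist (c sorted)
      (cond
       ;; Empty interval: skip it.
       ((>= (car c) (cdr c)))
       ;; Overlaps or touches the previous interval: extend that one.
       ((and result (<= (car c) (cdr (car result))))
        (setcdr (car result) (max (cdr (car result)) (cdr c))))
       ;; Otherwise start a new interval (a fresh cons, so the
       ;; caller's list is never modified).
       (t (push (cons (car c) (cdr c)) result))))
    (nreverse result)))

(defun mr-sketch-keep (region-list start end)
  "Union the interval START..END into REGION-LIST (cf. keep-region)."
  (mr-sketch-normalize
   (cons (cons (min start end) (max start end)) region-list)))

(defun mr-sketch-drop (region-list start end)
  "Remove the interval START..END from REGION-LIST (cf. drop-region)."
  (let ((lo (min start end))
        (hi (max start end))
        result)
    (dolist (c region-list)
      ;; Keep whatever part of C lies before LO ...
      (when (< (car c) (min lo (cdr c)))
        (push (cons (car c) (min lo (cdr c))) result))
      ;; ... and whatever part of C lies after HI.
      (when (< (max hi (car c)) (cdr c))
        (push (cons (max hi (car c)) (cdr c)) result)))
    ;; Normalizing re-joins the two halves when START equals END.
    (mr-sketch-normalize result)))

(defun mr-sketch-multi-region (region-list point mark)
  "Return the union of REGION-LIST and the simple region POINT..MARK.
An empty simple region is ignored unless REGION-LIST is empty too."
  (if region-list
      (mr-sketch-keep region-list point mark)
    (list (cons (min point mark) (max point mark)))))

;; (mr-sketch-keep '((1 . 4) (8 . 12)) 3 9)       => ((1 . 12))
;; (mr-sketch-drop '((1 . 10)) 4 6)               => ((1 . 4) (6 . 10))
;; (mr-sketch-multi-region '((1 . 4) (6 . 10)) 4 6) => ((1 . 10))
```

The last two examples show the drop-region remark in action: once 4..6
has been dropped from the list, the multi-region computed with the
simple region still at 4..6 is the same as before.  A real
implementation would use markers, so that the intervals follow edits to
the buffer.
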

These low-level commands would be too tedious to be the principal user
mechanism for manipulating the multi-region.  So we add:

Command: mark-regexp
Adds to the region list all matches for a regexp (following point, for
consistency with `how-many' and `keep-lines').  Framing the regexp with
^.* and .*$ allows this command to mark lines (or a separate command
could do that for you).  Even when lines are marked in that fashion,
the newlines between them are not, so each line is a separate interval.

Command: unmark-regexp
Deletes from the region list all regions within which a match for a
regexp exists.

These are analogous to the "highlight all" feature in Firefox, for
instance.  Then we can navigate among them:

Command: next-region
Moves point to the closest following beginning of a region list
interval.  This could be used in macros.

Command: count-regions
Displays in the echo area how many intervals are in the region list and
in the multi-region (which may be one more or many fewer).

Since region lists are complicated things, the user might want to save
them and reuse them later, so letting registers hold them would be
good.  (Should they store the region list or the multi-region?)

WDYT?

Davis

-- 
This product is sold by volume, not by mass.  If it appears too dense or
too sparse, it is because mass-energy conversion has occurred during
shipping.