From: ken
Newsgroups: gmane.emacs.help
Subject: Re: bug in elisp... or in elisper???
Date: Wed, 23 Mar 2011 10:18:34 -0400
Message-ID: <4D8A013A.1030804@mousecar.com>
References: <4D8932A8.9080007@mousecar.com>
Reply-To: gebser@mousecar.com
To: PJ Weisberg
Cc: GNU Emacs List

On 03/22/2011 08:15 PM PJ Weisberg wrote:
> On 3/22/11, ken wrote:
>> Fellow elispers,
>>
>> Something seems to be amiss in the search syntax here:
>>
>> (setq aname-re-str
>> "\\(\\(.\\|\n\\)*?\\)> \\|\t\\|\n\\)*?\\)>" )
>>
> ...
>> The problem is that the 5th match-string should be either empty or
>> whitespace. But it consistently contains the last character of the
>> 4th match-string. And these two matches are separated by the literal
>> character string, "
> You miscounted your '('s. The fifth group IS inside the fourth group,
> matching . or \n.
>
> -PJ

It wasn't that I miscounted. I had read a doc which said that I couldn't
embed one potential match expression inside another. (I mentioned this,
I believe, in a previous email.) So I figured that, if this wasn't
allowed, I certainly couldn't count each expression inside a pair of
parens as another match. But it seems that doc was wrong.

So this is actually good news: my RE works just as I want it to *and*
there's no bug in elisp to contend with. I am, however, starting to
have trust issues with documentation I find on the web.
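
A minimal sketch of the numbering PJ describes, assuming a cut-down
pattern and a made-up sample string (the helper name
`sketch-nested-groups` is hypothetical, not part of my original code).
Groups are numbered by the order of their opening parens, so a nested
group gets its own number, and after a repeated match it holds only the
last piece it matched:

```elisp
;; Hypothetical helper, for illustration only.
(defun sketch-nested-groups (s)
  "Return (OUTER INNER) for the two nested groups matched against S."
  (when (string-match "\\(\\(.\\|\n\\)*?\\)>" s)
    (list (match-string 1 s)      ; group 1 opens first: text before ">"
          (match-string 2 s))))   ; group 2 opens second, inside group 1

(sketch-nested-groups "abc>")     ; => ("abc" "c")
```

That is the same symptom as above: the inner group ends up holding the
last character of the outer one.
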
But I have you guys here on this list as a reality check.

If one match expression *can* be embedded within another, this is good
news: it means I can write more comprehensive REs. I.e., instead of
writing RE #1 to locate a section of text and then RE #2 to parse just
that section, REs #1 and #2 can be combined into one RE. Radically
cool.

So some further questions:

You might have noticed I use "\\([\s-\\|\n]+?\\)" to non-greedily match
one or more whitespace characters. Can one "\\[...\\]" be nested inside
another...? e.g., "[[\s-\\|\n]+?]" or some syntax like that?

The "specialness" of "." seems to be lost when inside brackets; that
is, in "[.\n]*?" it seems to represent a regular period (.) rather than
"any character except newline". Is there some way to bring back that
specialness? Or is there some other RE to represent "multiple instances
of any character, including a newline"?

Is it actually true (what the docs say) that there's a limit of nine
sub-expression match-strings per RE? Or can I do, e.g.,
"(match-string 12)" and "(match-string 15)"? What is the actual limit?
Whatever it is, is this hard-coded into elisp... or can it be
changed/configured to something else?

Thanks for the illumination.
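
For reference, a minimal sketch of how the three questions above could
be checked in *scratch*. The helper names, sample strings and the
12-group pattern are all made up for illustration, and the comments
reflect my reading of the Emacs regexp rules rather than anything
confirmed in this thread:

```elisp
;; All three helpers are hypothetical, for illustration only.

;; 1. "Any character, including newline".  Special characters such as
;;    "." and "\\|" are literal inside brackets, so instead of a bracket
;;    expression this uses a shy (non-capturing) group with alternation.
(defun sketch-any-char (s)
  "Return the text of S before the first \">\", newlines included."
  (when (string-match "\\(\\(?:.\\|\n\\)*?\\)>" s)
    (match-string 1 s)))

;; 2. Whitespace including newline.  Brackets do not nest; the
;;    characters are simply listed inside one bracket expression.
(defun sketch-whitespace (s)
  "Return the run of whitespace between \"a\" and \"b\" in S."
  (when (string-match "a\\([ \t\n]+\\)b" s)
    (match-string 1 s)))

;; 3. More than nine groups.  The pattern is twelve copies of
;;    \\([a-z]\\) joined together, so groups 10 to 12 exist.
(defun sketch-many-groups (s)
  "Return groups 10 and 12 of a twelve-group match against S."
  (let ((re (mapconcat #'identity (make-list 12 "\\([a-z]\\)") "")))
    (when (string-match re s)
      (list (match-string 10 s) (match-string 12 s)))))

(sketch-any-char "line one\nline two>")  ; => "line one\nline two"
(sketch-whitespace "a \t\n b")           ; => " \t\n "
(sketch-many-groups "abcdefghijkl")      ; => ("j" "l")
```

As far as I can tell, the nine-group limit applies only to
back-references written inside the RE itself (\\1 through \\9), not to
match-string.
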