From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Stuart D. Herring"
Newsgroups: gmane.emacs.devel
Subject: Re: using non-Emacs regexp syntax
Date: Fri, 1 Dec 2006 14:35:00 -0800 (PST)
Message-ID: <58590.128.165.123.18.1165012500.squirrel@webmail.lanl.gov>
Reply-To: herring@lanl.gov
Original-To: "Paul Pogonyshev"
Cc: rms@gnu.org, emacs-devel@gnu.org
Xref: news.gmane.org gmane.emacs.devel:63208
Mime-Version: 1.0
Content-Type: multipart/mixed;boundary="----=_20061201143500_80483"

------=_20061201143500_80483
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 8bit

> If you don't mind, I'll work on it now.  Changes can be added to whatever
> .el file in the distribution later.
>
> Also, is there sense in supporting conversion to and from several formats?
> E.g. some require that plus operator is escaped, while everything else is
> not.  E.g. something like this:
>
>   (convert-regexp :sed :emacs some-regexp)
>                   FROM TO     PATTERN-STRING
>
> Of course, it will add more complexity, but it shouldn't be much of a
> problem for users of this function and implementing it in Lisp should still
> be not hard.

I've already started on this sort of thing, writing a converter just
between the two formats supported by GNU grep.  (These are
"GNU-extended-basic-RE" and "extended-RE with backreferences".)  As it
happens, that conversion can be done with one function because the formats
are so similar.  I had planned to go on to the more general case, but for
now I'll just provide what I have for comment and/or use.  (I have papers,
so any use is fine.)

If, Paul, you'd like, we can collaborate on this, or one of us of your
choice can go on with it.

For reference/goal purposes, I've been looking at the (somewhat outdated)
Mastering Regular Expressions and it describes these syntaxes:

1. vi
2. (modern) grep
3. egrep
4. sed
5. lex
6. old awk
7. new awk(s) (don't know how different they really are from each other or
   from old awk)
8. Emacs
9. Perl (obviously we can only convert a subset of Perl's syntax...)
10. Tcl
11. a Tcl library called Expect (although I don't know if/why it has a
    different syntax from Tcl itself)
12. Python (complicated by the old regex and the new re packages, and how
    the former had a variable syntax)

Hope it's helpful,
Davis

PS - I originally wrote this using some convenience macros of mine.  It
seems to work after I standardized it, but that's probably why if it
doesn't.

-- 
This product is sold by volume, not by mass.  If it appears too dense or
too sparse, it is because mass-energy conversion has occurred during
shipping.

------=_20061201143500_80483
Content-Type: application/octet-stream; name="convert-re.el"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="convert-re.el"

;; Remember the exceedingly-basic regexes as used by sed(1)... might need to
;; support them too, although converting into them can be a pain.  Obviously,
;; in general you can't have just one function.

(defun convert-regexp (re)
  "Convert the regexp RE from basic to extended format or back."
  (let ((chars (string-to-list re)) ret backslash)
    (while chars
      (let ((curchar (car chars)))
        (cond
         ((eq curchar ?\\)
          (unless (setq backslash (not backslash))
            (push ?\\ ret) (push ?\\ ret)))
         ((eq curchar ?\[)
          (if backslash (progn (push ?\\ ret) (push ?\[ ret))
            ;; Otherwise, it's a character class:
            (push ?\[ ret)
            (setq chars (cdr chars))
            (let ((level 1) (first 0))
              (while (and chars (> level 0))
                (let ((clch (car chars)))
                  (push clch ret)
                  (cond
                   ((eq clch ?\[) (incf level))
                   ((eq clch ?\]) (unless first (decf level)))
                   ((eq clch ?^) (if first (setq first t)))))
                (setq first (and first (unless (numberp first) 0)))
                (unless (zerop level) (setq chars (cdr chars))))))
         ((memq curchar (string-to-list "?+()|{}"))
          (unless backslash (push ?\\ ret))
          (push (car chars) ret))
         (t (if backslash (push ?\\ ret)) (push (car chars) ret))))
      (setq backslash (and backslash (unless (numberp backslash) 0))
            chars (cdr chars)))
    (concat (nreverse ret))))

------=_20061201143500_80483--