From: Tim X
Newsgroups: gmane.emacs.help
Subject: Re: Perl, etc has these "?"-prefix modifiers/codes/whatever. Precisely which does emacs have (and NOT have)?
Date: Fri, 19 Feb 2010 17:48:33 +1100
Organization: Rapt Technologies
Message-ID: <87d40167r2.fsf@lion.rapttech.com.au>
References: <877hqaojg9.fsf@galatea.lan.informatimago.com> <873a0ynz99.fsf@galatea.lan.informatimago.com> <87r5oi11bb.fsf@galatea.lan.informatimago.com>
To: help-gnu-emacs@gnu.org

John Withers writes:

> On Fri, 2010-02-19 at 02:06 +0100, Pascal J. Bourguignon wrote:
>
>>
>> One difficulty when you try to extend regular expression is that the
>> time and space complexity of matching such an extended regular
>> expression easily becomes exponential. In these cases, it may be easier
>> to write a parser, than to try to force it thru regular expressions,
>> both for the programmer's brain and for the CPU processor...
>
> Sure exponential backtracking can happen, you can write checks for
> common cases and aborts, but let's say you don't. Who cares? I can write
> things that go exponential for memory or clock ticks in any of the
> languages I am even trivially familiar with.
>
>> Otherwise, people will do anything they want to do, theory and
>> precendent nonobstant. This only demonstrate the lack of culture of the
>> newcomers.
>
> Or it demonstrates the need to get things done. I can write a regex to
> do a transform on 1000 text files in a directory and do the operation
> before you have closed the last paren on your parser.
>

I'm always amazed at these sorts of claims because they are just so
meaningless. For every concrete example you can come up with, we can
come up with others where writing the parser will be faster and more
reliable than using REs. The real 'trick' is knowing when an RE is the
best solution and when a parser is better. Ironically, it's often the
individual's grasp of the underlying theory that will tend to determine
whether they make the right or wrong decision.

> But I do appreciate theoretical purity and those who have the expanses
> of free time in which to cultivate it.
>

Making some artificial distinction between the theoretical and the
practical is nonsense. You need both.
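
As an aside on the exponential backtracking mentioned in the quoted
text, here is a small sketch of how it arises, assuming a backtracking
engine such as the one behind Python's re module. The pattern, the
input and the helper time_match are made up for illustration and are
not something from this thread. Nested quantifiers such as (a+)+ give
the engine exponentially many ways to split a run of "a"s, and it
tries them all before it can report that the overall match fails.

```python
import re
import time

# Nested quantifiers: the inner a+ and the outer (...)+ can divide a run
# of "a"s between them in exponentially many ways.
PATTERN = re.compile(r"(a+)+$")


def time_match(n):
    """Match PATTERN against n "a"s followed by "!", which can never match.

    Returns (matched, seconds_taken).
    """
    text = "a" * n + "!"
    start = time.perf_counter()
    matched = PATTERN.match(text) is not None
    return matched, time.perf_counter() - start


if __name__ == "__main__":
    # Timings depend on the machine; the trend is what matters.
    for n in (8, 12, 16, 20):
        matched, seconds = time_match(n)
        print(f"n={n:2d} matched={matched} seconds={seconds:.4f}")
```

On a backtracking engine I would expect the time to roughly double for
each extra "a", which is the sort of behaviour that knowing a little of
the theory lets you spot before you run the thing over a directory of
files.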
A very high proportion of the problems I've seen people having with REs
have been due to a lack of theoretical understanding of how REs work,
their strengths and their weaknesses. I've seen far more bad uses of
REs than good ones. A common example is using regexps to parse HTML
documents. This is almost always the wrong solution and will generally
only give you a partially correct result. Correctness will degrade
sharply with the number of HTML docs needing to be processed, i.e. if
you just have one document, you can probably tweak your RE to give a
correct result, but if you have to process hundreds or thousands of
such documents, you will end up spending far more time constantly
tweaking and maintaining your regular expressions than you would have
spent writing a simple HTML parser. Much of the reason why REs are not
good for this job is bound up in the theory underlying REs.

It's interesting to note that one of the more significant issues facing
ruby has been its handling of REs and the problems they have had in
getting them to work efficiently. It's been a while since I examined
progress in this area, but the last time I looked, the extended RE
features being discussed here were the central problem and had resulted
in a situation where REs were slow enough to make them pretty much
worthless from the practical standpoint of getting work done!

I encountered this first hand when I needed to parse a large number of
quite large log files. While the RE version worked, it was too slow and
used a lot of memory. In the end, I re-wrote the scripts using a simple
parser. It took less time to write, ran faster and used less memory.
The code was also a lot clearer and easier to maintain. As a
consequence, if I need to use REs I'll use perl, but if I plan to use a
simple parser, I'll probably use ruby.
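
To make the HTML point concrete, here is a minimal sketch in Python. It
is not from the original discussion: the sample markup, the pattern and
the names links_by_regex, LinkCollector and links_by_parser are all
made up for illustration. It compares a naive link-extracting regexp
with the standard library's html.parser on markup that differs only
slightly from the form the regexp was written for.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample: four anchors, one of them commented out.
SAMPLE = """
<p>See <a href="one.html">one</a>,
<A HREF='two.html' class="x">two</A>,
<a class="y"
   href="three.html">three</a>
<!-- <a href="commented-out.html">old</a> -->
</p>
"""

# The sort of pattern that works on the first document you try it on.
NAIVE_LINK_RE = re.compile(r'<a href="([^"]*)">')


def links_by_regex(html):
    """Return every href the naive pattern finds."""
    return NAIVE_LINK_RE.findall(html)


class LinkCollector(HTMLParser):
    """Collect the href of each <a> start tag the parser reports."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive already lower-cased.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def links_by_parser(html):
    """Return every href found in real (non-comment) anchor tags."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


if __name__ == "__main__":
    print("regex: ", links_by_regex(SAMPLE))
    print("parser:", links_by_parser(SAMPLE))
```

On this sample the regexp finds one.html, misses two.html and
three.html (upper case, single quotes, reordered attributes), and
wrongly reports the commented-out link, while the parser returns the
three live links. You can keep patching the pattern for each new
variation, which is exactly the maintenance treadmill described above.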
Another interesting point is that I suspect ruby has a lot more active
contributors than emacs, yet they haven't been able to greatly improve
the efficiency of REs despite considerable effort being put into the
problem (not sure what the current state is). This, I think, supports
Pascal's point. What use would extensions to emacs REs be if those
extensions so adversely affected performance that using them became
impractical for anything but trivial RE problems that can already be
handled with what we have?

Emacs REs are certainly not my favorite RE implementation, but that's
not because of a lack of the RE extensions that perl has made so
popular. I personally find all the '\' a much bigger PITA.

Emacs, like other open source projects, is largely about scratching
your own itch. If emacs doesn't have RE features someone wants, either
they use a different tool or they get off their arses and implement it.
Moaning about it and being critical because it doesn't have a feature
is just a lot of hot air. The fact it hasn't been done yet probably
means everyone else who has wanted that particular itch scratched has
found a more efficient means of doing it using another tool or a
different approach. Sometimes it may be simply that the person feels
the task is too daunting or too demanding, or they simply don't have
the necessary skills. If this is the case, then maybe a better approach
is to play the role of facilitator or coordinator and try to find
others who are interested in contributing towards the same goals. On
the other hand, if all anyone is interested in is just moaning and
doing nothing, then they will get pretty much exactly what they
deserve - sweet FA.

The same goes for posts like the OP's in this thread. Rather than
asking someone else to do the work, why not do it and contribute it
back to the project? If you don't have the skills, make a start, do
what you can and then ask for help.
It is far more likely others will be willing to assist when they see a
real effort being put in. Having a go will also result in more specific
questions, which are always easier to answer than vague, broad ones. If
done well, the information could easily be added to the manual and
contribute to overall improvements for other users. It could even be
started as a page on the emacs wiki, which would make it easier for
others to contribute and improve.

Posts like "Please, someone else do something that I want" rarely
achieve anything other than making readers think it's just a moan from
someone who is frustrated, but not frustrated enough to do anything
about their problem except moan. While it's fine to be lazy, being lazy
and fussy is just a recipe to make one miserable. Being lazy, fussy and
a moaner just adds noise that makes it harder to find relevant
information.

Tim

--
tcross (at) rapttech dot com dot au