From: Tim X <timx@nospam.dev.null>
To: help-gnu-emacs@gnu.org
Subject: Re: Perl, etc has these "?"-prefix modifiers/codes/whatever. Precisely which does emacs have (and NOT have)?
Date: Fri, 19 Feb 2010 17:48:33 +1100
Message-ID: <87d40167r2.fsf@lion.rapttech.com.au>
In-Reply-To: mailman.1470.1266547034.14305.help-gnu-emacs@gnu.org

John Withers <grayarea@reddagger.org> writes:

> On Fri, 2010-02-19 at 02:06 +0100, Pascal J. Bourguignon wrote:
>
>> 
>> One difficulty when you try to extend regular expression is that the
>> time and space complexity of matching such an extended regular
>> expression easily becomes exponential.  In these cases, it may be easier
>> to write a parser, than to try to force it thru regular expressions,
>> both for the programmer's brain and for the CPU processor...
>
> Sure exponential backtracking can happen, you can write checks for
> common cases and aborts, but let's say you don't. Who cares? I can write
> things that go exponential for memory or clock ticks in any of the
> languages I am even trivially familiar with.
>
>> Otherwise, people will do anything they want to do, theory and
>> precendent nonobstant.  This only demonstrate the lack of culture of the
>> newcomers.
>
> Or it demonstrates the need to get things done. I can write a regex to
> do a transform on 1000 text files in a directory and do the operation
> before you have closed the last paren on your parser.
>

I'm always amazed at these sorts of claims because they are just so
meaningless. For every concrete example you can come up with, we can
come up with others where writing the parser will be faster and more
reliable than using REs.

The real 'trick' is knowing when an RE is the best solution and when a
parser is better. Ironically, it's often the individual's grasp of the
underlying theory that will tend to determine whether they make the
right or wrong decision.

> But I do appreciate theoretical purity and those who have the expanses
> of free time in which to cultivate it.
>

Making some artificial distinction between the theoretical and the
practical is nonsense. You need both. A very high proportion of the
problems I've seen people have with REs have been due to a lack of
theoretical understanding of how REs work, their strengths and their
weaknesses.

I've seen far more bad uses of REs than good ones. A common example
is using regexps to parse HTML documents. This is almost always the
wrong solution and will generally only give you a partially correct
result. Correctness will degrade sharply with the number of HTML docs
needing to be processed, i.e. if you just have one document, you can
probably tweak your RE to give a correct result, but if you have to
process hundreds or thousands of such documents, you will end up
spending far more time constantly tweaking and maintaining your regular
expressions than you would have spent writing a simple HTML parser.
Much of the reason why REs are not good for this job is bound up in the
theory underlying REs.
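
To make the HTML point concrete, here is a minimal sketch in Python (my
choice of language for illustration, not something from the thread). It
compares a naive RE against the standard library's html.parser on a
small made-up document. SAMPLE, LinkCollector, links_with_regex and
links_with_parser are all hypothetical names. The theory reason REs
struggle here is that HTML nests to arbitrary depth, which a regular
language cannot describe, but even a flat case like this one already
trips the RE up.

```python
import re
from html.parser import HTMLParser

# Made-up input: attribute order, letter case and quoting all vary,
# and one link sits inside a comment.
SAMPLE = """<p>See <a class="x" href="/one">one</a> and
<A HREF='/two'>two</A> <!-- <a href="/commented-out">old</a> -->
<a title="three" href="/three">three</a></p>"""

# Naive RE: assumes lowercase tag, href as first attribute, double quotes.
NAIVE = re.compile(r'<a href="([^"]*)"')


class LinkCollector(HTMLParser):
    """Collect the href of every <a> start tag the parser reports."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us.
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)


def links_with_regex(text):
    return NAIVE.findall(text)


def links_with_parser(text):
    collector = LinkCollector()
    collector.feed(text)
    collector.close()
    return collector.links
```

On this sample the RE misses all three real links and reports only the
one inside the comment, while the parser returns the three real ones.
You can keep patching the RE for each new variation, which is exactly
the maintenance treadmill described above.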

It's interesting to note that one of the more significant issues facing
ruby has been with respect to its handling of REs and the problems they
have had in getting them to work efficiently. It's been a while since I
examined progress in this area, but the last time I looked, the extended
RE features being discussed here were the central problem and had
resulted in a situation where REs were slow enough to make them pretty
much worthless from a practical standpoint of getting work done! I
encountered this first hand when I needed to parse a large number of
log files that were quite large. While the RE version gave the right
results, it was too slow and used a lot of memory. In the end, I
re-wrote the scripts using a simple parser. It took less time to write,
ran faster and used less memory. The code was also a lot clearer and
easier to maintain. As a consequence, if I need to use REs I'll use
perl, but if I plan to use a simple parser, I'll probably use ruby.
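
As an illustration of what a "simple parser" for log files can look
like, here is a rough sketch in Python. It is not the ruby code from
the scripts mentioned above, and the log format
"DATE TIME LEVEL component: message" is purely an assumption made for
the example, as are the names parse_line and count_levels.

```python
from collections import Counter


def parse_line(line):
    """Split one line of the assumed form
    'DATE TIME LEVEL component: message' into a dict."""
    fields = line.rstrip("\n").split(" ", 3)
    if len(fields) != 4:
        raise ValueError(f"malformed line: {line!r}")
    date, time, level, rest = fields
    # Only the first ': ' separates component from message, so the
    # message itself may contain further colons.
    component, sep, message = rest.partition(": ")
    if not sep:
        raise ValueError(f"missing 'component: ' prefix: {line!r}")
    return {
        "date": date,
        "time": time,
        "level": level,
        "component": component,
        "message": message,
    }


def count_levels(lines):
    """Stream over lines, counting entries per level.
    Blank and malformed lines are skipped."""
    counts = Counter()
    for line in lines:
        if not line.strip():
            continue
        try:
            counts[parse_line(line)["level"]] += 1
        except ValueError:
            continue
    return counts
```

Because count_levels takes any iterable of lines, it can be handed an
open file object and will process one line at a time instead of holding
the whole log in memory.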

Another interesting point is that I suspect ruby has a lot more active
contributors than emacs, yet they hadn't been able to greatly improve
the efficiency of REs despite considerable effort being put into the
problem (not sure what the current state is).

This I think supports Pascal's point. What use would extensions to emacs
REs be if those extensions so adversely affected performance that using
them became impractical for anything but trivial RE problems that can
already be handled with what we have?
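
The performance worry is easy to reproduce in a backtracking matcher
even without any extended features. The sketch below uses Python's re
module as a stand-in (an assumption for illustration, it says nothing
about how the Emacs or ruby engines behave on this input). The
functions all_as_regex and all_as_scan are hypothetical names. Both
accept exactly the strings made of one or more 'a' characters, but the
RE nests one quantifier inside another.

```python
import re

# Nested quantifiers: a run of 'a's can be divided between the inner
# and outer '+' in many different ways.
PATHOLOGICAL = re.compile(r"(a+)+")


def all_as_regex(text):
    """True if text is one or more 'a' characters and nothing else."""
    return PATHOLOGICAL.fullmatch(text) is not None


def all_as_scan(text):
    """Same test written as a single left-to-right pass."""
    return len(text) > 0 and all(ch == "a" for ch in text)
```

Keep the input short when trying the RE version on a string that does
not match. A run of 'a's followed by a 'b' makes the matcher try every
way of splitting the run before giving up, so each extra 'a' roughly
doubles the work, while the scan stays linear in the length of the
input.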

Emacs REs are certainly not my favorite RE implementation, but that's
not because of a lack of the RE extensions that perl has made so
popular. I personally find all the '\' a much bigger PITA.

Emacs, like other open source projects, is largely about scratching your
own itch. If emacs doesn't have RE features someone wants, either they
use a different tool or they get off their arses and implement it.
Moaning about it and being critical because it doesn't have a feature is
just a lot of hot air. The fact it hasn't been done yet probably means
everyone else who has wanted that particular itch scratched has found a
more efficient means of doing it using another tool or a different
approach.

Sometimes it may be simply that the person feels the task is too
daunting or too demanding or they simply don't have the
necessary skills. If this is the case, then maybe a better approach is
to play the role of facilitator or coordinator and try to find others
who are interested in contributing towards the same goals. On the other
hand, if all anyone is interested in is just moaning and doing nothing,
then they will get pretty much exactly what they deserve - sweet FA.

The same goes for posts like the OP's in this thread. Rather than asking
someone else to do the work, why not do it and contribute it
back to the project? If you don't have the skills, make a start and do
what you can and then ask for help. It is far more likely others will be
willing to assist when they see a real effort being put in. Having a go
will also result in more specific questions, which are always easier to
answer than vague broad ones. If done well, the information could easily
be added to the manual and contribute to overall improvements for other
users. It could even be started as a page on the emacs wiki, which would
make it easier for others to contribute and improve.

Posts like "Please, someone else do something that I want" rarely
achieve anything other than making readers think it's just a moan from
someone who is frustrated but not frustrated enough to do anything about
their problem except moan. While it's fine to be lazy, being lazy and
fussy is just a recipe to make one miserable. Being lazy, fussy and a
moaner just adds noise that makes it harder to find relevant
information.

Tim


-- 
tcross (at) rapttech dot com dot au


