unofficial mirror of help-gnu-emacs@gnu.org
* delete text block with regexp
@ 2003-11-16  9:14 Herbert Fritsch
  2003-11-16 10:19 ` Oliver Scholz
  2003-11-16 18:19 ` Alex Schroeder
  0 siblings, 2 replies; 10+ messages in thread
From: Herbert Fritsch @ 2003-11-16  9:14 UTC


Hi

There are some text blocks. They are all different, but each begins with the
same word and ends with the same "lastword*". There are many lines. How can I
delete these blocks with regular expressions?

Thanks for help

Herbert

* Re: delete text block with regexp
  2003-11-16  9:14 delete text block with regexp Herbert Fritsch
@ 2003-11-16 10:19 ` Oliver Scholz
  2003-11-16 18:19 ` Alex Schroeder
  1 sibling, 0 replies; 10+ messages in thread
From: Oliver Scholz @ 2003-11-16 10:19 UTC


Herbert Fritsch <herbfritsch@t-online.de> writes:
[...]
> There are some text blocks. They are all different, but each begins with the
> same word and ends with the same "lastword*". There are many lines. How can I
> delete these blocks with regular expressions?

C-M-% ^beginword\(?:.\|<newline>\)*lastword$ RET RET

                        |-----|
                           ^
                           |
                           |
Press `C-q C-j' here. -----+
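
For use from Lisp rather than interactively, the same idea can be sketched
as a small function. This is only an illustration: `my-delete-blocks' is a
made-up name, and it uses the non-greedy `*?' where the command above uses
`*', because the greedy form would stretch from the first "beginword" to
the last "lastword" in the buffer and treat several blocks as one match.

```elisp
;; Sketch only: `my-delete-blocks' is a hypothetical helper, not an Emacs command.
(defun my-delete-blocks (begin end)
  "Delete each block running from BEGIN at line start to END at line end."
  (save-excursion
    (goto-char (point-min))
    ;; In a Lisp string every regexp backslash is doubled, and "\n" is a
    ;; literal newline character (what C-q C-j inserts interactively).
    (let ((re (concat "^" (regexp-quote begin)
                      "\\(?:.\\|\n\\)*?"      ; non-greedy: shortest block
                      (regexp-quote end) "$")))
      (while (re-search-forward re nil t)
        (replace-match "" t t)))))
```

Calling (my-delete-blocks "beginword" "lastword") in a buffer removes the
blocks and leaves the line breaks that followed them.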

    Oliver
-- 
26 Brumaire an 212 de la Révolution
Liberté, Egalité, Fraternité!

* Re: delete text block with regexp
  2003-11-16  9:14 delete text block with regexp Herbert Fritsch
  2003-11-16 10:19 ` Oliver Scholz
@ 2003-11-16 18:19 ` Alex Schroeder
  2003-11-16 23:42   ` Stefan Monnier
  1 sibling, 1 reply; 10+ messages in thread
From: Alex Schroeder @ 2003-11-16 18:19 UTC


Herbert Fritsch <herbfritsch@t-online.de> writes:

> There are some text blocks. They are all different, but each begins with the
> same word and ends with the same "lastword*". There are many lines. How can I
> delete these blocks with regular expressions?

The critical part is that "." matches anything but a newline.
Therefore use "\(.\|\n\)" for any character or a newline.
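
A short check of that difference, written as Lisp strings (so the regexp
backslashes are doubled, while "\n" stays a literal newline). The name
`my-spans-newline-p' is made up for the illustration.

```elisp
;; Hypothetical helper: does REGEXP match the three characters a, newline, b?
(defun my-spans-newline-p (regexp)
  "Return the match position of REGEXP in \"a\\nb\", or nil if it fails."
  (string-match regexp "a\nb"))

(my-spans-newline-p "a.b")               ; => nil, "." stops at the newline
(my-spans-newline-p "a\\(.\\|\n\\)b")    ; => 0, the alternation accepts it
```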

Alex.
-- 
http://www.emacswiki.org/alex/
There is no substitute for experience.

* Re: delete text block with regexp
  2003-11-16 18:19 ` Alex Schroeder
@ 2003-11-16 23:42   ` Stefan Monnier
  2003-11-17  0:32     ` Edi Weitz
  0 siblings, 1 reply; 10+ messages in thread
From: Stefan Monnier @ 2003-11-16 23:42 UTC


> The critical part is that "." matches anything but a newline.
> Therefore use "\(.\|\n\)" for any character or a newline.

Beware: such a regexp tends to suffer from the "regexp stack overflow"
problem.  Better use \(.*\n\)*.* which is equivalent but uses a lot
less stack space.

Sadly, to understand why, you need to understand the details of how regexp
matching happens to be implemented in Emacs.
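
One way to convince yourself that the two forms are interchangeable is to
run both against the same multi-line string. This is just a sketch: the
names `my-char-wise', `my-line-wise' and `my-match-end' are made up, shy
groups are used so that no registers are captured, and the check only
covers equivalence on a small input, not stack usage.

```elisp
;; Two ways to say "any text, newlines included" (hypothetical names).
(defconst my-char-wise "\\(?:.\\|\n\\)*"
  "Consumes one character per iteration of the loop.")
(defconst my-line-wise "\\(?:.*\n\\)*.*"
  "Consumes one whole line per iteration of the outer loop.")

(defun my-match-end (regexp text)
  "Return where REGEXP, anchored at the start of TEXT, stops matching."
  (and (string-match (concat "\\`\\(?:" regexp "\\)") text)
       (match-end 0)))

;; Both forms swallow the whole 13-character string.
(list (my-match-end my-char-wise "one\ntwo\nthree")
      (my-match-end my-line-wise "one\ntwo\nthree"))   ; => (13 13)
```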


        Stefan

* Re: delete text block with regexp
  2003-11-16 23:42   ` Stefan Monnier
@ 2003-11-17  0:32     ` Edi Weitz
  2003-11-17  5:53       ` Stefan Monnier
  0 siblings, 1 reply; 10+ messages in thread
From: Edi Weitz @ 2003-11-17  0:32 UTC


On Sun, 16 Nov 2003 23:42:37 GMT, Stefan Monnier <monnier@iro.umontreal.ca> wrote:

>> The critical part is that "." matches anything but a newline.
>> Therefore use "\(.\|\n\)" for any character or a newline.
>
> Beware: such a regexp tends to suffer from the "regexp stack
> overflow" problem.  Better use \(.*\n\)*.* which is equivalent but
> uses a lot less stack space.
>
> Sadly, to understand why, you need to understand the details of how
> regexp matching happens to be implemented in Emacs.

Shouldn't a good regex implementation be able to optimize the problem
away in simple cases like this? I've written a regex engine for Common
Lisp which does transformations like (Perl syntax)

  <regex>*   ->   (?:<regex'>*<regex>)?
  <regex>+   ->   <regex'>*<regex>

if <regex> includes register groups and is of fixed length. <regex'>
is an equivalent regular expression but without the register
groups. I'm pretty sure Perl does something similar.
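
To make the first rule concrete, here is how it would look in Emacs regexp
syntax for <regex> = \(ab\), assuming I have transcribed the rule
correctly. `my-match-info' is a made-up helper; it reports the end of the
match and the span of register 1, so the two forms can be compared.

```elisp
;; Hypothetical helper for comparing a regexp with its rewritten form.
(defun my-match-info (regexp text)
  "Return (END GROUP1-START GROUP1-END) for REGEXP matched at start of TEXT."
  (when (string-match (concat "\\`" regexp) text)
    (list (match-end 0) (match-beginning 1) (match-end 1))))

;; <regex>*  with <regex> = \(ab\)
(my-match-info "\\(ab\\)*" "ababab")                      ; => (6 4 6)
;; (?:<regex'>*<regex>)?  where <regex'> = \(?:ab\), no register
(my-match-info "\\(?:\\(?:ab\\)*\\(ab\\)\\)?" "ababab")   ; => (6 4 6)
```

In both cases register 1 ends up holding the last repetition; the
rewritten form only sets it once instead of on every iteration.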

Edi.

* Re: delete text block with regexp
  2003-11-17  0:32     ` Edi Weitz
@ 2003-11-17  5:53       ` Stefan Monnier
  2003-11-17 10:31         ` Edi Weitz
  0 siblings, 1 reply; 10+ messages in thread
From: Stefan Monnier @ 2003-11-17  5:53 UTC


> Shouldn't a good regex implementation be able to optimize the problem
> away in simple cases like this? I've written a regex engine for Common

Of course.  Patch (or package) welcome,


        Stefan

* Re: delete text block with regexp
  2003-11-17  5:53       ` Stefan Monnier
@ 2003-11-17 10:31         ` Edi Weitz
  2003-11-17 15:23           ` Stefan Monnier
  0 siblings, 1 reply; 10+ messages in thread
From: Edi Weitz @ 2003-11-17 10:31 UTC


On Mon, 17 Nov 2003 05:53:05 GMT, Stefan Monnier <monnier@iro.umontreal.ca> wrote:

>> Shouldn't a good regex implementation be able to optimize the
>> problem away in simple cases like this?
>
> Of course.  Patch (or package) welcome,

Er, but isn't Emacs' regex engine written in C? Yuk! I prefer writing
my software in Lisp... :)

Cheers,
Edi.

* Re: delete text block with regexp
  2003-11-17 10:31         ` Edi Weitz
@ 2003-11-17 15:23           ` Stefan Monnier
  2003-11-17 15:42             ` Oliver Scholz
  2003-11-17 16:23             ` Edi Weitz
  0 siblings, 2 replies; 10+ messages in thread
From: Stefan Monnier @ 2003-11-17 15:23 UTC


>>> Shouldn't a good regex implementation be able to optimize the
>>> problem away in simple cases like this?
>> Of course.  Patch (or package) welcome,
> Er, but isn't Emacs' regex engine written in C? Yuk! I prefer writing
> my software in Lisp... :)

But you can write the optimization in elisp, as is done in regexp-opt.
It won't be automatically used for all regexps, but it might be a good
thing if your optimization takes time.
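
For readers who have not used it: `regexp-opt' takes a list of strings and
returns a single regexp that matches any of them. A minimal sketch, in
which the word list and the name `my-start-words-re' are invented for the
example:

```elisp
;; `regexp-opt' is called explicitly; nothing applies it to regexps for you.
(defconst my-start-words-re
  (concat "^" (regexp-opt '("beginword" "beginblock" "begin") t))
  "Hypothetical regexp for several possible first words of a block.")

;; The match starts right after the newline, at index 5.
(string-match my-start-words-re "text\nbeginblock here")   ; => 5
```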


        Stefan

* Re: delete text block with regexp
  2003-11-17 15:23           ` Stefan Monnier
@ 2003-11-17 15:42             ` Oliver Scholz
  2003-11-17 16:23             ` Edi Weitz
  1 sibling, 0 replies; 10+ messages in thread
From: Oliver Scholz @ 2003-11-17 15:42 UTC


Stefan Monnier <monnier@iro.umontreal.ca> writes:

>>>> Shouldn't a good regex implementation be able to optimize the
>>>> problem away in simple cases like this?
>>> Of course.  Patch (or package) welcome,
>> Er, but isn't Emacs' regex engine written in C? Yuk! I prefer writing
>> my software in Lisp... :)
>
> But you can write the optimization in elisp, as is done in regexp-opt.
[...]

Or for the s-expr front-ends like `rx' or `sregex'.
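
As an illustration of what such a front-end looks like, the block regexp
from earlier in the thread could be written with `rx' roughly as follows.
The name `my-block-rx' is made up, and the non-greedy `*?' is my choice
here so that the match stops at the first "lastword".

```elisp
(require 'rx)

;; Hypothetical constant: the block regexp as an s-expression.
(defconst my-block-rx
  (rx line-start "beginword"
      (*? anything)            ; any characters, newlines included
      "lastword" line-end)
  "Block regexp written with `rx' instead of as a string.")

;; The block starts on the second line, at index 2.
(string-match my-block-rx "x\nbeginword a\nb lastword\ny")   ; => 2
```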

    Oliver
-- 
27 Brumaire an 212 de la Révolution
Liberté, Egalité, Fraternité!

* Re: delete text block with regexp
  2003-11-17 15:23           ` Stefan Monnier
  2003-11-17 15:42             ` Oliver Scholz
@ 2003-11-17 16:23             ` Edi Weitz
  1 sibling, 0 replies; 10+ messages in thread
From: Edi Weitz @ 2003-11-17 16:23 UTC


On Mon, 17 Nov 2003 15:23:52 GMT, Stefan Monnier <monnier@iro.umontreal.ca> wrote:

> But you can write the optimization in elisp, as is done in
> regexp-opt.  It won't be automatically used for all regexps, but it
> might be a good thing if your optimization takes time.

Oh, I didn't know that. I might look at it if I find some time.

Thanks,
Edi.
