From: Stefan Monnier <monnier@IRO.UMontreal.CA>
To: Andreas Schwab <schwab@linux-m68k.org>
Cc: Alan Mackenzie <acm@muc.de>, 18577@debbugs.gnu.org
Subject: bug#18577: Regexp I-search: [(error Stack overflow in regexp matcher)]
Date: Sun, 28 Sep 2014 13:35:30 -0400
Message-ID: <jwv38bbsme9.fsf-monnier+emacsbugs@gnu.org>
In-Reply-To: <87d2afud0p.fsf@igel.home> (Andreas Schwab's message of "Sun, 28 Sep 2014 14:48:22 +0200")
>> Is this a defect in my regexp or in the regexp engine?
> It is fundamental to the way regexp matching works.
To clarify: it is fundamental to the way *our* regexp engine works.
As long as the regexp doesn't use backrefs, it can in principle be matched
efficiently, without backtracking. Of course, using capturing groups
\(..\) (as opposed to shy groups \(?:..\)) can also make the problem
harder, since the various different (but largely equivalent) ways to
match may need to be distinguished via the match-data.
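To illustrate the point about match-data, here is a minimal sketch in
Python rather than Emacs Lisp (an assumption made only so the example is
easy to run; Python writes (..) and (?:..) where Emacs writes \(..\) and
\(?:..\)). The pattern and the sample text are made up for illustration.
Both patterns match the same text, but only the capturing one has to
record which of the possible ways of splitting the text was actually used:

```python
import re

text = "abcd"

# Capturing groups: the engine must remember how the match was split up.
capturing = re.match(r"(a|ab)(c|bcd)(d*)", text)

# Shy (non-capturing) groups: only the overall match is recorded.
shy = re.match(r"(?:a|ab)(?:c|bcd)(?:d*)", text)

print(capturing.group(0), capturing.groups())  # abcd ('a', 'bcd', '')
print(shy.group(0), shy.groups())              # abcd ()
```

The text could equally be split as "ab" + "c" + "d"; with capturing
groups that difference is visible to the caller, with shy groups it is not.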
But even tho your regexp doesn't use backrefs, and even if you replace
all \(..\) with \(?:..\), your regexp will still cause problems because
our regexp engine does not try to optimize these kinds of cases.
So you have to do it by hand.
>> If the former, how could I rewrite the regexp so that it would not hit
>> these problems?
Maybe something like:
/\*\(<insidecomment>\)*\*+/
where <insidecomment> is something like
[^'*]\|\*+\([^/'*]\|'<afterquote>\)\|'<afterquote>
where <afterquote> is something like
\([^'*]\|\*+[^/'*]\)*'
Tho this will still push a backtrack point for every character.
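As a rough way to check how the pieces above fit together, here is a
sketch that assembles them in Python's `re` syntax (an assumption: \(..\)
becomes (?:..) and \| becomes |; the names AFTERQUOTE, INSIDE, COMMENT_V1
and match_comment_v1 are made up for this example). It only shows which
strings match; it says nothing about how many backtrack points the Emacs
engine pushes.

```python
import re

# <afterquote>: the rest of a quoted stretch, up to and including the
# closing '. A run of stars must be followed by something other than
# /, ' or another star.
AFTERQUOTE = r"(?:[^'*]|\*+[^/'*])*'"

# <insidecomment>: an ordinary character, or a run of stars followed by an
# ordinary character or a quoted stretch, or a quoted stretch on its own.
INSIDE = r"[^'*]|\*+(?:[^/'*]|'" + AFTERQUOTE + r")|'" + AFTERQUOTE

# The whole comment: opener, any number of inside steps, then stars and /.
COMMENT_V1 = re.compile(r"/\*(?:" + INSIDE + r")*\*+/")


def match_comment_v1(text):
    """Return the comment matched at the start of text, or None."""
    m = COMMENT_V1.match(text)
    return m.group(0) if m else None


print(match_comment_v1("/* hello */ rest"))
print(match_comment_v1("/* it's a * test's */"))
print(match_comment_v1("/* don't */"))
```

The first two calls return the comment text; the third returns None,
because the quote opened by ' is never closed before the end of the
comment.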
Maybe better would be something like
/\*[^'*]*\(<insidecomment>\)*\*+/
where <insidecomment> is something like
\(\*+[^/'*]\|\**'<afterquote>\)[^'*]*
where <afterquote> is still something like
\([^'*]\|\*+[^/'*]\)*'
so that we should only push a backtrack point when we see a * or a ' in
the comment.
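The second variant can be assembled the same way. Again this is only a
sketch in Python's `re` syntax (an assumption: \(..\) becomes (?:..) and
\| becomes |; AFTERQUOTE, INSIDE_V2, COMMENT_V2 and match_comment_v2 are
hypothetical names). The point of this shape is that runs of ordinary
characters are consumed by [^'*]* in one go, so the outer loop only
iterates at a * or a '. The sketch checks the matching behaviour only,
not the engine's internal cost.

```python
import re

# <afterquote>, unchanged from the first variant: the rest of a quoted
# stretch, up to and including the closing '.
AFTERQUOTE = r"(?:[^'*]|\*+[^/'*])*'"

# <insidecomment>: a "special" step (stars followed by an ordinary
# character, or optional stars followed by a quoted stretch), then a whole
# run of ordinary characters.
INSIDE_V2 = r"(?:\*+[^/'*]|\**'" + AFTERQUOTE + r")[^'*]*"

# Opener, a leading run of ordinary characters, inside steps, closer.
COMMENT_V2 = re.compile(r"/\*[^'*]*(?:" + INSIDE_V2 + r")*\*+/")


def match_comment_v2(text):
    """Return the comment matched at the start of text, or None."""
    m = COMMENT_V2.match(text)
    return m.group(0) if m else None


for sample in ["/* hello */ rest", "/** doc **/", "/* it's a * test's */",
               "/* a *'b' */", "/* don't */"]:
    print(repr(sample), "->", repr(match_comment_v2(sample)))
```

On these samples the result is the same as for the simpler variant: every
comment is matched in full except the one with the unclosed quote.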
Stefan