From: Stefan Kangas <stefan@marxist.se>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: Alan Mackenzie <acm@muc.de>,
	Andreas Schwab <schwab@linux-m68k.org>,
	18577@debbugs.gnu.org
Subject: bug#18577: Regexp I-search: [(error Stack overflow in regexp matcher)]
Date: Fri, 22 Oct 2021 19:47:53 -0700
Message-ID: <CADwFkmkseWGJ68Xbk1t_Me8gBP5PjiueivCe-XGw+2t1zu4Fxg@mail.gmail.com>
In-Reply-To: <jwv38bbsme9.fsf-monnier+emacsbugs@gnu.org> (Stefan Monnier's message of "Sun, 28 Sep 2014 13:35:30 -0400")

Stefan Monnier <monnier@IRO.UMontreal.CA> writes:

>>> Is this a defect in my regexp or in the regexp engine?
>> It is fundamental to the way regexp matching works.
>
> To clarify: it is fundamental to the way *our* regexp engine works.
>
> As long as the regexp doesn't use backrefs, it can be matched
> efficiently, without backtracking.  Of course using \(..\) (as opposed
> to using \(?:..\)) can also make the problem harder since the various
> different (but largely equivalent) ways to match might need to be
> distinguishable via match-data.
>
> But even tho your regexp doesn't use backrefs, and even if you replace
> all \(..\) with \(?:..\), your regexp will still cause problems because
> our regexp engine does not try to optimize these kinds of cases.
>
> So you have to do it by hand.
>
>>> If the former, how could I rewrite the regexp so that it would not hit
>>> these problems?
>
> Maybe something like:
>
> /\*\(<insidecomment>\)*\*+/
>
> where <insidecomment> is something like
>
>    [^'*]\|\*+\([^/'*]\|'<afterquote>\)\|'<afterquote>
>
> where <afterquote> is something like
>
>    \([^'*]\|\*+[^/'*]\)*'
>
> Tho this will still push a backtrack point for every character.
> Maybe better would be something like
>
> /\*[^'*]*\(<insidecomment>\)*\*+/
>
> where <insidecomment> is something like
>
>    \(\*+[^/'*]\|\**'<afterquote>\)[^'*]*
>
> where <afterquote> is still something like
>
>    \([^'*]\|\*+[^/'*]\)*'
>
> so that we should only push a backtrack point when we see a * or a ' in
> the comment.

Should we do anything about this, like document it in etc/PROBLEMS, or
should this bug just be closed?
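
For illustration, the second suggestion quoted above can be assembled from its parts as follows. This is only a sketch: it is written in Python's `re` syntax rather than Emacs's (Emacs spells groups as \(?:..\) and alternation as \|), purely so it can be run outside Emacs. The names AFTERQUOTE, INSIDECOMMENT, COMMENT and find_comment are hypothetical and do not come from the thread. The pattern is the "something like" proposal as quoted, not a verified matcher for every quoting case.

```python
import re

# Hypothetical names; each piece mirrors one fragment from the quoted message,
# with all groups made non-capturing, i.e. \(?:..\) in Emacs syntax.

# <afterquote>: text up to and including the closing quote.
AFTERQUOTE = r"(?:[^'*]|\*+[^/'*])*'"

# <insidecomment>: entered only at a '*' or a "'", then consumes the
# following run of ordinary characters in one step via [^'*]*.
INSIDECOMMENT = r"(?:\*+[^/'*]|\**'" + AFTERQUOTE + r")[^'*]*"

# Whole comment: opener, a run of ordinary characters, zero or more
# <insidecomment> pieces, then the closing run of stars and a slash.
COMMENT = re.compile(r"/\*[^'*]*(?:" + INSIDECOMMENT + r")*\*+/")


def find_comment(text):
    """Return the first substring matched by COMMENT, or None."""
    m = COMMENT.search(text)
    return m.group(0) if m else None
```

Because the outer loop is only entered at a `*` or a `'`, a long run of ordinary comment text is consumed by a single `[^'*]*` rather than one loop iteration per character, which is the point of the rewrite. How much this helps in practice depends on the engine; the sketch makes no claim about Emacs's matcher beyond what the quoted message says.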




