From: Alan Mackenzie <acm@muc.de>
To: Marcin Borkowski <mbork@amu.edu.pl>
Cc: 19873@debbugs.gnu.org
Subject: bug#19873: Ill-formed regular expression is constructed in forward-paragraph.
Date: Thu, 9 Mar 2017 21:04:45 +0000
Message-ID: <20170309210445.GB4046@acm>
In-Reply-To: <87o9xodhq4.fsf@jane>
Hello, Marcin.
On Sun, Feb 26, 2017 at 17:44:51 +0100, Marcin Borkowski wrote:
> On 2015-02-15, at 10:31, Alan Mackenzie <acm@muc.de> wrote:
> > Hello, Emacs!
> > In forward-paragraph, L37, a regular expression is constructed as
> > follows:
> > (let* ...
> > (sp-parstart (concat "^[ \t]*\\(?:" parstart "\\|" parsep "\\)"))
> > ...)
> > . Here parstart and parsep are, more or less,
> > paragraph-{start,separate}.
> > The problem is that parstart and parsep themselves are likely to begin
> > with "[ \t]*" (the default values certainly do), so we have two
> > consecutive matchers for an arbitrary amount of whitespace. This causes
> > the regexp engine to run very slowly when a line starts with lots of WS
> > but doesn't match.
> > This problem seems to be the cause of bug # 19846 (where holding down the
> > spacebar inside a C comment causes Emacs to seize up when auto-fill mode
> > is enabled).
> Hi Alan, hi all,
> I put this bug on my todo-list some time ago and decided now to revisit
> it.
> I'm wondering what could be done about it. First of all, my Emacs has
> this as paragraph-start:
> "\f\\|[ \t]*$"
> and this as paragraph-separate:
> "[ \t\f]*$"
> and frankly speaking, I'm not sure why they differ at all (by default).
> Also, even though forward-paragraph checks for "^" at their beginning,
> they actually don't begin with that character (again, by default).
> My first thought is to add a check whether paragraph-start and
> paragraph-separate match something like
> "^\\^?\\[[[:space:]]+\\][+*]?"
> and if yes, make parstart/parsep equal to them, but without the matching
> part.
> WDYT?
My first reaction is "This is a good idea, but be very careful!". For
example, if paragraph-start and/or paragraph-separate begins with
"[ \t]+" (i.e. the paragraph start requires whitespace at BOL), you will
lose that requirement by removing matches of
"^\\^?\\[[[:space:]]+\\][+*]?" from them.
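To make the gotcha concrete, here is a minimal sketch. The helper names
my-strip-leading-ws and my-wrap and the sample value of parstart are
made up for illustration; they are not part of Emacs. The stripping
pattern is the one proposed above, anchored with "\\`" so that it can
only match at the very start of the string.

```elisp
;; Hypothetical helper (not in Emacs): drop a leading whitespace bracket
;; expression, together with any "+" or "*" that follows it.
(defun my-strip-leading-ws (regexp)
  (if (string-match "\\`\\^?\\[[[:space:]]+\\][+*]?" regexp)
      (substring regexp (match-end 0))
    regexp))

;; Hypothetical helper: wrap one component the way the regexp in
;; forward-paragraph is put together.
(defun my-wrap (component)
  (concat "^[ \t]*\\(?:" component "\\)"))

;; Illustrative value only: a paragraph starts with at least one space
;; or tab followed by "*".
(let ((parstart "[ \t]+\\*"))
  (list (string-match-p (my-wrap parstart) "* item")
        (string-match-p (my-wrap (my-strip-leading-ws parstart))
                        "* item")))
```

The first element of the result is nil, because the unmodified regexp
still demands whitespace before the "*". The second is 0, because the
stripped regexp now accepts a "*" in column 0, which the original value
never did.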
I think this idea is workable, but you'll have to check for one or both
of paragraph-s{tart,eparate} starting with "[ \t]+". A good strategy
here might be to begin the target regexp with "^[ \t]*", then begin one
or both components with "[ \t]" (without the "*").
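A rough sketch of that strategy follows. The function names are
hypothetical, and it assumes the leading matcher is spelled exactly
"[ \t]*" or "[ \t]+"; any other spelling is passed through untouched.
It is meant only to show the shape of the transformation, not to be a
finished patch.

```elisp
;; Hypothetical helper (not in Emacs).  Only the two literal prefixes
;; "[ \t]*" and "[ \t]+" are recognized; anything else is left alone.
(defun my-normalize-leading-ws (regexp)
  (cond
   ;; Optional whitespace: the outer "^[ \t]*" already covers it.
   ((string-prefix-p "[ \t]*" regexp)
    (substring regexp (length "[ \t]*")))
   ;; Required whitespace: keep exactly one mandatory character.
   ((string-prefix-p "[ \t]+" regexp)
    (concat "[ \t]" (substring regexp (length "[ \t]+"))))
   (t regexp)))

;; Hypothetical replacement for the `concat' form in forward-paragraph.
(defun my-sp-parstart (parstart parsep)
  (concat "^[ \t]*\\(?:"
          (my-normalize-leading-ws parstart)
          "\\|"
          (my-normalize-leading-ws parsep)
          "\\)"))

;; Illustrative values only.
(my-sp-parstart "[ \t]+\\*" "[ \t]*$")
```

With these sample values the result is "^[ \t]*\\(?:[ \t]\\*\\|$\\)",
which still rejects "* item" but accepts "  * item" and a blank line.
The sketch does not look inside alternatives, so a value whose
whitespace matcher comes after a "\\|" is left unchanged.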
There may be other gotchas which I haven't thought about yet.
One needs a twisted mind to do this sort of thing properly, so I offer my
services to review your upcoming patch. ;-)
> --
> Marcin Borkowski
> http://octd.wmi.amu.edu.pl/en/Marcin_Borkowski
> Faculty of Mathematics and Computer Science
> Adam Mickiewicz University
--
Alan Mackenzie (Nuremberg, Germany).