From: Nathan Trapuzzano <nbtrap@nbtrap.com>
To: 11216@debbugs.gnu.org
Subject: bug#11216: 23.4; parenthesis matching breaks on certain complex expressions
Date: Tue, 10 Apr 2012 20:18:47 -0400
Message-ID: <20120410201847.38e2af1f@nbtrap.com>
[-- Attachment #1: Type: text/plain, Size: 2944 bytes --]
Here's a complex regular expression that breaks parenthesis matching
(and yes, it's a real regular expression generated by a real Perl
program).
M[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*H[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*\=[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*N[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*I[\=\/\\]?[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*N(?![\x21\x27\x2a\x2d\x2f\x3d\x41-\x5a\x5c\x61-\x7a\x7c]) (?<!S\d)(?<!\-\ [@"]\d\ [\x80-\xff])(?<!\-[\x80-\xff][\x80-\xff])(?<!\-[\x80-\xff])(?<![\x27-\x29\x2f\x3d\x41-\x5a\x7c]\*)(?<![\x27-\x29\x2f\x3d\x41-\x5a\x7c])(?:
A\)[\/\\]|\*\)[\/\\]A[\=\/\\]?)[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*E[\=\/\\]?[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*I[\=\/\\]?[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*D[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*E[\=\/\\]?(?![\x21\x27\x2a\x2d\x2f\x3d\x41-\x5a\x5c\x61-\x7a\x7c]) (?<!S\d)(?<!\-\ [@"]\d\ [\x80-\xff])(?<!\-[\x80-\xff][\x80-\xff])(?<!\-[\x80-\xff])(?<![\x27-\x29\x2f\x3d\x41-\x5a
\x7c]\*)(?<![\x27-\x29\x2f\x3d\x41-\x5a\x7c])Q[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*E[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*A[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*[\/\\]
Matching breaks at the open parenthesis immediately following the
first (?<!S\d). I suspect the cause is the double quote in [@"], about
ten characters later.
I noticed that the behavior of show-paren-mode depends on the major
mode. The behavior described above occurs in Fundamental mode, whereas
in Text mode quotation marks are ignored. However, Text mode also makes
paren matching ignore backslashes, so escaped parentheses and brackets
are no longer recognized as escaped. I think the best fix would be to
make show-paren-mode customizable so that the user can specify which
characters should be ignored when matching parentheses.
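
To make the three behaviors concrete, here is a minimal Python sketch of a
bracket matcher whose treatment of quotes and backslashes is configurable.
It is only an illustration of the idea, not how Emacs or show-paren-mode is
implemented; the function name and the `string_delims` and `escape_char`
parameters are hypothetical, and the two sample strings are short fragments
taken from the regexp above.

```python
def find_matching_close(text, open_pos, string_delims='"', escape_char="\\"):
    """Return the index of the bracket that closes text[open_pos], or None.

    string_delims: characters that open and close a string; brackets
                   inside a string are skipped (hypothetical parameter).
    escape_char:   character that makes the next character literal, or
                   None to treat backslashes as ordinary text.
    """
    pairs = {"(": ")", "[": "]", "{": "}"}
    closers = set(pairs.values())
    if text[open_pos] not in pairs:
        raise ValueError("open_pos must point at an opening bracket")

    expected = []      # stack of closing brackets still owed
    in_string = None   # the delimiter of the string we are inside, if any
    i = open_pos
    while i < len(text):
        ch = text[i]
        if escape_char is not None and ch == escape_char:
            i += 2     # skip the escape character and the escaped character
            continue
        if in_string is not None:
            if ch == in_string:
                in_string = None
        elif ch in string_delims:
            in_string = ch
        elif ch in pairs:
            expected.append(pairs[ch])
        elif ch in closers:
            if not expected or expected.pop() != ch:
                return None          # mismatched bracket
            if not expected:
                return i             # closed the bracket we started from
        i += 1
    return None                      # ran off the end without a match


quoted = r'(?<!\-\ [@"]\d\ [\x80-\xff])'          # contains an unpaired "
escaped = r'(?:A\)[\/\\]|\*\)[\/\\]A[\=\/\\]?)'   # contains escaped \)

# Quotes delimit strings, backslash escapes (like the Fundamental mode case):
print(find_matching_close(quoted, 0))                                      # None
# Quotes and backslashes both ignored (like the Text mode case):
print(find_matching_close(escaped, 0, string_delims="", escape_char=None))  # 5
# Quotes ignored, backslash still escapes (the requested behavior):
print(find_matching_close(quoted, 0, string_delims=""))                    # 27
print(find_matching_close(escaped, 0, string_delims=""))                   # 33
```

With quotes treated as string delimiters, the unpaired `"` swallows the rest
of the fragment and no match is found. With backslashes ignored, the escaped
`\)` at index 5 is wrongly taken as the match. Only the third configuration
matches both fragments at their final character.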
I've also attached a file containing the regexp in question, in case
the long lines get broken up in mail transmission.
[-- Attachment #2: regexp.txt --]
[-- Type: text/plain, Size: 1996 bytes --]
M[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*H[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*\=[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*N[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*I[\=\/\\]?[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*N(?![\x21\x27\x2a\x2d\x2f\x3d\x41-\x5a\x5c\x61-\x7a\x7c]) (?<!S\d)(?<!\-\ [@"]\d\ [\x80-\xff])(?<!\-[\x80-\xff][\x80-\xff])(?<!\-[\x80-\xff])(?<![\x27-\x29\x2f\x3d\x41-\x5a\x7c]\*)(?<![\x27-\x29\x2f\x3d\x41-\x5a\x7c])(?:
A\)[\/\\]|\*\)[\/\\]A[\=\/\\]?)[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*E[\=\/\\]?[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*I[\=\/\\]?[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*D[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*E[\=\/\\]?(?![\x21\x27\x2a\x2d\x2f\x3d\x41-\x5a\x5c\x61-\x7a\x7c]) (?<!S\d)(?<!\-\ [@"]\d\ [\x80-\xff])(?<!\-[\x80-\xff][\x80-\xff])(?<!\-[\x80-\xff])(?<![\x27-\x29\x2f\x3d\x41-\x5a
\x7c]\*)(?<![\x27-\x29\x2f\x3d\x41-\x5a\x7c])Q[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*E[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*A[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*[\/\\]