From: Yuri Khan <yuri.v.khan@gmail.com>
To: tomas@tuxteam.de
Cc: emacs-tangents@gnu.org, steve-humphreys@gmx.com
Subject: Negating a regexp
Date: Thu, 20 May 2021 17:29:50 +0700
Message-ID: <CAP_d_8UJ5xwPG9=jQyjK=vz5vVRYuA6pvPna+uVg5PvjvgBvAQ@mail.gmail.com>
In-Reply-To: <20210520095603.GD1127@tuxteam.de>
> > How could I negate the regexp that I have defined?
>
> I don't even know what you mean by "negating a regexp".
Automata theory, where the notion of a regexp comes from, defines
a “regular set” as one that can be described using a regexp (where
regexps are defined to be implicitly anchored to beginning and end of
string, and support concatenation, alternation, and iteration, but not
backreferences). It then proves that for each regexp there exists an
equivalent nondeterministic finite automaton, and for every NDFA there
exists an equivalent DFA, and for every DFA there exists an equivalent
regexp. It also proves that the class of regular sets is closed under
the set-theoretic operations of union, intersection, and complement.
It follows that, theoretically, a regexp ‘R’ can be negated: one can
construct a regexp ‘(not R)’ that matches exactly the strings that R
does not match.
However, the construction goes from regexp to DFA and back, and each
conversion can blow up the size of the result, so in practice such a
regexp would be unreadable.
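To make the complement step concrete, here is a minimal sketch in
Python (not Emacs Lisp), assuming a hand-written complete DFA for the
anchored regexp ‘a’ over a toy alphabet {a, b}. The state names and the
helper `accepts` are made up for illustration. Complementing a complete
DFA only swaps accepting and non-accepting states; the hard part, not
shown here, is converting the result back into a regexp.

```python
# Complete DFA for the anchored regexp 'a' over the toy alphabet {a, b}.
# State names are hypothetical; "dead" is the trap state.
STATES = {"start", "seen_a", "dead"}
TRANSITIONS = {
    ("start", "a"): "seen_a",
    ("start", "b"): "dead",
    ("seen_a", "a"): "dead",
    ("seen_a", "b"): "dead",
    ("dead", "a"): "dead",
    ("dead", "b"): "dead",
}
ACCEPTING = {"seen_a"}

# Complement: same states and transitions, accepting set flipped.
COMPLEMENT_ACCEPTING = STATES - ACCEPTING


def accepts(string, accepting):
    """Run the DFA on `string`; report whether it ends in `accepting`."""
    state = "start"
    for char in string:
        state = TRANSITIONS[(state, char)]
    return state in accepting


for s in ["", "a", "b", "ab", "aa"]:
    # Exactly one of the two automata accepts each string.
    print(repr(s), accepts(s, ACCEPTING), accepts(s, COMPLEMENT_ACCEPTING))
```

The flip is only valid because the DFA is complete (every state has a
transition for every symbol); on an NDFA the same trick does not
produce the complement.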
As an example, consider the regexp ‘a’ which matches only a single
one-character string. Its complement would be a regexp that matches
the empty string, all one-character strings whose character is not
‘a’, and all strings longer than one character. The shortest way to
express that idea that I can think of is ‘|[^a]|.{2,}’ and that’s
ignoring the issue of newlines.
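One way to gain some confidence in a hand-written complement is to
brute-force it over short strings. The sketch below uses Python's `re`
module rather than Emacs regexps, so the syntax differs slightly.
`fullmatch` stands in for the implicit anchoring, `re.DOTALL` is one way
around the newline issue, and the helper names and the test alphabet
are assumptions for illustration.

```python
import itertools
import re

ORIGINAL = re.compile(r"a")
# re.DOTALL makes '.' match newlines as well.
COMPLEMENT = re.compile(r"|[^a]|.{2,}", re.DOTALL)


def all_strings(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len."""
    for length in range(max_len + 1):
        for chars in itertools.product(alphabet, repeat=length):
            yield "".join(chars)


def disagreements(alphabet="ab\n", max_len=4):
    """Return strings matched by both regexps or by neither."""
    return [
        s
        for s in all_strings(alphabet, max_len)
        if bool(ORIGINAL.fullmatch(s)) == bool(COMPLEMENT.fullmatch(s))
    ]


# An empty list means the two regexps partition all tested strings.
print(disagreements())
```

This is a bounded check, not a proof: it only covers strings up to the
chosen length over the chosen alphabet.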
So, to steve-humphreys: There is no practical general way to negate a
regexp. You need to either negate the result of attempting to match,
or to think hard and write a new regexp that matches what you want.
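Negating the result of the match is usually a one-liner. A minimal
sketch, again in Python rather than Emacs Lisp, with a made-up helper
name and sample data:

```python
import re


def lines_not_matching(pattern, lines):
    """Keep the lines in which `pattern` does not match anywhere."""
    regexp = re.compile(pattern)
    return [line for line in lines if regexp.search(line) is None]


sample = ["foo.el", "README", "bar.elc", "init.el"]
# Keeps only the names that do not end in ".el".
print(lines_not_matching(r"\.el$", sample))
```

In Emacs Lisp the same idea amounts to wrapping the match call in
`not`, for example around `string-match-p`.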