unofficial mirror of emacs-tangents@gnu.org
 help / color / mirror / Atom feed
From: Yuri Khan <yuri.v.khan@gmail.com>
To: tomas@tuxteam.de
Cc: emacs-tangents@gnu.org, steve-humphreys@gmx.com
Subject: Negating a regexp
Date: Thu, 20 May 2021 17:29:50 +0700	[thread overview]
Message-ID: <CAP_d_8UJ5xwPG9=jQyjK=vz5vVRYuA6pvPna+uVg5PvjvgBvAQ@mail.gmail.com> (raw)
In-Reply-To: <20210520095603.GD1127@tuxteam.de>

> > How could I negate the regexp that I have defined?
>
> I don't even know what you mean by "negating a regexp".

The automata theory, where the notion of a regexp comes from, defines
a “regular set” as one that can be described using a regexp (where
regexps are defined to be implicitly anchored to beginning and end of
string, and support concatenation, alternative, and iteration, but not
backreferences). It then proves that for each regexp there exists an
equivalent nondeterministic finite automaton, and for every NDFA there
exists an equivalent DFA, and for every DFA there exists an equivalent
regexp. It also proves that the class of regular sets is closed under
set theory operations — union, intersection, and complement.

It follows that, theoretically, a regexp ‘R’ can be negated — one can
construct a regexp ‘(not R)’ that matches exactly all strings that R
does not match.

However, the proofs and constructions are complex enough that in
practice such a regexp would be unreadable.

As an example, consider the regexp ‘a’ which matches only a single
one-character string. Its complement would be a regexp that matches
the empty string, all one-character strings whose character is not
‘a’, and all strings longer than one character. The shortest way to
express that idea that I can think of is ‘|[^a]|.{2,}’ and that’s
ignoring the issue of newlines.

So, to steve-humphreys: There is no practical general way to negate a
regexp. You need to either negate the result of attempting to match,
or to think hard and write a new regexp that matches what you want.



       reply	other threads:[~2021-05-20 10:29 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <trinity-0830bc3f-4c8b-426b-a877-e77dd5fb4176-1621441589393@3c-app-mailcom-bs08>
     [not found] ` <CAP_d_8VjkuKz90t4CH9OJcbNSRNdrkZhMDctsLEMGqd89y704g@mail.gmail.com>
     [not found]   ` <trinity-d5767d58-d51a-4ffb-adff-45a12b89a8c8-1621445950512@3c-app-mailcom-bs08>
     [not found]     ` <CAP_d_8WVnXJwjROg_9Uw19zcMNz5UgCSBvLMtMWP1R0LnzaF9Q@mail.gmail.com>
     [not found]       ` <20210519213207.GD4855@tuxteam.de>
     [not found]         ` <SA2PR10MB4474FF28E4D683171BC549DAF32B9@SA2PR10MB4474.namprd10.prod.outlook.com>
     [not found]           ` <trinity-dde94f00-fc0f-4a83-83a9-7f2182b9bc30-1621491999210@3c-app-mailcom-bs08>
     [not found]             ` <trinity-227e877b-c66e-4b57-859f-e22f03a48448-1621497598973@3c-app-mailcom-bs16>
     [not found]               ` <20210520082613.GC1127@tuxteam.de>
     [not found]                 ` <trinity-b7e362ca-6469-4101-9a28-319c383d8d25-1621503778439@3c-app-mailcom-bs16>
     [not found]                   ` <20210520095603.GD1127@tuxteam.de>
2021-05-20 10:29                     ` Yuri Khan [this message]
2021-05-20 10:39                       ` Negating a regexp tomas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAP_d_8UJ5xwPG9=jQyjK=vz5vVRYuA6pvPna+uVg5PvjvgBvAQ@mail.gmail.com' \
    --to=yuri.v.khan@gmail.com \
    --cc=emacs-tangents@gnu.org \
    --cc=steve-humphreys@gmx.com \
    --cc=tomas@tuxteam.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).