* Re: Regex negative lookahead failed
2021-04-24 10:57 ` David Bremner
@ 2021-04-24 12:34 ` Michael J Gruber
2021-04-24 16:03 ` David Bremner
2021-04-24 16:52 ` Erwan Hingant
1 sibling, 1 reply; 5+ messages in thread
From: Michael J Gruber @ 2021-04-24 12:34 UTC (permalink / raw)
To: David Bremner, Erwan Hingant, notmuch
David Bremner venit, vidit, dixit 2021-04-24 12:57:17:
> Erwan Hingant <erwan.hingant@mailo.com> writes:
>
> > Hello,
> >
> > I have some trouble with regex negative lookahead. When searching in
> > all directories but one (say inbox), I do the following query:
> >
> > > notmuch search folder:"/^(?!inbox)/"
> >
>
> As far as I know, lookaheads are not supported by POSIX regex, which is
> what notmuch uses. regex(7) is a bit terse, but the relevant section
> seems to be
>
> An atom is a regular expression enclosed in "()" (matching a match for
> the regular expression), an empty set of "()" (matching the null
> string)(!), a bracket expression (see below), '.' (matching any single
> character), '^' (matching the null string at the beginning of a line),
> '$' (matching the null string at the end of a line), a '\' followed by
> one of the characters "^.[$()|*+?{\" (matching that character taken as
> an ordinary character), a '\' followed by any other character(!)
> (matching that character taken as an ordinary character, as if the '\'
> had not been present(!)), or a single character with no other
> significance (matching that character). A '{' followed by a character other
> than a digit is an ordinary character, not the beginning of a bound(!).
> It is illegal to end an RE with '\'.
>
> I'm not sure what your whole problem is, but maybe regex is not the right answer
> here. In general they should be a last resort in notmuch, for efficiency
> reasons. Does "not folder:inbox" do what you want?
Yes, that solves the problem as stated (all folders except inbox).
As for the error message: both "^" and "(?!inbox)" are assertions in
Perl-style regex lingo, not atoms. A negative lookahead like "(?!inbox)"
matches no characters itself; it succeeds at the current position only if
"inbox" does not match starting there. POSIX regex has no such construct,
so "(?" is read as an opening parenthesis followed by the repetition
operator "?".
The error message seems to say that this "?" has no preceding atom that
it applies to.
But assertions are really for the case where you want to base a match
result (true/false) on parts of the expression that you do not want to
be part of the match itself. As far as I understand, matching in xapian
is solely about the true/false result, so I'm wondering when you would
really need lookahead and lookbehind (instead of groups).
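To make the true/false point concrete, here is a minimal sketch. It uses Python's re module purely for illustration, because it accepts Perl-style lookahead while POSIX regex does not; the folder names are made up and nothing here is notmuch or xapian code. It shows that, for a yes/no decision, the anchored negative lookahead gives the same answer as negating a plain positive match:

```python
import re

# Hypothetical folder names, for illustration only.
folders = ["inbox", "inbox/lists", "archive", "sent", "work/inbox"]

# Perl-style negative lookahead: succeeds at the start of the string
# only if "inbox" does not match there. Not available in POSIX regex.
lookahead = re.compile(r"^(?!inbox)")

# Plain positive pattern; the negation is applied outside the regex.
positive = re.compile(r"^inbox")

via_lookahead = [f for f in folders if lookahead.search(f)]
via_negation = [f for f in folders if not positive.search(f)]

print(via_lookahead)
print(via_negation)
```

Both lists come out identical, which is why a boolean "not" in the query can stand in for the lookahead. One caveat, as far as I can tell from notmuch-search-terms(7): folder:inbox matches the folder named exactly "inbox", so "not folder:inbox" is not quite the same as excluding every folder whose name merely starts with "inbox".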
Michael
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Regex negative lookahead failed
2021-04-24 10:57 ` David Bremner
2021-04-24 12:34 ` Michael J Gruber
@ 2021-04-24 16:52 ` Erwan Hingant
1 sibling, 0 replies; 5+ messages in thread
From: Erwan Hingant @ 2021-04-24 16:52 UTC (permalink / raw)
To: David Bremner, notmuch
Hello,
Thanks a lot, this makes everything clear, and notmuch-search-terms(7) does
point to the POSIX regex version you mention. My mistake. By the way, you
completely solved my problem, which of course does not need a regular
expression as I thought...
Best,
Erwan.