unofficial mirror of notmuch@notmuchmail.org
* Regex negative lookahead failed
From: Erwan Hingant @ 2021-04-24  4:32 UTC
  To: notmuch

Hello,

  I am having some trouble with regex negative lookahead. To search in
all directories but one (say inbox), I run the following query:

 > notmuch search folder:"/^(?!inbox)/"

and I receive the following error:

     notmuch search: A Xapian exception occurred
     A Xapian exception occurred parsing query: Regexp error: Invalid preceding regular expression
     Query string was: folder:"/^(?!inbox)/"

Best,

Erwan.



* Re: Regex negative lookahead failed
From: David Bremner @ 2021-04-24 10:57 UTC
  To: Erwan Hingant, notmuch

Erwan Hingant <erwan.hingant@mailo.com> writes:

> Hello,
>
>   I am having some trouble with regex negative lookahead. To search in
> all directories but one (say inbox), I run the following query:
>
>  > notmuch search folder:"/^(?!inbox)/"
>

As far as I know, lookaheads are not supported by POSIX regex, which is
what notmuch uses. regex(7) is a bit terse, but the relevant section
seems to be

       An atom is a regular expression enclosed in "()" (matching a match  for
       the  regular  expression),  an  empty  set  of  "()" (matching the null
       string)(!), a bracket expression (see below), '.' (matching any  single
       character),  '^' (matching the null string at the beginning of a line),
       '$' (matching the null string at the end of a line), a '\' followed  by
       one  of the characters "^.[$()|*+?{\" (matching that character taken as
       an ordinary character),  a  '\'  followed  by  any  other  character(!)
       (matching  that character taken as an ordinary character, as if the '\'
       had not been present(!)), or a single character with no other  signifi‐
       cance  (matching  that character).  A '{' followed by a character other
       than a digit is an ordinary character, not the beginning of a bound(!).
       It is illegal to end an RE with '\'.
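
A minimal sketch of how one might check this against the C library's own
POSIX regex compiler, written in Python with ctypes. The helper name
posix_compile_error, the oversized 512-byte stand-in for regex_t, and
REG_EXTENDED = 1 are assumptions made for illustration (the value matches
the glibc, musl and BSD headers as far as I know). It is not a copy of how
notmuch itself calls regcomp(3).

```python
import ctypes
import ctypes.util

# Assumption: REG_EXTENDED is 1, as in the glibc, musl and BSD <regex.h>.
REG_EXTENDED = 1

# Load the C library; CDLL(None) falls back to the already-loaded symbols.
libc = ctypes.CDLL(ctypes.util.find_library("c"))
libc.regcomp.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_int]
libc.regcomp.restype = ctypes.c_int
libc.regerror.argtypes = [ctypes.c_int, ctypes.c_void_p,
                          ctypes.c_char_p, ctypes.c_size_t]
libc.regerror.restype = ctypes.c_size_t
libc.regfree.argtypes = [ctypes.c_void_p]
libc.regfree.restype = None


def posix_compile_error(pattern):
    """Return None if pattern compiles as a POSIX ERE, else the libc error text."""
    preg = ctypes.create_string_buffer(512)  # generous stand-in for regex_t
    rc = libc.regcomp(preg, pattern.encode(), REG_EXTENDED)
    if rc == 0:
        libc.regfree(preg)  # release the compiled pattern
        return None
    msg = ctypes.create_string_buffer(256)
    libc.regerror(rc, preg, msg, len(msg))
    return msg.value.decode()


for pattern in ("^inbox", "^(?!inbox)"):
    print(repr(pattern), "->", posix_compile_error(pattern) or "compiles")
```

On a glibc system I would expect the second pattern to be rejected with
the same "Invalid preceding regular expression" text as in the report,
but the exact wording depends on the C library.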

I'm not sure what your whole problem is, but maybe regex is not the right
answer here. In general, regexes should be a last resort in notmuch, for
efficiency reasons. Does "not folder:inbox" do what you want?
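
For scripting, the boolean form can be sketched as below. The helper name
exclude_folder_args is hypothetical, and the query is only run when a
notmuch binary is found on the PATH (it also needs a configured database).

```python
import shutil
import subprocess


def exclude_folder_args(folder):
    """Build the argv for: notmuch search not folder:"<folder>"."""
    # The double quotes keep a folder name with spaces as a single term.
    return ["notmuch", "search", "not", f'folder:"{folder}"']


args = exclude_folder_args("inbox")
print(" ".join(args))

# Only attempt the search when notmuch is installed.
if shutil.which("notmuch"):
    result = subprocess.run(args, capture_output=True, text=True)
    print(result.stdout or result.stderr)
```

Passing the arguments as a list avoids a second layer of shell quoting
around the query.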


* Re: Regex negative lookahead failed
From: Michael J Gruber @ 2021-04-24 12:34 UTC
  To: David Bremner, Erwan Hingant, notmuch

David Bremner venit, vidit, dixit 2021-04-24 12:57:17:
> Erwan Hingant <erwan.hingant@mailo.com> writes:
> 
> > Hello,
> >
> >   I am having some trouble with regex negative lookahead. To search in
> > all directories but one (say inbox), I run the following query:
> >
> >  > notmuch search folder:"/^(?!inbox)/"
> >
> 
> As far as I know, lookaheads are not supported by POSIX regex, which is
> what notmuch uses. regex(7) is a bit terse, but the relevant section
> seems to be
> 
>        An atom is a regular expression enclosed in "()" (matching a match  for
>        the  regular  expression),  an  empty  set  of  "()" (matching the null
>        string)(!), a bracket expression (see below), '.' (matching any  single
>        character),  '^' (matching the null string at the beginning of a line),
>        '$' (matching the null string at the end of a line), a '\' followed  by
>        one  of the characters "^.[$()|*+?{\" (matching that character taken as
>        an ordinary character),  a  '\'  followed  by  any  other  character(!)
>        (matching  that character taken as an ordinary character, as if the '\'
>        had not been present(!)), or a single character with no other  signifi‐
>        cance  (matching  that character).  A '{' followed by a character other
>        than a digit is an ordinary character, not the beginning of a bound(!).
>        It is illegal to end an RE with '\'.
> 
> I'm not sure what your whole problem is, but maybe regex is not the right
> answer here. In general, regexes should be a last resort in notmuch, for
> efficiency reasons. Does "not folder:inbox" do what you want?

Yes, that solves the problem as stated (all folders except inbox).

As for the error message: both "^" and "(?!inbox)" are assertions in
regex lingo, not atoms. In engines that support it (Perl, PCRE), a
negative lookahead like "(?!inbox)" asserts that the text following the
current position does not match "inbox", without consuming any of it.

POSIX regex has no such syntax, so the "?" right after "(" is read as a
repetition operator with nothing before it to repeat. That seems to be
what "Invalid preceding regular expression" is saying.

But, assertions are really for the case where you want to base a match
result (true/false) on parts of the expression that you do not want to
be part of the match itself. As far as I understand, matching in xapian
is solely about the true/false result, so I'm wondering when you would
really need lookahead and lookbehind (instead of groups).
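
To illustrate the point about true/false results, here is a small sketch
in Python, whose re module does support Perl-style lookahead. The folder
names are made up. For a pure yes/no test, the negative lookahead selects
the same names as negating an ordinary match of "^inbox", and that plain
pattern is also valid POSIX.

```python
import re

# Made-up folder names, for illustration only.
folders = ["inbox", "inbox/lists", "archive", "sent", "work/inbox"]

lookahead = re.compile(r"^(?!inbox)")  # Perl-style; not valid POSIX ERE
plain = re.compile(r"^inbox")          # valid in both flavors

# Keep names where the lookahead pattern matches ...
by_lookahead = [f for f in folders if lookahead.search(f)]
# ... or, equivalently, names where the plain pattern does NOT match.
by_negation = [f for f in folders if not plain.search(f)]

print(by_lookahead)
print(by_negation)
```

Both lists come out the same, so the negation can live outside the regex,
in the boolean part of the query.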

Michael


* Re: Regex negative lookahead failed
From: David Bremner @ 2021-04-24 16:03 UTC
  To: Michael J Gruber, Erwan Hingant, notmuch

Michael J Gruber <git@grubix.eu> writes:

> But, assertions are really for the case where you want to base a match
> result (true/false) on parts of the expression that you do not want to
> be part of the match itself. As far as I understand, matching in xapian
> is solely about the true/false result, so I'm wondering when you would
> really need lookahead and lookbehind (instead of groups).

Pedantic correction, the regex support in notmuch is not provided by
Xapian, but something bolted on by notmuch. That explains some (but not
all!) of its quirks.

d


* Re: Regex negative lookahead failed
From: Erwan Hingant @ 2021-04-24 16:52 UTC
  To: David Bremner, notmuch


Hello,

 Thanks a lot, this makes everything clear, and notmuch-search-terms(7)
 does point to the POSIX regex flavor you mention. My mistake. By the
 way, you completely solved my problem, which of course does not need a
 regular expression as I had thought...

Best,
Erwan.

