unofficial mirror of notmuch@notmuchmail.org
* Apparently, terms with a common prefix are *not* connected by implicit "OR"
@ 2019-08-11 17:20 Jorge P. de Morais Neto
  2019-08-11 23:08 ` David Bremner
  0 siblings, 1 reply; 10+ messages in thread
From: Jorge P. de Morais Neto @ 2019-08-11 17:20 UTC (permalink / raw)
  To: notmuch

Hi.  The NOTMUCH-SEARCH-TERMS man page says:

    Each term in the query will be implicitly connected by a logical AND
    if no explicit operator is provided (except that terms with a common
    prefix will be implicitly combined with OR).

However, in practice I get different results:
    $ notmuch count '(to:pontodosconcursos.com.br OR to:jorge+cp+concurso@disroot.org)'
    66
    $ notmuch count '(to:pontodosconcursos.com.br to:jorge+cp+concurso@disroot.org)'
    0

I have other examples of this.  I currently use notmuch 0.29.1 privately
backported to Debian buster according to the Debian Wiki [procedure][],
but I already saw this problem in earlier notmuch releases, including
the official Debian buster package.  What gives?

[procedure]: https://wiki.debian.org/SimpleBackportCreation

My workaround is to always write the "OR" operator explicitly, as in:
    (to:concurso OR to:dominandoti OR to:cebraspe OR to:gleyson131 OR to:quadrix.org.br OR to:cathedranet.com.br OR from:cathedranet.com.br OR from:quadrix.org.br OR from:gleyson131 OR from:cebraspe OR from:dominandoti OR from:concurso OR is:lists/itaconcursos)

If the OR operator were indeed implicit, the query above would be
shorter.
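
For long lists like the one above, the explicit form can be generated
rather than typed.  A minimal sketch, assuming a POSIX shell; or_query
is a hypothetical helper name, not a notmuch command:

```shell
# or_query is a hypothetical helper, not part of notmuch: it joins its
# arguments with an explicit " OR " and wraps the result in parentheses.
or_query() (
    [ "$#" -gt 0 ] || exit 1      # nothing to join
    query=$1
    shift
    for term in "$@"; do
        query="$query OR $term"
    done
    printf '(%s)\n' "$query"
)

or_query to:concurso to:dominandoti from:cebraspe
# prints: (to:concurso OR to:dominandoti OR from:cebraspe)
```

The result can be passed to notmuch as a single argument, e.g.
notmuch count "$(or_query to:concurso to:dominandoti)".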

Regards
-- 
- I am Brazilian.  I hope my English is correct and I welcome feedback
- Please adopt free formats like PDF, ODF, Org, LaTeX, Opus, WebM and 7z
- Free/libre software for Android: https://f-droid.org/
- [[https://www.gnu.org/philosophy/free-sw.html][What is free software?]]


* Re: Apparently, terms with a common prefix are *not* connected by implicit "OR"
  2019-08-11 17:20 Apparently, terms with a common prefix are *not* connected by implicit "OR" Jorge P. de Morais Neto
@ 2019-08-11 23:08 ` David Bremner
  2019-08-15 14:58   ` Jorge P. de Morais Neto
  0 siblings, 1 reply; 10+ messages in thread
From: David Bremner @ 2019-08-11 23:08 UTC (permalink / raw)
  To: Jorge P. de Morais Neto, notmuch

jorge+list@disroot.org (Jorge P. de Morais Neto) writes:

> Hi.  The NOTMUCH-SEARCH-TERMS man page says:
>
>     Each term in the query will be implicitly connected by a logical AND
>     if no explicit operator is provided (except that terms with a common
>     prefix will be implicitly combined with OR).
>
> However, in practice I get different results:
>     $ notmuch count '(to:pontodosconcursos.com.br OR to:jorge+cp+concurso@disroot.org)'
>     66
>     $ notmuch count '(to:pontodosconcursos.com.br to:jorge+cp+concurso@disroot.org)'
>     0
>

Thanks for the report. As a test, can you try with

     $ notmuch count '(to:pontodosconcursos.com.br to:"jorge+cp+concurso@disroot.org")'

I suspect that will work around the problem, which I believe is related
to the way that notmuch uses the xapian parser (in order to provide
regexp matching for some prefixes). In particular, if I try that with
NOTMUCH_DEBUG_QUERY=yes in the environment I can see the implicit OR.

d
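
When comparing several queries this way, it can help to print only the
parsed form.  A small sketch, assuming the debug text contains a line
reading "Final query is:" with the parsed query on the line after it;
final_query is a hypothetical helper name:

```shell
# final_query (hypothetical helper): read notmuch debug output on stdin
# and print only the line that follows "Final query is:".
final_query() {
    awk 'found { print; exit } /^Final query is:$/ { found = 1 }'
}

# Possible use; 2>&1 merges both streams so the filter sees the debug
# text whichever stream it is written to:
#   NOTMUCH_DEBUG_QUERY=yes notmuch count 'to:a to:b' 2>&1 | final_query
```

If the "Final query is:" line is absent, the helper prints nothing.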


* Re: Apparently, terms with a common prefix are *not* connected by implicit "OR"
  2019-08-11 23:08 ` David Bremner
@ 2019-08-15 14:58   ` Jorge P. de Morais Neto
  2019-08-21  1:39     ` David Bremner
  0 siblings, 1 reply; 10+ messages in thread
From: Jorge P. de Morais Neto @ 2019-08-15 14:58 UTC (permalink / raw)
  To: David Bremner, notmuch

[ I had replied to David Bremner alone so I didn't have to worry about
private information leakage.  But now I decided to clean the private
information and reply to the list too. ]

On 2019-08-11T20:08:58-0300, David Bremner wrote:
> Thanks for the report. As a test, can you try with
>
>      $ notmuch count '(to:pontodosconcursos.com.br to:"jorge+cp+concurso@disroot.org")'
>
> I suspect that will work around the problem, which I believe is related
> to the way that notmuch uses the xapian parser (in order to provide
> regexp matching for some prefixes). In particular, if I try that with
> NOTMUCH_DEBUG_QUERY=yes in the environment I can see the implicit OR.
>
> d

Thank you for the instructions, but they did not work.  Below are
several results from ~notmuch count~.

--8<---------------cut here---------------start------------->8---
$ notmuch count '(to:pontodosconcursos.com.br OR to:jorge+cp+concurso@disroot.org)'
70
$ notmuch count '(to:pontodosconcursos.com.br)'
0
$ notmuch count 'to:pontodosconcursos.com.br'
0
$ notmuch count 'to:jorge+cp+concurso@disroot.org'
70
$ NOTMUCH_DEBUG_QUERY=yes notmuch count '(to:pontodosconcursos.com.br OR to:jorge+cp+concurso@disroot.org)'
Query string is:
(to:pontodosconcursos.com.br OR to:jorge+cp+concurso@disroot.org)
Exclude query is:
Query(((Kdeleted OR Kspam) OR Ktrash))
Final query is:
Query(((Tmail AND ((XTOpontodosconcursos@1 PHRASE 3 XTOcom@2 PHRASE 3 XTObr@3) OR (ZXTOjorg@4 AND (Zcp@5 OR ZGcp@5 OR ZKcp@5 OR ZKcp@5 OR ZQcp@5 OR ZQcp@5 OR ZPcp@5 OR ZXPROPERTYcp@5 OR
ZXFOLDER:cp@5 OR ZXFROMcp@5 OR ZXTOcp@5 OR ZXATTACHMENTcp@5 OR ZXMIMETYPEcp@5 OR ZXSUBJECTcp@5) AND ((concurso@6 PHRASE 3 disroot@7 PHRASE 3 org@8) OR (Gconcurso@6 PHRASE 3 Gdisroot@7
PHRASE 3 Gorg@8) OR (Kconcurso@6 PHRASE 3 Kdisroot@7 PHRASE 3 Korg@8) OR (Kconcurso@6 PHRASE 3 Kdisroot@7 PHRASE 3 Korg@8) OR (Qconcurso@6 PHRASE 3 Qdisroot@7 PHRASE 3 Qorg@8) OR
(Qconcurso@6 PHRASE 3 Qdisroot@7 PHRASE 3 Qorg@8) OR (Pconcurso@6 PHRASE 3 Pdisroot@7 PHRASE 3 Porg@8) OR (XPROPERTYconcurso@6 PHRASE 3 XPROPERTYdisroot@7 PHRASE 3 XPROPERTYorg@8) OR
(XFOLDER:concurso@6 PHRASE 3 XFOLDER:disroot@7 PHRASE 3 XFOLDER:org@8) OR (XFROMconcurso@6 PHRASE 3 XFROMdisroot@7 PHRASE 3 XFROMorg@8) OR (XTOconcurso@6 PHRASE 3 XTOdisroot@7 PHRASE 3
XTOorg@8) OR (XATTACHMENTconcurso@6 PHRASE 3 XATTACHMENTdisroot@7 PHRASE 3 XATTACHMENTorg@8) OR (XMIMETYPEconcurso@6 PHRASE 3 XMIMETYPEdisroot@7 PHRASE 3 XMIMETYPEorg@8) OR
(XSUBJECTconcurso@6 PHRASE 3 XSUBJECTdisroot@7 PHRASE 3 XSUBJECTorg@8))))) AND_NOT ((Kdeleted OR Kspam) OR Ktrash)))
70
$ NOTMUCH_DEBUG_QUERY=yes notmuch count '(to:pontodosconcursos.com.br to:jorge+cp+concurso@disroot.org)'
Query string is:
(to:pontodosconcursos.com.br to:jorge+cp+concurso@disroot.org)
Exclude query is:
Query(((Kdeleted OR Kspam) OR Ktrash))
Final query is:
Query(((Tmail AND ((XTOpontodosconcursos@1 PHRASE 3 XTOcom@2 PHRASE 3 XTObr@3) AND ZXTOjorg@4 AND (Zcp@5 OR ZGcp@5 OR ZKcp@5 OR ZKcp@5 OR ZQcp@5 OR ZQcp@5 OR ZPcp@5 OR ZXPROPERTYcp@5 OR
ZXFOLDER:cp@5 OR ZXFROMcp@5 OR ZXTOcp@5 OR ZXATTACHMENTcp@5 OR ZXMIMETYPEcp@5 OR ZXSUBJECTcp@5) AND ((concurso@6 PHRASE 3 disroot@7 PHRASE 3 org@8) OR (Gconcurso@6 PHRASE 3 Gdisroot@7
PHRASE 3 Gorg@8) OR (Kconcurso@6 PHRASE 3 Kdisroot@7 PHRASE 3 Korg@8) OR (Kconcurso@6 PHRASE 3 Kdisroot@7 PHRASE 3 Korg@8) OR (Qconcurso@6 PHRASE 3 Qdisroot@7 PHRASE 3 Qorg@8) OR
(Qconcurso@6 PHRASE 3 Qdisroot@7 PHRASE 3 Qorg@8) OR (Pconcurso@6 PHRASE 3 Pdisroot@7 PHRASE 3 Porg@8) OR (XPROPERTYconcurso@6 PHRASE 3 XPROPERTYdisroot@7 PHRASE 3 XPROPERTYorg@8) OR
(XFOLDER:concurso@6 PHRASE 3 XFOLDER:disroot@7 PHRASE 3 XFOLDER:org@8) OR (XFROMconcurso@6 PHRASE 3 XFROMdisroot@7 PHRASE 3 XFROMorg@8) OR (XTOconcurso@6 PHRASE 3 XTOdisroot@7 PHRASE 3
XTOorg@8) OR (XATTACHMENTconcurso@6 PHRASE 3 XATTACHMENTdisroot@7 PHRASE 3 XATTACHMENTorg@8) OR (XMIMETYPEconcurso@6 PHRASE 3 XMIMETYPEdisroot@7 PHRASE 3 XMIMETYPEorg@8) OR
(XSUBJECTconcurso@6 PHRASE 3 XSUBJECTdisroot@7 PHRASE 3 XSUBJECTorg@8)))) AND_NOT ((Kdeleted OR Kspam) OR Ktrash)))
0
$ NOTMUCH_DEBUG_QUERY=yes notmuch count '(to:pontodosconcursos.com.br to:"jorge+cp+concurso@disroot.org")'
Query string is:
(to:pontodosconcursos.com.br to:"jorge+cp+concurso@disroot.org")
Exclude query is:
Query(((Kdeleted OR Kspam) OR Ktrash))
Final query is:
Query(((Tmail AND ((XTOpontodosconcursos@1 PHRASE 3 XTOcom@2 PHRASE 3 XTObr@3) AND (XTOjorge@4 PHRASE 5 XTOcp@5 PHRASE 5 XTOconcurso@6 PHRASE 5 XTOdisroot@7 PHRASE 5 XTOorg@8))) AND_NOT
((Kdeleted OR Kspam) OR Ktrash)))
0
$ NOTMUCH_DEBUG_QUERY=yes notmuch count '(to:pontodosconcursos.com.br OR to:"jorge+cp+concurso@disroot.org")'
Query string is:
(to:pontodosconcursos.com.br OR to:"jorge+cp+concurso@disroot.org")
Exclude query is:
Query(((Kdeleted OR Kspam) OR Ktrash))
Final query is:
Query(((Tmail AND ((XTOpontodosconcursos@1 PHRASE 3 XTOcom@2 PHRASE 3 XTObr@3) OR (XTOjorge@4 PHRASE 5 XTOcp@5 PHRASE 5 XTOconcurso@6 PHRASE 5 XTOdisroot@7 PHRASE 5 XTOorg@8))) AND_NOT
((Kdeleted OR Kspam) OR Ktrash)))
69
$ NOTMUCH_DEBUG_QUERY=yes notmuch count '(to:pontodosconcursos.com.br to:"jorge+cp+concurso@disroot.org")'
Query string is:
(to:pontodosconcursos.com.br to:"jorge+cp+concurso@disroot.org")
Exclude query is:
Query(((Kdeleted OR Kspam) OR Ktrash))
Final query is:
Query(((Tmail AND ((XTOpontodosconcursos@1 PHRASE 3 XTOcom@2 PHRASE 3 XTObr@3) AND (XTOjorge@4 PHRASE 5 XTOcp@5 PHRASE 5 XTOconcurso@6 PHRASE 5 XTOdisroot@7 PHRASE 5 XTOorg@8))) AND_NOT
((Kdeleted OR Kspam) OR Ktrash)))
0
$ NOTMUCH_DEBUG_QUERY=yes notmuch count '(to:"pontodosconcursos.com.br" to:"jorge+cp+concurso@disroot.org")'
Query string is:
(to:"pontodosconcursos.com.br" to:"jorge+cp+concurso@disroot.org")
Exclude query is:
Query(((Kdeleted OR Kspam) OR Ktrash))
Final query is:
Query(((Tmail AND ((XTOpontodosconcursos@1 PHRASE 3 XTOcom@2 PHRASE 3 XTObr@3) AND (XTOjorge@4 PHRASE 5 XTOcp@5 PHRASE 5 XTOconcurso@6 PHRASE 5 XTOdisroot@7 PHRASE 5 XTOorg@8))) AND_NOT
((Kdeleted OR Kspam) OR Ktrash)))
0
$ NOTMUCH_DEBUG_QUERY=yes notmuch count 'to:"pontodosconcursos.com.br" to:"jorge+cp+concurso@disroot.org"'
Query string is:
to:"pontodosconcursos.com.br" to:"jorge+cp+concurso@disroot.org"
Exclude query is:
Query(((Kdeleted OR Kspam) OR Ktrash))
Final query is:
Query(((Tmail AND ((XTOpontodosconcursos@1 PHRASE 3 XTOcom@2 PHRASE 3 XTObr@3) AND (XTOjorge@4 PHRASE 5 XTOcp@5 PHRASE 5 XTOconcurso@6 PHRASE 5 XTOdisroot@7 PHRASE 5 XTOorg@8))) AND_NOT
((Kdeleted OR Kspam) OR Ktrash)))
0
$ NOTMUCH_DEBUG_QUERY=yes notmuch count 'to:jorge+cp+concurso@disroot.org XOR to:"jorge+cp+concurso@disroot.org"'
Query string is:
to:jorge+cp+concurso@disroot.org XOR to:"jorge+cp+concurso@disroot.org"
Exclude query is:
Query(((Kdeleted OR Kspam) OR Ktrash))
Final query is:
Query(((Tmail AND ((ZXTOjorg@1 AND (Zcp@2 OR ZGcp@2 OR ZKcp@2 OR ZKcp@2 OR ZQcp@2 OR ZQcp@2 OR ZPcp@2 OR ZXPROPERTYcp@2 OR ZXFOLDER:cp@2 OR ZXFROMcp@2 OR ZXTOcp@2 OR ZXATTACHMENTcp@2 OR
ZXMIMETYPEcp@2 OR ZXSUBJECTcp@2) AND ((concurso@3 PHRASE 3 disroot@4 PHRASE 3 org@5) OR (Gconcurso@3 PHRASE 3 Gdisroot@4 PHRASE 3 Gorg@5) OR (Kconcurso@3 PHRASE 3 Kdisroot@4 PHRASE 3
Korg@5) OR (Kconcurso@3 PHRASE 3 Kdisroot@4 PHRASE 3 Korg@5) OR (Qconcurso@3 PHRASE 3 Qdisroot@4 PHRASE 3 Qorg@5) OR (Qconcurso@3 PHRASE 3 Qdisroot@4 PHRASE 3 Qorg@5) OR (Pconcurso@3
PHRASE 3 Pdisroot@4 PHRASE 3 Porg@5) OR (XPROPERTYconcurso@3 PHRASE 3 XPROPERTYdisroot@4 PHRASE 3 XPROPERTYorg@5) OR (XFOLDER:concurso@3 PHRASE 3 XFOLDER:disroot@4 PHRASE 3 XFOLDER:org@5)
OR (XFROMconcurso@3 PHRASE 3 XFROMdisroot@4 PHRASE 3 XFROMorg@5) OR (XTOconcurso@3 PHRASE 3 XTOdisroot@4 PHRASE 3 XTOorg@5) OR (XATTACHMENTconcurso@3 PHRASE 3 XATTACHMENTdisroot@4 PHRASE 3
XATTACHMENTorg@5) OR (XMIMETYPEconcurso@3 PHRASE 3 XMIMETYPEdisroot@4 PHRASE 3 XMIMETYPEorg@5) OR (XSUBJECTconcurso@3 PHRASE 3 XSUBJECTdisroot@4 PHRASE 3 XSUBJECTorg@5))) XOR (XTOjorge@6
PHRASE 5 XTOcp@7 PHRASE 5 XTOconcurso@8 PHRASE 5 XTOdisroot@9 PHRASE 5 XTOorg@10))) AND_NOT ((Kdeleted OR Kspam) OR Ktrash)))
1
--8<---------------cut here---------------end--------------->8---

Below are the contents of ~/home/jorge/.config/notmuch/config~.

--8<---------------cut here---------------start------------->8---
# .notmuch-config - Configuration file for the notmuch mail system
#
# For more information about notmuch, see https://notmuchmail.org

# Database configuration
#
# The only value supported here is 'path' which should be the top-level
# directory where your mail currently exists and to where mail will be
# delivered in the future. Files should be individual email messages.
# Notmuch will store its database within a sub-directory of the path
# configured here named ".notmuch".
#

[database]
path=/home/jorge/offlineimap/Jorge-Disroot

# User configuration
#
# Here is where you can let notmuch know how you would like to be
# addressed. Valid settings are
#
#	name		Your full name.
#	primary_email	Your primary email address.
#	other_email	A list (separated by ';') of other email addresses
#			at which you receive email.
#
# Notmuch will use the various email addresses configured here when
# formatting replies. It will avoid including your own addresses in the
# recipient list of replies, and will set the From address based on the
# address to which the original email was addressed.
#
[user]
name=Jorge P. de Morais Neto
primary_email= < jorge AT_SIGN disroot DOT org>
other_email=<REDACTED;<REDACTED>;<REDACTED>;<REDACTED>;<jorge PLUS-SIGN list AT_SIGN disroot DOT org;<REDACTED>;<REDACTED>;<REDACTED>;<REDACTED>;

# Configuration for "notmuch new"
#
# The following options are supported here:
#
#	tags	A list (separated by ';') of the tags that will be
#		added to all messages incorporated by "notmuch new".
#
#	ignore	A list (separated by ';') of file and directory names
#		that will not be searched for messages by "notmuch new".
#
#		NOTE: *Every* file/directory that goes by one of those
#		names will be ignored, independent of its depth/location
#		in the mail store.
#
[new]
# tags=unread;inbox;new;
# http://afew.readthedocs.io/en/latest/quickstart.html#initial-config
tags=new
ignore=

# Search configuration
#
# The following option is supported here:
#
#	exclude_tags
#		A ;-separated list of tags that will be excluded from
#		search results by default.  Using an excluded tag in a
#		query will override that exclusion.
#
[search]
exclude_tags=deleted;spam;trash;

# Maildir compatibility configuration
#
# The following option is supported here:
#
#	synchronize_flags      Valid values are true and false.
#
#	If true, then the following maildir flags (in message filenames)
#	will be synchronized with the corresponding notmuch tags:
#
#		Flag	Tag
#		----	-------
#		D	draft
#		F	flagged
#		P	passed
#		R	replied
#		S	unread (added when 'S' flag is not present)
#
#	The "notmuch new" command will notice flag changes in filenames
#	and update tags, while the "notmuch tag" and "notmuch restore"
#	commands will notice tag changes and update flags in filenames
#
[maildir]
synchronize_flags=true

# Cryptography related configuration
#
# The following *deprecated* option is currently supported:
#
#	gpg_path
#		binary name or full path to invoke gpg.
#		NOTE: In a future build, this option will be ignored.
#		Setting $PATH is a better approach.
#
[crypto]
gpg_path=gpg
--8<---------------cut here---------------end--------------->8---

 Regards


* Re: Apparently, terms with a common prefix are *not* connected by implicit "OR"
  2019-08-15 14:58   ` Jorge P. de Morais Neto
@ 2019-08-21  1:39     ` David Bremner
  2019-08-21  5:48       ` Rollins, Jameson
  0 siblings, 1 reply; 10+ messages in thread
From: David Bremner @ 2019-08-21  1:39 UTC (permalink / raw)
  To: Jorge P. de Morais Neto, notmuch

Jorge P. de Morais Neto <jorge+list@disroot.org> writes:

> [ I had replied to David Bremner alone so I didn't have to worry about
> private information leakage.  But now I decided to clean the private
> information and reply to the list too. ]
>
> On 2019-08-11T20:08:58-0300, David Bremner wrote:
>> Thanks for the report. As a test, can you try with
>>
>>      $ notmuch count '(to:pontodosconcursos.com.br to:"jorge+cp+concurso@disroot.org")'
>>
>> I suspect that will work around the problem, which I believe is related
>> to the way that notmuch uses the xapian parser (in order to provide
>> regexp matching for some prefixes). In particular, if I try that with
>> NOTMUCH_DEBUG_QUERY=yes in the environment I can see the implicit OR.

Thanks for the detailed report. There are (at least) two different
things going on, in addition to the strange expansion that I focused on
before, which seems not to be the most important issue.

One is that combining with implicit OR was only ever intended to work for
"boolean prefixes" like tag:. So this is a documentation bug.

A second thing is due to some implementation details in notmuch: from:
was being treated (for purposes of combining) as a filter. I think it's
clear we want from: and to: to behave similarly, so I propose the
following patch:

diff --git a/lib/database.cc b/lib/database.cc
index 24b7ec43..4db1b465 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -400,7 +400,7 @@ _setup_query_field (const prefix_t *prefix, notmuch_database_t *notmuch)
        /* we treat all field-processor fields as boolean in order to get the raw input */
        if (prefix->prefix)
            notmuch->query_parser->add_prefix ("", prefix->prefix);
-       notmuch->query_parser->add_boolean_prefix (prefix->name, fp);
+       notmuch->query_parser->add_boolean_prefix (prefix->name, fp, !(prefix->flags & NOTMUCH_FIELD_PROBABILISTIC));
     } else {
        _setup_query_field_default (prefix, notmuch);
     }


This will make

    to:a to:b

and

    from:a from:b

expand as

    to:a AND to:b

and

    from:a AND from:b


I don't think it's possible to have

  to:a to:b

expand to

   to:a OR to:b

without also having

   a b

expand to

   a OR b

which I think most people would find surprising.

At the moment I'm not sure I see the benefit of having tag: combine
with implicit OR (other than being slightly easier to document).
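
The combining rules discussed in this message can be modelled outside
notmuch.  The sketch below is a toy model, not notmuch's or Xapian's
parser: it assumes terms contain no spaces, and GROUP_PREFIXES,
is_grouped and expand_implicit are hypothetical names.  Repeated terms
whose prefix is listed in GROUP_PREFIXES are joined with OR; everything
else is joined with AND:

```shell
# Hypothetical setting: prefixes whose repeated instances combine with OR.
GROUP_PREFIXES="tag"

# Succeed if the prefix named by $1 is listed in GROUP_PREFIXES.
is_grouped() {
    case " $GROUP_PREFIXES " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Print the explicit form of a query given as separate terms.
expand_implicit() (
    out=
    seen=
    for term in "$@"; do
        case $term in
            *:*) prefix=${term%%:*} ;;
            *)   prefix= ;;
        esac
        if [ -n "$prefix" ] && is_grouped "$prefix"; then
            # Emit one OR group per prefix, at its first occurrence only.
            case " $seen " in *" $prefix "*) continue ;; esac
            seen="$seen $prefix"
            group=
            for other in "$@"; do
                case $other in
                    "$prefix":*) group="${group:+$group OR }$other" ;;
                esac
            done
            piece="($group)"
        else
            piece=$term
        fi
        out="${out:+$out AND }$piece"
    done
    printf '%s\n' "$out"
)

expand_implicit tag:inbox tag:unread to:a to:b
# prints: (tag:inbox OR tag:unread) AND to:a AND to:b
```

With GROUP_PREFIXES left empty, every term is joined with AND, which is
the behaviour that leaving tag: ungrouped would give.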


* Re: Apparently, terms with a common prefix are *not* connected by implicit "OR"
  2019-08-21  1:39     ` David Bremner
@ 2019-08-21  5:48       ` Rollins, Jameson
  2019-08-21 11:41         ` WIP: fix implicit operators David Bremner
  0 siblings, 1 reply; 10+ messages in thread
From: Rollins, Jameson @ 2019-08-21  5:48 UTC (permalink / raw)
  To: David Bremner, Jorge P. de Morais Neto, notmuch@notmuchmail.org

On Tue, Aug 20 2019, David Bremner <david@tethera.net> wrote:
> This will make
>
>     to:a to:b
>
> and
>
>     from:a from:b
>
> expand as
>
>     to:a AND to:b
>
> and
>
>     from:a AND from:b

I can't say if the proposed parser prefix change is correct, but I
support this behavior.

jamie.


* WIP: fix implicit operators
  2019-08-21  5:48       ` Rollins, Jameson
@ 2019-08-21 11:41         ` David Bremner
  2019-08-21 11:41           ` [PATCH 1/3] test: add known broken tests for from: and subject: David Bremner
                             ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: David Bremner @ 2019-08-21 11:41 UTC (permalink / raw)
  To: Rollins, Jameson, David Bremner, Jorge P. de Morais Neto,
	notmuch@notmuchmail.org

I'm posting this for feedback.

I did all the tests that were easy, but things like threads and
attachments will take more effort to test.

For probabilistic prefixes, we are basically stuck with AND. For
boolean prefixes, we have the choice.  Generally I tried to follow the
Xapian convention that if a field/prefix is "exclusive" (one per
message), then we group with OR.  Regexp fields like mid:// are tricky
since even if the underlying data is exclusive it makes sense to use
multiple instances to match one piece of data, e.g.

mid:/david/ mid:/tethera/

should probably behave roughly like

mid:/david.*tethera/
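
The "roughly" matters: requiring two regexes to match is order
independent, while the single combined regex is not.  A small sketch
using grep -E as a stand-in for the regex matching (an assumption; the
message IDs and the matches_all helper are made up for illustration):

```shell
# matches_all STRING REGEX...: succeed only if STRING matches every REGEX.
matches_all() (
    s=$1
    shift
    for re in "$@"; do
        printf '%s\n' "$s" | grep -Eq -- "$re" || exit 1
    done
    exit 0
)

for id in david@tethera.net tethera-david@example.org; do
    if matches_all "$id" david tethera; then both=yes; else both=no; fi
    if matches_all "$id" 'david.*tethera'; then single=yes; else single=no; fi
    printf '%s: both=%s single=%s\n' "$id" "$both" "$single"
done
# prints:
#   david@tethera.net: both=yes single=yes
#   tethera-david@example.org: both=yes single=no
```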


* [PATCH 1/3] test: add known broken tests for from: and subject:
  2019-08-21 11:41         ` WIP: fix implicit operators David Bremner
@ 2019-08-21 11:41           ` David Bremner
  2019-08-25  9:33             ` Tomi Ollila
  2019-08-21 11:41           ` [PATCH 2/3] lib: introduce N_F_GROUP and use it to fix implicit AND for from: David Bremner
  2019-08-21 11:41           ` [PATCH 3/3] WIP/test: extend field grouping tests David Bremner
  2 siblings, 1 reply; 10+ messages in thread
From: David Bremner @ 2019-08-21 11:41 UTC (permalink / raw)
  To: Rollins, Jameson, David Bremner, Jorge P. de Morais Neto,
	notmuch@notmuchmail.org

Given we want 'a b' to parse as 'a AND b', then for any
probabilistic (free text) prefix foo:, we should also get 'foo:a
foo:b' expanding to 'foo:a AND foo:b'. Currently this is not true due
to the implimentation of regex fields.
---
 test/T760-implicit-operators.sh | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)
 create mode 100755 test/T760-implicit-operators.sh

diff --git a/test/T760-implicit-operators.sh b/test/T760-implicit-operators.sh
new file mode 100755
index 00000000..b79673df
--- /dev/null
+++ b/test/T760-implicit-operators.sh
@@ -0,0 +1,28 @@
+#!/usr/bin/env bash
+test_description='implicit operators in query parser'
+. $(dirname "$0")/test-lib.sh || exit 1
+
+test_AND() {
+    add_message  "[$1]=a@b"
+    add_message  "[$1]=b@c"
+
+    test_begin_subtest "$1: implicitly joined by AND"
+    $2
+    notmuch count $1:a@b > OUTPUT
+    notmuch count $1:a $1:b >> OUTPUT
+    notmuch count $1:a@b OR $1:b@c >> OUTPUT
+    notmuch count $1:a@b $1:b@c >> OUTPUT
+    cat <<EOF > EXPECTED
+1
+1
+2
+0
+EOF
+    test_expect_equal_file EXPECTED OUTPUT
+}
+
+test_AND from test_subtest_known_broken
+test_AND subject test_subtest_known_broken
+test_AND to
+
+test_done
-- 
2.23.0.rc1


* [PATCH 2/3] lib: introduce N_F_GROUP and use it to fix implicit AND for from:
  2019-08-21 11:41         ` WIP: fix implicit operators David Bremner
  2019-08-21 11:41           ` [PATCH 1/3] test: add known broken tests for from: and subject: David Bremner
@ 2019-08-21 11:41           ` David Bremner
  2019-08-21 11:41           ` [PATCH 3/3] WIP/test: extend field grouping tests David Bremner
  2 siblings, 0 replies; 10+ messages in thread
From: David Bremner @ 2019-08-21 11:41 UTC (permalink / raw)
  To: Rollins, Jameson, David Bremner, Jorge P. de Morais Neto,
	notmuch@notmuchmail.org

This needs tests for every prefix and documentation
---
 lib/database-private.h          |  1 +
 lib/database.cc                 | 23 +++++++++++++++++------
 test/T760-implicit-operators.sh |  4 ++--
 3 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/lib/database-private.h b/lib/database-private.h
index 87ae1bdf..a5eb83cb 100644
--- a/lib/database-private.h
+++ b/lib/database-private.h
@@ -159,6 +159,7 @@ typedef enum notmuch_field_flags {
     NOTMUCH_FIELD_EXTERNAL	= 1 << 0,
     NOTMUCH_FIELD_PROBABILISTIC = 1 << 1,
     NOTMUCH_FIELD_PROCESSOR	= 1 << 2,
+    NOTMUCH_FIELD_GROUP		= 1 << 3,
 } notmuch_field_flag_t;
 
 /*
diff --git a/lib/database.cc b/lib/database.cc
index 24b7ec43..5d6c4ee6 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -272,16 +272,21 @@ prefix_t prefix_table[] = {
     { "body",                   "",             NOTMUCH_FIELD_EXTERNAL |
       NOTMUCH_FIELD_PROBABILISTIC },
     { "thread",                 "G",            NOTMUCH_FIELD_EXTERNAL |
-      NOTMUCH_FIELD_PROCESSOR },
+      NOTMUCH_FIELD_PROCESSOR |
+      NOTMUCH_FIELD_GROUP
+    },
     { "tag",                    "K",            NOTMUCH_FIELD_EXTERNAL |
       NOTMUCH_FIELD_PROCESSOR },
     { "is",                     "K",            NOTMUCH_FIELD_EXTERNAL |
       NOTMUCH_FIELD_PROCESSOR },
-    { "id",                     "Q",            NOTMUCH_FIELD_EXTERNAL },
+    { "id",                     "Q",            NOTMUCH_FIELD_EXTERNAL |
+      NOTMUCH_FIELD_GROUP },
     { "mid",                    "Q",            NOTMUCH_FIELD_EXTERNAL |
       NOTMUCH_FIELD_PROCESSOR },
     { "path",                   "P",            NOTMUCH_FIELD_EXTERNAL |
-      NOTMUCH_FIELD_PROCESSOR },
+      NOTMUCH_FIELD_PROCESSOR |
+      NOTMUCH_FIELD_GROUP
+    },
     { "property",               "XPROPERTY",    NOTMUCH_FIELD_EXTERNAL },
     /*
      * Unconditionally add ':' to reduce potential ambiguity with
@@ -290,10 +295,14 @@ prefix_t prefix_table[] = {
      * discussion.
      */
     { "folder",                 "XFOLDER:",     NOTMUCH_FIELD_EXTERNAL |
-      NOTMUCH_FIELD_PROCESSOR },
+      NOTMUCH_FIELD_PROCESSOR |
+      NOTMUCH_FIELD_GROUP
+    },
 #if HAVE_XAPIAN_FIELD_PROCESSOR
     { "date",                   NULL,           NOTMUCH_FIELD_EXTERNAL |
-      NOTMUCH_FIELD_PROCESSOR },
+      NOTMUCH_FIELD_PROCESSOR |
+      NOTMUCH_FIELD_GROUP
+    },
     { "query",                  NULL,           NOTMUCH_FIELD_EXTERNAL |
       NOTMUCH_FIELD_PROCESSOR },
 #endif
@@ -400,7 +409,9 @@ _setup_query_field (const prefix_t *prefix, notmuch_database_t *notmuch)
 	/* we treat all field-processor fields as boolean in order to get the raw input */
 	if (prefix->prefix)
 	    notmuch->query_parser->add_prefix ("", prefix->prefix);
-	notmuch->query_parser->add_boolean_prefix (prefix->name, fp);
+	notmuch->query_parser->add_boolean_prefix (prefix->name, fp,
+						   !(prefix->flags & NOTMUCH_FIELD_PROBABILISTIC) &&
+						   (prefix->flags & NOTMUCH_FIELD_GROUP));
     } else {
 	_setup_query_field_default (prefix, notmuch);
     }
diff --git a/test/T760-implicit-operators.sh b/test/T760-implicit-operators.sh
index b79673df..1a6ba61f 100755
--- a/test/T760-implicit-operators.sh
+++ b/test/T760-implicit-operators.sh
@@ -21,8 +21,8 @@ EOF
     test_expect_equal_file EXPECTED OUTPUT
 }
 
-test_AND from test_subtest_known_broken
-test_AND subject test_subtest_known_broken
+test_AND from
+test_AND subject
 test_AND to
 
 test_done
-- 
2.23.0.rc1
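
The third argument passed to add_boolean_prefix in this patch can be
read as a predicate on the flag bits.  A rough shell model of that
expression, using the flag values from the enum in this patch;
is_exclusive is a hypothetical name, and reading the argument as
Xapian's "exclusive" (OR-grouping) switch is an assumption rather than
something the patch states:

```shell
# Flag values as in the notmuch_field_flags enum of this patch.
NOTMUCH_FIELD_EXTERNAL=$((1 << 0))
NOTMUCH_FIELD_PROBABILISTIC=$((1 << 1))
NOTMUCH_FIELD_PROCESSOR=$((1 << 2))
NOTMUCH_FIELD_GROUP=$((1 << 3))

# is_exclusive FLAGS (hypothetical name): succeed when the field is not
# probabilistic and carries the GROUP flag, mirroring
#   !(flags & PROBABILISTIC) && (flags & GROUP)
is_exclusive() {
    [ $(( $1 & NOTMUCH_FIELD_PROBABILISTIC )) -eq 0 ] &&
    [ $(( $1 & NOTMUCH_FIELD_GROUP )) -ne 0 ]
}

# Flag combinations taken from the prefix_table hunk of this patch.
thread=$((NOTMUCH_FIELD_EXTERNAL | NOTMUCH_FIELD_PROCESSOR | NOTMUCH_FIELD_GROUP))
tag=$((NOTMUCH_FIELD_EXTERNAL | NOTMUCH_FIELD_PROCESSOR))

if is_exclusive "$thread"; then echo "thread: grouped"; else echo "thread: not grouped"; fi
if is_exclusive "$tag"; then echo "tag: grouped"; else echo "tag: not grouped"; fi
# prints:
#   thread: grouped
#   tag: not grouped
```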


* [PATCH 3/3] WIP/test: extend field grouping tests
  2019-08-21 11:41         ` WIP: fix implicit operators David Bremner
  2019-08-21 11:41           ` [PATCH 1/3] test: add known broken tests for from: and subject: David Bremner
  2019-08-21 11:41           ` [PATCH 2/3] lib: introduce N_F_GROUP and use it to fix implicit AND for from: David Bremner
@ 2019-08-21 11:41           ` David Bremner
  2 siblings, 0 replies; 10+ messages in thread
From: David Bremner @ 2019-08-21 11:41 UTC (permalink / raw)
  To: Rollins, Jameson, David Bremner, Jorge P. de Morais Neto,
	notmuch@notmuchmail.org

---
 test/T760-implicit-operators.sh | 60 +++++++++++++++++++++++++++------
 1 file changed, 49 insertions(+), 11 deletions(-)

diff --git a/test/T760-implicit-operators.sh b/test/T760-implicit-operators.sh
index 1a6ba61f..438f766f 100755
--- a/test/T760-implicit-operators.sh
+++ b/test/T760-implicit-operators.sh
@@ -2,16 +2,16 @@
 test_description='implicit operators in query parser'
 . $(dirname "$0")/test-lib.sh || exit 1
 
-test_AND() {
-    add_message  "[$1]=a@b"
-    add_message  "[$1]=b@c"
+test_prob_AND() {
+    add_message  "[$1]=alpha@beta"
+    add_message  "[$1]=beta@gamma"
 
-    test_begin_subtest "$1: implicitly joined by AND"
+    test_begin_subtest "probabilistic field '$1:' implicitly joined by AND"
     $2
-    notmuch count $1:a@b > OUTPUT
-    notmuch count $1:a $1:b >> OUTPUT
-    notmuch count $1:a@b OR $1:b@c >> OUTPUT
-    notmuch count $1:a@b $1:b@c >> OUTPUT
+    notmuch count $1:alpha@beta > OUTPUT
+    notmuch count $1:alpha $1:beta >> OUTPUT
+    notmuch count $1:alpha@beta OR $1:beta@gamma >> OUTPUT
+    notmuch count $1:alpha@beta $1:beta@gamma >> OUTPUT
     cat <<EOF > EXPECTED
 1
 1
@@ -21,8 +21,46 @@ EOF
     test_expect_equal_file EXPECTED OUTPUT
 }
 
-test_AND from
-test_AND subject
-test_AND to
+test_regex_AND() {
+    test_begin_subtest "regex field '$1:' implicitly joined by AND"
+    $2
+    notmuch count $1:alpha@beta > OUTPUT
+    notmuch count $1:/alpha/ $1:/beta/ >> OUTPUT
+    notmuch count $1:alpha@beta OR $1:beta@gamma >> OUTPUT
+    notmuch count $1:alpha@beta $1:beta@gamma >> OUTPUT
+    cat <<EOF > EXPECTED
+1
+1
+2
+0
+EOF
+    test_expect_equal_file EXPECTED OUTPUT
+}
+
+test_prob_AND from
+test_prob_AND subject
+test_prob_AND to
+
+
+add_message  "[id]=alpha@beta"
+add_message  "[id]=beta@gamma"
+
+test_regex_AND mid
+
+test_begin_subtest "'id:' implicitly joined by OR"
+notmuch count id:alpha@beta > OUTPUT
+notmuch count id:alpha@beta OR id:beta@gamma >> OUTPUT
+notmuch count id:alpha@beta id:beta@gamma >> OUTPUT
+cat <<EOF > EXPECTED
+1
+2
+2
+EOF
+test_expect_equal_file EXPECTED OUTPUT
+
+notmuch tag +alpha@beta id:alpha@beta
+notmuch tag +beta@gamma id:beta@gamma
+
+test_regex_AND tag
 
 test_done
-- 
2.23.0.rc1


* Re: [PATCH 1/3] test: add known broken tests for from: and subject:
  2019-08-21 11:41           ` [PATCH 1/3] test: add known broken tests for from: and subject: David Bremner
@ 2019-08-25  9:33             ` Tomi Ollila
  0 siblings, 0 replies; 10+ messages in thread
From: Tomi Ollila @ 2019-08-25  9:33 UTC (permalink / raw)
  To: notmuch@notmuchmail.org

On Wed, Aug 21 2019, David Bremner wrote:

> Given we want 'a b' to parse as 'a AND b', then for any
> probabilistic (free text) prefix foo:, we should also get 'foo:a
> foo:b' expanding to 'foo:a AND foo:b'. Currently this is not true due
> to the implimentation of regex fields.

implementation =D

> ---
>  test/T760-implicit-operators.sh | 28 ++++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)
>  create mode 100755 test/T760-implicit-operators.sh
>
> diff --git a/test/T760-implicit-operators.sh b/test/T760-implicit-operators.sh
> new file mode 100755
> index 00000000..b79673df
> --- /dev/null
> +++ b/test/T760-implicit-operators.sh
> @@ -0,0 +1,28 @@
> +#!/usr/bin/env bash
> +test_description='implicit operators in query parser'
> +. $(dirname "$0")/test-lib.sh || exit 1
> +
> +test_AND() {
> +    add_message  "[$1]=a@b"
> +    add_message  "[$1]=b@c"
> +
> +    test_begin_subtest "$1: implicitly joined by AND"
> +    $2
> +    notmuch count $1:a@b > OUTPUT
> +    notmuch count $1:a $1:b >> OUTPUT
> +    notmuch count $1:a@b OR $1:b@c >> OUTPUT
> +    notmuch count $1:a@b $1:b@c >> OUTPUT
> +    cat <<EOF > EXPECTED
> +1
> +1
> +2
> +0
> +EOF

The above could also be written as  printf %s\\n  1  1  2  0  > EXPECTED

(whichever way is "clearer" -- using '%s\n' or even "%s\n" is 
also possible (just increasingly harder to write ;D))
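
A quick way to check that the two forms produce the same bytes, as a
sketch for a POSIX shell (expected_heredoc and expected_printf are
made-up names; cksum is used so that trailing newlines are compared
too):

```shell
# The here-document form used in the patch.
expected_heredoc() {
    cat <<EOF
1
1
2
0
EOF
}

# The printf form: the format is reused for each argument.
expected_printf() {
    printf '%s\n' 1 1 2 0
}

if [ "$(expected_heredoc | cksum)" = "$(expected_printf | cksum)" ]; then
    echo identical
else
    echo different
fi
# prints: identical
```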

Tomi

> +    test_expect_equal_file EXPECTED OUTPUT
> +}
> +
> +test_AND from test_subtest_known_broken
> +test_AND subject test_subtest_known_broken
> +test_AND to
> +
> +test_done
> -- 
> 2.23.0.rc1
