unofficial mirror of notmuch@notmuchmail.org
* revised foo:"" handling
@ 2017-03-24 22:52 David Bremner
  2017-03-24 22:52 ` [PATCH 1/2] test: add known broken test for null from: and subject: query David Bremner
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: David Bremner @ 2017-03-24 22:52 UTC (permalink / raw)
  To: notmuch

This obsoletes the first two patches of

     id:20170318030303.17344-1-david@tethera.net
     
I think this is a more meaningful interpretation than matching all messages.

* [PATCH 1/2] test: add known broken test for null from: and subject: query
  2017-03-24 22:52 revised foo:"" handling David Bremner
@ 2017-03-24 22:52 ` David Bremner
  2017-03-24 22:52 ` [PATCH 2/2] lib: handle empty string in regexp field processors David Bremner
  2017-03-29 20:07 ` revised foo:"" handling Tomi Ollila
  2 siblings, 0 replies; 6+ messages in thread
From: David Bremner @ 2017-03-24 22:52 UTC (permalink / raw)
  To: notmuch

These queries currently fail with field processors enabled because the
code expects a non-empty string.
---
 test/T650-regexp-query.sh | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/test/T650-regexp-query.sh b/test/T650-regexp-query.sh
index 61739e87..f2ae1387 100755
--- a/test/T650-regexp-query.sh
+++ b/test/T650-regexp-query.sh
@@ -11,6 +11,26 @@ fi
 
 notmuch search --output=messages from:cworth > cworth.msg-ids
 
+# these headers will generate no document terms
+add_message '[from]="-" [subject]="empty from"'
+add_message '[subject]="-"'
+
+test_begin_subtest "null from: search"
+test_subtest_known_broken
+notmuch search 'from:""' | notmuch_search_sanitize > OUTPUT
+cat <<EOF > EXPECTED
+thread:XXX   2001-01-05 [1/1] -; empty from (inbox unread)
+EOF
+test_expect_equal_file EXPECTED OUTPUT
+
+test_begin_subtest "null subject: search"
+test_subtest_known_broken
+notmuch search 'subject:""' | notmuch_search_sanitize > OUTPUT
+cat <<EOF > EXPECTED
+thread:XXX   2001-01-05 [1/1] Notmuch Test Suite; - (inbox unread)
+EOF
+test_expect_equal_file EXPECTED OUTPUT
+
 test_begin_subtest "xapian wildcard search for from:"
 notmuch search --output=messages 'from:cwo*' > OUTPUT
 test_expect_equal_file cworth.msg-ids OUTPUT
-- 
2.11.0

* [PATCH 2/2] lib: handle empty string in regexp field processors
  2017-03-24 22:52 revised foo:"" handling David Bremner
  2017-03-24 22:52 ` [PATCH 1/2] test: add known broken test for null from: and subject: query David Bremner
@ 2017-03-24 22:52 ` David Bremner
  2017-03-25 11:30   ` David Bremner
  2017-03-29 20:07 ` revised foo:"" handling Tomi Ollila
  2 siblings, 1 reply; 6+ messages in thread
From: David Bremner @ 2017-03-24 22:52 UTC (permalink / raw)
  To: notmuch

The non-field-processor behaviour is to convert the corresponding
queries into a search for the unprefixed terms. This yields pretty
surprising results, so I decided to generate a query that would match
the terms (i.e. none with that prefix) generated for an empty header.
---
 lib/regexp-fields.cc      | 5 +++++
 test/T650-regexp-query.sh | 2 --
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/regexp-fields.cc b/lib/regexp-fields.cc
index 9dcf9732..1651677c 100644
--- a/lib/regexp-fields.cc
+++ b/lib/regexp-fields.cc
@@ -148,6 +148,11 @@ RegexpFieldProcessor::RegexpFieldProcessor (std::string prefix, Xapian::QueryPar
 Xapian::Query
 RegexpFieldProcessor::operator() (const std::string & str)
 {
+    if (str.size () == 0)
+	return Xapian::Query(Xapian::Query::OP_AND_NOT,
+			     Xapian::Query::MatchAll,
+			     Xapian::Query (Xapian::Query::OP_WILDCARD, term_prefix));
+
     if (str.at (0) == '/') {
 	if (str.at (str.size () - 1) == '/'){
 	    RegexpPostingSource *postings = new RegexpPostingSource (slot, str.substr(1,str.size () - 2));
diff --git a/test/T650-regexp-query.sh b/test/T650-regexp-query.sh
index f2ae1387..9599c104 100755
--- a/test/T650-regexp-query.sh
+++ b/test/T650-regexp-query.sh
@@ -16,7 +16,6 @@ add_message '[from]="-" [subject]="empty from"'
 add_message '[subject]="-"'
 
 test_begin_subtest "null from: search"
-test_subtest_known_broken
 notmuch search 'from:""' | notmuch_search_sanitize > OUTPUT
 cat <<EOF > EXPECTED
 thread:XXX   2001-01-05 [1/1] -; empty from (inbox unread)
@@ -24,7 +23,6 @@ EOF
 test_expect_equal_file EXPECTED OUTPUT
 
 test_begin_subtest "null subject: search"
-test_subtest_known_broken
 notmuch search 'subject:""' | notmuch_search_sanitize > OUTPUT
 cat <<EOF > EXPECTED
 thread:XXX   2001-01-05 [1/1] Notmuch Test Suite; - (inbox unread)
-- 
2.11.0
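
To illustrate why the empty-string check has to come before the
`str.at (0)` test in the hunk above: `std::string::at` throws
`std::out_of_range` when the index is not less than `size ()`, so an empty
field value would throw before any branch is chosen. The following is a
minimal standalone sketch, not notmuch code. `FieldKind`,
`classify_field_value` and `unguarded_at_throws` are hypothetical names, and
treating a value with a leading but no trailing slash as plain is a
simplification, since that branch is not visible in the diff context.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical classification of a field value (illustration only).
enum class FieldKind { Empty, Regexp, Plain };

FieldKind
classify_field_value (const std::string &str)
{
    // Guard first: str.at (0) would throw std::out_of_range on "".
    if (str.empty ())
	return FieldKind::Empty;

    // Simplified regexp detection: /.../ with both delimiters present.
    if (str.size () >= 2 && str.at (0) == '/' && str.at (str.size () - 1) == '/')
	return FieldKind::Regexp;

    return FieldKind::Plain;
}

// Returns true if an unguarded at (0) throws for this input.
bool
unguarded_at_throws (const std::string &str)
{
    try {
	(void) str.at (0);
	return false;
    } catch (const std::out_of_range &) {
	return true;
    }
}
```

Under these assumptions, `classify_field_value ("")` takes the guarded branch,
while `unguarded_at_throws ("")` reports the exception that the unguarded
check would raise.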


* Re: [PATCH 2/2] lib: handle empty string in regexp field processors
  2017-03-24 22:52 ` [PATCH 2/2] lib: handle empty string in regexp field processors David Bremner
@ 2017-03-25 11:30   ` David Bremner
  0 siblings, 0 replies; 6+ messages in thread
From: David Bremner @ 2017-03-25 11:30 UTC (permalink / raw)
  To: notmuch

David Bremner <david@tethera.net> writes:

> +    if (str.size () == 0)
> +	return Xapian::Query(Xapian::Query::OP_AND_NOT,
> +			     Xapian::Query::MatchAll,
> +			     Xapian::Query (Xapian::Query::OP_WILDCARD, term_prefix));
> +

Full disclosure, this is a pretty expensive query. On an older i7, it
takes about 7.5s (elapsed) on my 466k messages to find 702 messages
without a subject.  I don't think it's a big deal, since I don't think

     notmuch search 'subject:""'

is likely to be typed by mistake.

For comparison, "grep -R '^Subject:$'" (which is not exactly the same
query, since some messages completely lack a Subject: line)
takes about 390s (elapsed).
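
The set semantics of the quoted query can be sketched without Xapian. The
following toy model is an illustration under stated assumptions, not how
Xapian evaluates queries: the `Index` type, both function names, and the
`subj:` / `from:` prefixes are made up. It treats the wildcard as the union
of the posting lists of every term under a prefix and subtracts that union
from the set of all documents. That also hints at why the real query may be
costly: every term under the prefix has to be visited.

```cpp
#include <map>
#include <set>
#include <string>

// Toy inverted index: term -> ids of the documents containing it.
using DocId = int;
using Index = std::map<std::string, std::set<DocId>>;

// Union of the posting lists of every term starting with `prefix`
// (the role the wildcard sub-query plays in the patch).
std::set<DocId>
docs_with_prefixed_term (const Index &index, const std::string &prefix)
{
    std::set<DocId> result;

    // std::map is ordered, so keys sharing the prefix are contiguous.
    for (auto it = index.lower_bound (prefix);
	 it != index.end () && it->first.compare (0, prefix.size (), prefix) == 0;
	 ++it)
	result.insert (it->second.begin (), it->second.end ());

    return result;
}

// "all documents AND_NOT wildcard": documents with no term under `prefix`.
std::set<DocId>
docs_without_prefixed_term (const std::set<DocId> &all_docs,
			    const Index &index,
			    const std::string &prefix)
{
    const std::set<DocId> with = docs_with_prefixed_term (index, prefix);
    std::set<DocId> result;

    for (DocId id : all_docs)
	if (with.count (id) == 0)
	    result.insert (id);

    return result;
}
```

With documents 1, 2 and 3, where only 1 and 2 carry a `subj:` term, the
second function returns just document 3, mirroring a message whose subject
generated no terms.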


* Re: revised foo:"" handling
  2017-03-24 22:52 revised foo:"" handling David Bremner
  2017-03-24 22:52 ` [PATCH 1/2] test: add known broken test for null from: and subject: query David Bremner
  2017-03-24 22:52 ` [PATCH 2/2] lib: handle empty string in regexp field processors David Bremner
@ 2017-03-29 20:07 ` Tomi Ollila
  2017-03-30  0:27   ` David Bremner
  2 siblings, 1 reply; 6+ messages in thread
From: Tomi Ollila @ 2017-03-29 20:07 UTC (permalink / raw)
  To: David Bremner, notmuch

On Sat, Mar 25 2017, David Bremner <david@tethera.net> wrote:

> This obsoletes the first two patches of
>
>      id:20170318030303.17344-1-david@tethera.net
>      
> I think this is a more meaningful interpretation than matching all messages.

These changes look good (AFAIU). Tests pass (Debian unstable container on
Fedora 25 host).

Tomi


* Re: revised foo:"" handling
  2017-03-29 20:07 ` revised foo:"" handling Tomi Ollila
@ 2017-03-30  0:27   ` David Bremner
  0 siblings, 0 replies; 6+ messages in thread
From: David Bremner @ 2017-03-30  0:27 UTC (permalink / raw)
  To: Tomi Ollila, notmuch

Tomi Ollila <tomi.ollila@iki.fi> writes:

> On Sat, Mar 25 2017, David Bremner <david@tethera.net> wrote:
>
>> This obsoletes the first two patches of
>>
>>      id:20170318030303.17344-1-david@tethera.net
>>      
>> I think this is a more meaningful interpretation than matching all messages.
>
> These changes look good (AFAIU). Tests pass (Debian unstable container on
> Fedora 25 host).

I pushed those to release and master.
