From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Bremner
To: David Bremner, notmuch@notmuchmail.org
Subject: [PATCH] lib: add 'body:' field, stop indexing headers twice.
Date: Sun, 3 Mar 2019 22:29:12 -0400
Message-Id: <20190304022912.13924-1-david@tethera.net>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20190218115622.31466-1-david@tethera.net>
References: <20190218115622.31466-1-david@tethera.net>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-Id: "Use and development of the notmuch mail system."

The new `body:` field (in Xapian terms) or prefix (in slightly sloppier
notmuch terms) allows matching terms that occur only in the body.
Unprefixed query terms should continue to match anywhere (header or
body) in the message.

This follows a suggestion of Olly Betts to use the facility (since
Xapian 1.0.4) to add the same field with multiple prefixes. The double
indexing of previous versions is thus replaced with a query-time
expansion of unprefixed query terms to the various prefixed
equivalents.

Reindexing will be needed for negated 'body:' searches to work
correctly.
---
 doc/man7/notmuch-search-terms.rst |  5 +++-
 lib/database.cc                   |  6 +++++
 lib/message.cc                    | 10 +++----
 test/T730-body.sh                 | 43 +++++++++++++++++++++++++++++++
 4 files changed, 58 insertions(+), 6 deletions(-)
 create mode 100755 test/T730-body.sh

diff --git a/doc/man7/notmuch-search-terms.rst b/doc/man7/notmuch-search-terms.rst
index f7a39ceb..fd8bf634 100644
--- a/doc/man7/notmuch-search-terms.rst
+++ b/doc/man7/notmuch-search-terms.rst
@@ -44,6 +44,9 @@ results to those whose value matches a regular expression (see
 
     notmuch search 'from:"/bob@.*[.]example[.]com/"'
 
+body:
+    Match terms in the body of messages.
+
 from: or from://
     The **from:** prefix is used to match the name or address of
     the sender of an email message.
@@ -249,7 +252,7 @@ follows.
 Boolean
   **tag:**, **id:**, **thread:**, **folder:**, **path:**, **property:**
 Probabilistic
-  **to:**, **attachment:**, **mimetype:**
+  **body:**, **to:**, **attachment:**, **mimetype:**
 Special
   **from:**, **query:**, **subject:**
 
diff --git a/lib/database.cc b/lib/database.cc
index 9cf8062c..27c2d042 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -259,6 +259,8 @@ prefix_t prefix_table[] = {
     { "directory",          "XDIRECTORY",   NOTMUCH_FIELD_NO_FLAGS },
     { "file-direntry",      "XFDIRENTRY",   NOTMUCH_FIELD_NO_FLAGS },
     { "directory-direntry", "XDDIRENTRY",   NOTMUCH_FIELD_NO_FLAGS },
+    { "body",               "",             NOTMUCH_FIELD_EXTERNAL |
+                                            NOTMUCH_FIELD_PROBABILISTIC},
     { "thread",             "G",            NOTMUCH_FIELD_EXTERNAL |
                                             NOTMUCH_FIELD_PROCESSOR },
     { "tag",                "K",            NOTMUCH_FIELD_EXTERNAL |
@@ -302,6 +304,8 @@ prefix_t prefix_table[] = {
 static void
 _setup_query_field_default (const prefix_t *prefix, notmuch_database_t *notmuch)
 {
+    if (prefix->prefix)
+        notmuch->query_parser->add_prefix ("", prefix->prefix);
     if (prefix->flags & NOTMUCH_FIELD_PROBABILISTIC)
         notmuch->query_parser->add_prefix (prefix->name, prefix->prefix);
     else
@@ -326,6 +330,8 @@ _setup_query_field (const prefix_t *prefix, notmuch_database_t *notmuch)
                                                          *notmuch->query_parser, notmuch))->release ();
 
         /* we treat all field-processor fields as boolean in order to get the raw input */
+        if (prefix->prefix)
+            notmuch->query_parser->add_prefix ("", prefix->prefix);
         notmuch->query_parser->add_boolean_prefix (prefix->name, fp);
     } else {
         _setup_query_field_default (prefix, notmuch);
diff --git a/lib/message.cc b/lib/message.cc
index 6f2f6345..64349f83 100644
--- a/lib/message.cc
+++ b/lib/message.cc
@@ -1443,13 +1443,13 @@ _notmuch_message_gen_terms (notmuch_message_t *message,
         message->termpos = term_gen->get_termpos () + 100;
 
         _notmuch_message_invalidate_metadata (message, prefix_name);
+    } else {
+        term_gen->set_termpos (message->termpos);
+        term_gen->index_text (text);
+        /* Create a term gap, as above. */
+        message->termpos = term_gen->get_termpos () + 100;
     }
 
-    term_gen->set_termpos (message->termpos);
-    term_gen->index_text (text);
-    /* Create a term gap, as above. */
-    message->termpos = term_gen->get_termpos () + 100;
-
     return NOTMUCH_PRIVATE_STATUS_SUCCESS;
 }
 
diff --git a/test/T730-body.sh b/test/T730-body.sh
new file mode 100755
index 00000000..548b30a4
--- /dev/null
+++ b/test/T730-body.sh
@@ -0,0 +1,43 @@
+#!/usr/bin/env bash
+test_description='search body'
+. $(dirname "$0")/test-lib.sh || exit 1
+
+add_message "[body]=thebody-1" "[subject]=subject-1"
+add_message "[body]=nothing-to-see-here-1" "[subject]=thebody-1"
+
+test_begin_subtest 'search with body: prefix'
+notmuch search body:thebody | notmuch_search_sanitize > OUTPUT
+cat <<EOF > EXPECTED
+thread:XXX   2001-01-05 [1/1] Notmuch Test Suite; subject-1 (inbox unread)
+EOF
+test_expect_equal_file EXPECTED OUTPUT
+
+test_begin_subtest 'search without body: prefix'
+notmuch search thebody | notmuch_search_sanitize > OUTPUT
+cat <<EOF > EXPECTED
+thread:XXX   2001-01-05 [1/1] Notmuch Test Suite; subject-1 (inbox unread)
+thread:XXX   2001-01-05 [1/1] Notmuch Test Suite; thebody-1 (inbox unread)
+EOF
+test_expect_equal_file EXPECTED OUTPUT
+
+test_begin_subtest 'negated body: prefix'
+notmuch search thebody and not body:thebody | notmuch_search_sanitize > OUTPUT
+cat <<EOF > EXPECTED
+thread:XXX   2001-01-05 [1/1] Notmuch Test Suite; thebody-1 (inbox unread)
+EOF
+test_expect_equal_file EXPECTED OUTPUT
+
+test_begin_subtest 'search unprefixed for prefixed term'
+notmuch search subject | notmuch_search_sanitize > OUTPUT
+cat <<EOF > EXPECTED
+thread:XXX   2001-01-05 [1/1] Notmuch Test Suite; subject-1 (inbox unread)
+EOF
+test_expect_equal_file EXPECTED OUTPUT
+
+test_begin_subtest 'search with body: prefix for term only in subject'
+notmuch search body:subject | notmuch_search_sanitize > OUTPUT
+cat <<EOF > EXPECTED
+EOF
+test_expect_equal_file EXPECTED OUTPUT
+
+test_done
-- 
2.20.1