From: David Bremner <david@tethera.net>
To: notmuch@notmuchmail.org
Cc: David Bremner <david@tethera.net>
Subject: [PATCH 14/36] lib/parse-sexp: add term prefix backed fields
Date: Tue, 24 Aug 2021 08:17:23 -0700 [thread overview]
Message-ID: <20210824151745.2941868-15-david@tethera.net> (raw)
In-Reply-To: <20210824151745.2941868-1-david@tethera.net>
We use "boolean" to describe fields that should generate terms
literally without stemming or phrase splitting. This terminology
might not be ideal but it is already enshrined in
notmuch-search-terms(7).
---
doc/man7/notmuch-sexp-queries.rst | 18 +++++-
lib/parse-sexp.cc | 49 ++++++++++++++++
test/T081-sexpr-search.sh | 94 +++++++++++++++++++++++++++++++
3 files changed, 160 insertions(+), 1 deletion(-)
diff --git a/doc/man7/notmuch-sexp-queries.rst b/doc/man7/notmuch-sexp-queries.rst
index b763876d..6e68fcc3 100644
--- a/doc/man7/notmuch-sexp-queries.rst
+++ b/doc/man7/notmuch-sexp-queries.rst
@@ -81,6 +81,14 @@ string) into words, ignore punctuation. Phrase splitting is applied to
terms in phrase (probabilistic) fields. Both phrase splitting and
stemming apply only in phrase fields.
+Each term or phrase field has an associated combining operator
+(``and`` or ``or``) used to combine the queries from each element of
+the tail of the list. This is generally ``or`` for those fields where
+a message has one such attribute, and ``and`` otherwise.
+
+Term or phrase fields can contain arbitrarily complex queries made up
+from terms, operators, and modifiers, but not other fields.
+
.. _field-table:
.. table:: Fields with supported modifiers
@@ -112,7 +120,7 @@ stemming apply only in phrase fields.
+------------+-----------+-----------+-----------+-----------+----------+
| mimetype | or | phrase | yes | yes | no |
+------------+-----------+-----------+-----------+-----------+----------+
- | path | or | term | yes | yes | yes |
+ | path | or | term | no | yes | yes |
+------------+-----------+-----------+-----------+-----------+----------+
| property | and | term | yes | yes | yes |
+------------+-----------+-----------+-----------+-----------+----------+
@@ -151,10 +159,18 @@ EXAMPLES
Match the *phrase* "quick" followed by "fox" in phrase fields (or
outside a field). Match the literal string in a term field.
+``(id 1234@invalid blah@test)``
+ Matches Message-Id "1234@invalid" *or* Message-Id "blah@test"
+
``(subject quick "brown fox")``
Match messages whose subject contains "quick" (anywhere, stemmed) and
the phrase "brown fox".
+``(to (or bob@example.com mallory@example.org))`` ``(or (to bob@example.com) (to mallory@example.org))``
+ Match in the "To" or "Cc" headers, "bob@example.com",
+ "mallory@example.org", and also "bob@example.com.au" since it
+ contains the adjacent triple "bob", "example", "com".
+
NOTES
=====
diff --git a/lib/parse-sexp.cc b/lib/parse-sexp.cc
index 0917f505..26b7e5f1 100644
--- a/lib/parse-sexp.cc
+++ b/lib/parse-sexp.cc
@@ -10,8 +10,26 @@
typedef enum {
SEXP_FLAG_NONE = 0,
SEXP_FLAG_FIELD = 1 << 0,
+ SEXP_FLAG_BOOLEAN = 1 << 1,
} _sexp_flag_t;
+/*
+ * define bitwise operators to hide casts */
+
+inline _sexp_flag_t
+operator| (_sexp_flag_t a, _sexp_flag_t b)
+{
+ return static_cast<_sexp_flag_t>(
+ static_cast<unsigned>(a) | static_cast<unsigned>(b));
+}
+
+inline _sexp_flag_t
+operator& (_sexp_flag_t a, _sexp_flag_t b)
+{
+ return static_cast<_sexp_flag_t>(
+ static_cast<unsigned>(a) & static_cast<unsigned>(b));
+}
+
typedef struct {
const char *name;
Xapian::Query::op xapian_op;
@@ -23,12 +41,39 @@ static _sexp_prefix_t prefixes[] =
{
{ "and", Xapian::Query::OP_AND, Xapian::Query::MatchAll,
SEXP_FLAG_NONE },
+ { "attachment", Xapian::Query::OP_AND, Xapian::Query::MatchAll,
+ SEXP_FLAG_FIELD },
+ { "body", Xapian::Query::OP_AND, Xapian::Query::MatchAll,
+ SEXP_FLAG_FIELD },
+ { "from", Xapian::Query::OP_AND, Xapian::Query::MatchAll,
+ SEXP_FLAG_FIELD },
+ { "folder", Xapian::Query::OP_OR, Xapian::Query::MatchNothing,
+ SEXP_FLAG_FIELD | SEXP_FLAG_BOOLEAN },
+ { "id", Xapian::Query::OP_OR, Xapian::Query::MatchNothing,
+ SEXP_FLAG_FIELD | SEXP_FLAG_BOOLEAN },
+ { "is", Xapian::Query::OP_AND, Xapian::Query::MatchAll,
+ SEXP_FLAG_FIELD | SEXP_FLAG_BOOLEAN },
+ { "mid", Xapian::Query::OP_OR, Xapian::Query::MatchNothing,
+ SEXP_FLAG_FIELD | SEXP_FLAG_BOOLEAN },
+ { "mimetype", Xapian::Query::OP_AND, Xapian::Query::MatchAll,
+ SEXP_FLAG_FIELD },
{ "not", Xapian::Query::OP_AND_NOT, Xapian::Query::MatchAll,
SEXP_FLAG_NONE },
{ "or", Xapian::Query::OP_OR, Xapian::Query::MatchNothing,
SEXP_FLAG_NONE },
+ { "path", Xapian::Query::OP_OR, Xapian::Query::MatchNothing,
+ SEXP_FLAG_FIELD | SEXP_FLAG_BOOLEAN },
+ { "property", Xapian::Query::OP_AND, Xapian::Query::MatchAll,
+ SEXP_FLAG_FIELD
+ | SEXP_FLAG_BOOLEAN },
{ "subject", Xapian::Query::OP_AND, Xapian::Query::MatchAll,
SEXP_FLAG_FIELD },
+ { "tag", Xapian::Query::OP_AND, Xapian::Query::MatchAll,
+ SEXP_FLAG_FIELD | SEXP_FLAG_BOOLEAN },
+ { "thread", Xapian::Query::OP_OR, Xapian::Query::MatchNothing,
+ SEXP_FLAG_FIELD | SEXP_FLAG_BOOLEAN },
+ { "to", Xapian::Query::OP_AND, Xapian::Query::MatchAll,
+ SEXP_FLAG_FIELD },
{ }
};
@@ -110,6 +155,10 @@ _sexp_to_xapian_query (notmuch_database_t *notmuch, const _sexp_prefix_t *parent
std::string term = Xapian::Unicode::tolower (sx->val);
Xapian::Stem stem = *(notmuch->stemmer);
std::string term_prefix = parent ? _find_prefix (parent->name) : "";
+ if (parent && (parent->flags & SEXP_FLAG_BOOLEAN)) {
+ output = Xapian::Query (term_prefix + sx->val);
+ return NOTMUCH_STATUS_SUCCESS;
+ }
if (sx->aty == SEXP_BASIC && unicode_word_utf8 (sx->val)) {
output = Xapian::Query ("Z" + term_prefix + stem (term));
return NOTMUCH_STATUS_SUCCESS;
diff --git a/test/T081-sexpr-search.sh b/test/T081-sexpr-search.sh
index 4a051a50..96d58ee2 100755
--- a/test/T081-sexpr-search.sh
+++ b/test/T081-sexpr-search.sh
@@ -101,6 +101,99 @@ thread:XXX 2000-01-01 [1/1] Notmuch Test Suite; utf8-sübjéct (inbox unread)
EOF
test_expect_equal_file EXPECTED OUTPUT
+test_begin_subtest "Search by 'attachment'"
+notmuch search attachment:notmuch-help.patch > EXPECTED
+notmuch search --query=sexp '(attachment notmuch-help.patch)' > OUTPUT
+test_expect_equal_file EXPECTED OUTPUT
+
+test_begin_subtest "Search by 'body'"
+add_message '[subject]="body search"' '[date]="Sat, 01 Jan 2000 12:00:00 -0000"' [body]=bodysearchtest
+output=$(notmuch search --query=sexp '(body bodysearchtest)' | notmuch_search_sanitize)
+test_expect_equal "$output" "thread:XXX 2000-01-01 [1/1] Notmuch Test Suite; body search (inbox unread)"
+
+test_begin_subtest "Search by 'body' (phrase)"
+add_message '[subject]="body search (phrase)"' '[date]="Sat, 01 Jan 2000 12:00:00 -0000"' '[body]="body search (phrase)"'
+add_message '[subject]="negative result"' '[date]="Sat, 01 Jan 2000 12:00:00 -0000"' '[body]="This phrase should not match the body search"'
+output=$(notmuch search --query=sexp '(body "body search phrase")' | notmuch_search_sanitize)
+test_expect_equal "$output" "thread:XXX 2000-01-01 [1/1] Notmuch Test Suite; body search (phrase) (inbox unread)"
+
+test_begin_subtest "Search by 'body' (utf-8):"
+add_message '[subject]="utf8-message-body-subject"' '[date]="Sat, 01 Jan 2000 12:00:00 -0000"' '[body]="message body utf8: bödý"'
+output=$(notmuch search --query=sexp '(body bödý)' | notmuch_search_sanitize)
+test_expect_equal "$output" "thread:XXX 2000-01-01 [1/1] Notmuch Test Suite; utf8-message-body-subject (inbox unread)"
+
+test_begin_subtest "Search by 'from'"
+add_message '[subject]="search by from"' '[date]="Sat, 01 Jan 2000 12:00:00 -0000"' [from]=searchbyfrom
+output=$(notmuch search --query=sexp '(from searchbyfrom)' | notmuch_search_sanitize)
+test_expect_equal "$output" "thread:XXX 2000-01-01 [1/1] searchbyfrom; search by from (inbox unread)"
+
+test_begin_subtest "Search by 'from' (address)"
+add_message '[subject]="search by from (address)"' '[date]="Sat, 01 Jan 2000 12:00:00 -0000"' [from]=searchbyfrom@example.com
+output=$(notmuch search --query=sexp '(from searchbyfrom@example.com)' | notmuch_search_sanitize)
+test_expect_equal "$output" "thread:XXX 2000-01-01 [1/1] searchbyfrom@example.com; search by from (address) (inbox unread)"
+
+test_begin_subtest "Search by 'from' (name)"
+add_message '[subject]="search by from (name)"' '[date]="Sat, 01 Jan 2000 12:00:00 -0000"' '[from]="Search By From Name <test@example.com>"'
+output=$(notmuch search --query=sexp '(from "Search By From Name")' | notmuch_search_sanitize)
+test_expect_equal "$output" "thread:XXX 2000-01-01 [1/1] Search By From Name; search by from (name) (inbox unread)"
+
+test_begin_subtest "Search by 'from' (name and address)"
+output=$(notmuch search --query=sexp '(from "Search By From Name <test@example.com>")' | notmuch_search_sanitize)
+test_expect_equal "$output" "thread:XXX 2000-01-01 [1/1] Search By From Name; search by from (name) (inbox unread)"
+
+add_message '[dir]=bad' '[subject]="To the bone"'
+add_message '[dir]=.' '[subject]="Top level"'
+add_message '[dir]=bad/news' '[subject]="Bears"'
+mkdir -p "${MAIL_DIR}/duplicate/bad/news"
+cp "$gen_msg_filename" "${MAIL_DIR}/duplicate/bad/news"
+
+add_message '[dir]=things' '[subject]="These are a few"'
+add_message '[dir]=things/favorite' '[subject]="Raindrops, whiskers, kettles"'
+add_message '[dir]=things/bad' '[subject]="Bites, stings, sad feelings"'
+
+test_begin_subtest "Search by 'folder' (multiple)"
+output=$(notmuch search --query=sexp '(folder bad bad/news things/bad)' | notmuch_search_sanitize)
+test_expect_equal "$output" "thread:XXX 2001-01-05 [1/1] Notmuch Test Suite; To the bone (inbox unread)
+thread:XXX 2001-01-05 [1/1(2)] Notmuch Test Suite; Bears (inbox unread)
+thread:XXX 2001-01-05 [1/1] Notmuch Test Suite; Bites, stings, sad feelings (inbox unread)"
+
+test_begin_subtest "Search by 'folder': top level."
+notmuch search folder:'""' > EXPECTED
+notmuch search --query=sexp '(folder "")' > OUTPUT
+test_expect_equal_file EXPECTED OUTPUT
+
+test_begin_subtest "Search by 'id'"
+add_message '[subject]="search by id"' '[date]="Sat, 01 Jan 2000 12:00:00 -0000"'
+output=$(notmuch search --query=sexp "(id ${gen_msg_id})" | notmuch_search_sanitize)
+test_expect_equal "$output" "thread:XXX 2000-01-01 [1/1] Notmuch Test Suite; search by id (inbox unread)"
+
+test_begin_subtest "Search by 'id' (or)"
+add_message '[subject]="search by id"' '[date]="Sat, 01 Jan 2000 12:00:00 -0000"'
+output=$(notmuch search --query=sexp "(id non-existent-mid ${gen_msg_id})" | notmuch_search_sanitize)
+test_expect_equal "$output" "thread:XXX 2000-01-01 [1/1] Notmuch Test Suite; search by id (inbox unread)"
+
+test_begin_subtest "Search by 'is' (multiple)"
+notmuch tag -inbox tag:searchbytag
+notmuch search is:inbox AND is:unread | notmuch_search_sanitize > EXPECTED
+notmuch search --query=sexp '(is inbox unread)' | notmuch_search_sanitize > OUTPUT
+notmuch tag +inbox tag:searchbytag
+test_expect_equal_file EXPECTED OUTPUT
+
+test_begin_subtest "Search by 'mid'"
+add_message '[subject]="search by mid"' '[date]="Sat, 01 Jan 2000 12:00:00 -0000"'
+output=$(notmuch search --query=sexp "(mid ${gen_msg_id})" | notmuch_search_sanitize)
+test_expect_equal "$output" "thread:XXX 2000-01-01 [1/1] Notmuch Test Suite; search by mid (inbox unread)"
+
+test_begin_subtest "Search by 'mid' (or)"
+add_message '[subject]="search by mid"' '[date]="Sat, 01 Jan 2000 12:00:00 -0000"'
+output=$(notmuch search --query=sexp "(mid non-existent-mid ${gen_msg_id})" | notmuch_search_sanitize)
+test_expect_equal "$output" "thread:XXX 2000-01-01 [1/1] Notmuch Test Suite; search by mid (inbox unread)"
+
+test_begin_subtest "Search by 'mimetype'"
+notmuch search mimetype:text/html > EXPECTED
+notmuch search --query=sexp '(mimetype text html)' > OUTPUT
+test_expect_equal_file EXPECTED OUTPUT
+
test_begin_subtest "Search by 'subject' (utf-8, phrase-token):"
output=$(notmuch search --query=sexp '(subject utf8-sübjéct)' | notmuch_search_sanitize)
test_expect_equal "$output" "thread:XXX 2000-01-01 [1/1] Notmuch Test Suite; utf8-sübjéct (inbox unread)"
@@ -118,6 +211,7 @@ notmuch search --query=sexp '(subject (or utf8 "compatibility issues"))' | notmu
cat <<EOF > EXPECTED
thread:XXX 2009-11-18 [4/4] Jjgod Jiang, Alexander Botero-Lowry; [notmuch] Mac OS X/Darwin compatibility issues (inbox unread)
thread:XXX 2000-01-01 [1/1] Notmuch Test Suite; utf8-sübjéct (inbox unread)
+thread:XXX 2000-01-01 [1/1] Notmuch Test Suite; utf8-message-body-subject (inbox unread)
EOF
test_expect_equal_file EXPECTED OUTPUT
--
2.32.0\r
next prev parent reply other threads:[~2021-08-24 15:23 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-08-24 15:17 v5 sexp query parser David Bremner
2021-08-24 15:17 ` [PATCH 01/36] CLI: make variable n_requested_db_uuid file scope David Bremner
2021-08-24 15:17 ` [PATCH 02/36] configure: optional library sfsexp David Bremner
2021-08-24 15:17 ` [PATCH 03/36] lib: split notmuch_query_create David Bremner
2021-08-24 15:17 ` [PATCH 04/36] lib: define notmuch_query_create_with_syntax David Bremner
2021-08-24 15:17 ` [PATCH 05/36] CLI/search+address: support sexpr queries David Bremner
2021-08-24 15:17 ` [PATCH 06/36] lib: add new status code for query syntax errors David Bremner
2021-08-24 15:17 ` [PATCH 07/36] lib/parse-sexp: parse single terms and the empty list David Bremner
2021-08-24 15:17 ` [PATCH 08/36] lib: leave stemmer object accessible David Bremner
2021-08-24 15:17 ` [PATCH 09/36] lib/parse-sexp: stem unquoted atoms David Bremner
2021-08-24 15:17 ` [PATCH 10/36] lib/parse-sexp: support and, not, and or David Bremner
2021-08-24 15:17 ` [PATCH 11/36] lib/parse-sexp: support subject field David Bremner
2021-08-24 15:17 ` [PATCH 12/36] util/unicode: allow calling from C++ David Bremner
2021-08-24 15:17 ` [PATCH 13/36] lib/parse-sexp: support phrase queries David Bremner
2021-08-24 15:17 ` David Bremner [this message]
2021-08-24 15:17 ` [PATCH 15/36] lib/parse-sexp: 'starts-with' wildcard searches David Bremner
2021-08-24 15:17 ` [PATCH 16/36] lib/parse-sexp: add '*' as syntactic sugar for '(starts-with "")' David Bremner
2021-08-24 15:17 ` [PATCH 17/36] lib/parse-sexp: handle unprefixed terms David Bremner
2021-08-24 15:17 ` [PATCH 18/36] lib/query: generalize exclude handling to s-expression queries David Bremner
2021-08-24 15:17 ` [PATCH 19/36] lib: factor out query construction from regexp David Bremner
2021-08-24 15:17 ` [PATCH 20/36] lib/parse-sexp: support regular expressions David Bremner
2021-08-24 15:17 ` [PATCH 21/36] lib: generate actual Xapian query for "*" and "" David Bremner
2021-08-24 15:17 ` [PATCH 22/36] lib/query: factor out _notmuch_query_string_to_xapian_query David Bremner
2021-08-24 15:17 ` [PATCH 23/36] lib/thread-fp: factor out query expansion, rewrite in Xapian David Bremner
2021-08-24 15:17 ` [PATCH 24/36] lib/parse-sexp: expand queries David Bremner
2021-08-24 15:17 ` [PATCH 25/36] lib/parse-sexp: support infix subqueries David Bremner
2021-08-24 15:17 ` [PATCH 26/36] lib/parse-sexp: parse user headers David Bremner
2021-08-24 15:17 ` [PATCH 27/36] lib: factor out expansion of saved queries David Bremner
2021-08-24 15:17 ` [PATCH 28/36] lib/parse-sexp: handle " David Bremner
2021-08-24 15:17 ` [PATCH 29/36] CLI/config support saving s-expression queries David Bremner
2021-08-24 15:17 ` [PATCH 30/36] lib/parse-sexp: support saved " David Bremner
2021-08-24 15:17 ` [PATCH 31/36] lib/parse-sexp: thread environment argument through parser David Bremner
2021-08-24 15:17 ` [PATCH 32/36] lib/parse-sexp: apply macros David Bremner
2021-08-24 15:17 ` [PATCH 33/36] CLI: move query syntax to shared option David Bremner
2021-08-24 15:17 ` [PATCH 34/36] CLI/{count, dump, reindex, reply, show}: enable sexp queries David Bremner
2021-08-24 15:17 ` [PATCH 35/36] CLI/tag: " David Bremner
2021-08-24 15:17 ` [PATCH 36/36] doc/sexp-queries: update synopsis and description David Bremner
2021-09-05 19:31 ` v5 sexp query parser David Bremner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://notmuchmail.org/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210824151745.2941868-15-david@tethera.net \
--to=david@tethera.net \
--cc=notmuch@notmuchmail.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://yhetil.org/notmuch.git/
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).