From: David Bremner <david@tethera.net>
To: notmuch@notmuchmail.org
Cc: David Bremner <david@tethera.net>
Subject: [PATCH 11/36] lib/parse-sexp: support subject field
Date: Tue, 24 Aug 2021 08:17:20 -0700
Message-ID: <20210824151745.2941868-12-david@tethera.net>
In-Reply-To: <20210824151745.2941868-1-david@tethera.net>
The tests marked as known broken fail because phrase searches are not
yet handled.
---
doc/man7/notmuch-sexp-queries.rst | 62 +++++++++++++++++++++++++++++--
lib/parse-sexp.cc | 19 +++++++++-
test/T081-sexpr-search.sh | 57 ++++++++++++++++++++++++++++
3 files changed, 133 insertions(+), 5 deletions(-)
diff --git a/doc/man7/notmuch-sexp-queries.rst b/doc/man7/notmuch-sexp-queries.rst
index 0304759e..08e97cc3 100644
--- a/doc/man7/notmuch-sexp-queries.rst
+++ b/doc/man7/notmuch-sexp-queries.rst
@@ -36,9 +36,8 @@ An s-expression query is either an atom, the empty list, or a
a *field*, *logical operation*, or *modifier*, and 0 or more
subqueries.
-``*``
-``()``
- The empty list matches all messages
+``*`` ``()``
+ Match all messages.
*term*
Match all messages containing *term*, possibly after
@@ -64,6 +63,59 @@ subqueries.
FIELDS
``````
+*Fields* (also called *prefixes* in notmuch documentation)
+correspond to attributes of mail messages. Some are inherent (and
+immutable) like ``subject``, while others (``tag`` and ``property``) are
+settable by the user. Each concrete field in
+:any:`the table below <field-table>`
+is discussed further under "Search prefixes" in
+:any:`notmuch-search-terms(7)`. The row *user* refers to user-defined
+fields, described in :any:`notmuch-config(1)`.
+
+.. _field-table:
+
+.. table:: Fields with supported modifiers
+
+ +------------+-----------+-----------+-----------+-----------+----------+
+ | field | combine | type | expand | wildcard | regex |
+ +============+===========+===========+===========+===========+==========+
+ | *none* | and | | no | yes | no |
+ +------------+-----------+-----------+-----------+-----------+----------+
+ | *user* | and | phrase | no | yes | no |
+ +------------+-----------+-----------+-----------+-----------+----------+
+ | attachment | and | phrase | yes | yes | no |
+ +------------+-----------+-----------+-----------+-----------+----------+
+ | body | and | phrase | no | no | no |
+ +------------+-----------+-----------+-----------+-----------+----------+
+ | date | | range | no | no | no |
+ +------------+-----------+-----------+-----------+-----------+----------+
+ | folder | or | phrase | yes | yes | yes |
+ +------------+-----------+-----------+-----------+-----------+----------+
+ | from | and | phrase | yes | yes | yes |
+ +------------+-----------+-----------+-----------+-----------+----------+
+ | id | or | term | no | yes | yes |
+ +------------+-----------+-----------+-----------+-----------+----------+
+ | is | and | term | yes | yes | yes |
+ +------------+-----------+-----------+-----------+-----------+----------+
+ | lastmod | | range | no | no | no |
+ +------------+-----------+-----------+-----------+-----------+----------+
+ | mid | or | term | no | yes | yes |
+ +------------+-----------+-----------+-----------+-----------+----------+
+ | mimetype | or | phrase | yes | yes | no |
+ +------------+-----------+-----------+-----------+-----------+----------+
+ | path | or | term | yes | yes | yes |
+ +------------+-----------+-----------+-----------+-----------+----------+
+ | property | and | term | yes | yes | yes |
+ +------------+-----------+-----------+-----------+-----------+----------+
+ | subject | and | phrase | yes | yes | yes |
+ +------------+-----------+-----------+-----------+-----------+----------+
+ | tag | and | term | yes | yes | yes |
+ +------------+-----------+-----------+-----------+-----------+----------+
+ | thread | or | term | yes | yes | yes |
+ +------------+-----------+-----------+-----------+-----------+----------+
+ | to | and | phrase | yes | yes | no |
+ +------------+-----------+-----------+-----------+-----------+----------+
+
.. _modifiers:
MODIFIERS
@@ -86,6 +138,10 @@ EXAMPLES
``(not Bob Marley)``
Match messages containing neither "Bob" nor "Marley", nor their stems,
+``(subject quick "brown fox")``
+ Match messages whose subject contains "quick" (anywhere, stemmed) and
+ the phrase "brown fox".
+
.. |q1| replace:: :math:`q_1`
.. |q2| replace:: :math:`q_2`
.. |qn| replace:: :math:`q_n`
diff --git a/lib/parse-sexp.cc b/lib/parse-sexp.cc
index 0d2c0ba8..25556058 100644
--- a/lib/parse-sexp.cc
+++ b/lib/parse-sexp.cc
@@ -8,7 +8,8 @@
* definitions from sexp.h */
typedef enum {
- SEXP_FLAG_NONE = 0,
+ SEXP_FLAG_NONE = 0,
+ SEXP_FLAG_FIELD = 1 << 0,
} _sexp_flag_t;
typedef struct {
@@ -26,6 +27,8 @@ static _sexp_prefix_t prefixes[] =
SEXP_FLAG_NONE },
{ "or", Xapian::Query::OP_OR, Xapian::Query::MatchNothing,
SEXP_FLAG_NONE },
+ { "subject", Xapian::Query::OP_AND, Xapian::Query::MatchAll,
+ SEXP_FLAG_FIELD },
{ }
};
@@ -76,8 +79,11 @@ _sexp_to_xapian_query (notmuch_database_t *notmuch, const _sexp_prefix_t *parent
if (sx->ty == SEXP_VALUE) {
std::string term = Xapian::Unicode::tolower (sx->val);
Xapian::Stem stem = *(notmuch->stemmer);
+ std::string term_prefix = parent ? _find_prefix (parent->name) : "";
if (sx->aty == SEXP_BASIC)
- term = "Z" + stem (term);
+ term = "Z" + term_prefix + stem (term);
+ else
+ term = term_prefix + term;
output = Xapian::Query (term);
return NOTMUCH_STATUS_SUCCESS;
@@ -97,6 +103,15 @@ _sexp_to_xapian_query (notmuch_database_t *notmuch, const _sexp_prefix_t *parent
for (_sexp_prefix_t *prefix = prefixes; prefix && prefix->name; prefix++) {
if (strcmp (prefix->name, sx->list->val) == 0) {
+ if (prefix->flags & SEXP_FLAG_FIELD) {
+ if (parent) {
+ _notmuch_database_log (notmuch, "nested field: '%s' inside '%s'\n",
+ prefix->name, parent->name);
+ return NOTMUCH_STATUS_BAD_QUERY_SYNTAX;
+ }
+ parent = prefix;
+ }
+
return _sexp_combine_query (notmuch, parent, prefix->xapian_op, prefix->initial,
sx->list->next, output);
}
diff --git a/test/T081-sexpr-search.sh b/test/T081-sexpr-search.sh
index 5e1bb18d..90cef50c 100755
--- a/test/T081-sexpr-search.sh
+++ b/test/T081-sexpr-search.sh
@@ -62,6 +62,55 @@ test_begin_subtest "single term in body, unstemmed version"
notmuch search --query=sexp '"arriv"' > OUTPUT
test_expect_equal_file /dev/null OUTPUT
+test_begin_subtest "Search by 'subject'"
+add_message [subject]=subjectsearchtest '[date]="Sat, 01 Jan 2000 12:00:00 -0000"'
+output=$(notmuch search --query=sexp '(subject subjectsearchtest)' | notmuch_search_sanitize)
+test_expect_equal "$output" "thread:XXX 2000-01-01 [1/1] Notmuch Test Suite; subjectsearchtest (inbox unread)"
+
+test_begin_subtest "Search by 'subject' (case insensitive)"
+notmuch search tag:inbox and subject:maildir | notmuch_search_sanitize > EXPECTED
+notmuch search --query=sexp '(subject "Maildir")' | notmuch_search_sanitize > OUTPUT
+test_expect_equal_file EXPECTED OUTPUT
+
+test_begin_subtest "Search by 'subject' (utf-8):"
+add_message [subject]=utf8-sübjéct '[date]="Sat, 01 Jan 2000 12:00:00 -0000"'
+output=$(notmuch search --query=sexp '(subject utf8 sübjéct)' | notmuch_search_sanitize)
+test_expect_equal "$output" "thread:XXX 2000-01-01 [1/1] Notmuch Test Suite; utf8-sübjéct (inbox unread)"
+
+test_begin_subtest "Search by 'subject' (utf-8, and):"
+output=$(notmuch search --query=sexp '(subject (and utf8 sübjéct))' | notmuch_search_sanitize)
+test_expect_equal "$output" "thread:XXX 2000-01-01 [1/1] Notmuch Test Suite; utf8-sübjéct (inbox unread)"
+
+test_begin_subtest "Search by 'subject' (utf-8, and outside):"
+output=$(notmuch search --query=sexp '(and (subject utf8) (subject sübjéct))' | notmuch_search_sanitize)
+test_expect_equal "$output" "thread:XXX 2000-01-01 [1/1] Notmuch Test Suite; utf8-sübjéct (inbox unread)"
+
+test_begin_subtest "Search by 'subject' (utf-8, or):"
+notmuch search --query=sexp '(subject (or utf8 subjectsearchtest))' | notmuch_search_sanitize > OUTPUT
+cat <<EOF > EXPECTED
+thread:XXX 2000-01-01 [1/1] Notmuch Test Suite; subjectsearchtest (inbox unread)
+thread:XXX 2000-01-01 [1/1] Notmuch Test Suite; utf8-sübjéct (inbox unread)
+EOF
+test_expect_equal_file EXPECTED OUTPUT
+
+test_begin_subtest "Search by 'subject' (utf-8, or outside):"
+notmuch search --query=sexp '(or (subject utf8) (subject subjectsearchtest))' | notmuch_search_sanitize > OUTPUT
+cat <<EOF > EXPECTED
+thread:XXX 2000-01-01 [1/1] Notmuch Test Suite; subjectsearchtest (inbox unread)
+thread:XXX 2000-01-01 [1/1] Notmuch Test Suite; utf8-sübjéct (inbox unread)
+EOF
+test_expect_equal_file EXPECTED OUTPUT
+
+test_begin_subtest "Search by 'subject' (utf-8, phrase-token):"
+test_subtest_known_broken
+output=$(notmuch search --query=sexp '(subject utf8-sübjéct)' | notmuch_search_sanitize)
+test_expect_equal "$output" "thread:XXX 2000-01-01 [1/1] Notmuch Test Suite; utf8-sübjéct (inbox unread)"
+
+test_begin_subtest "Search by 'subject' (utf-8, quoted string):"
+test_subtest_known_broken
+output=$(notmuch search --query=sexp '(subject "utf8 sübjéct")' | notmuch_search_sanitize)
+test_expect_equal "$output" "thread:XXX 2000-01-01 [1/1] Notmuch Test Suite; utf8-sübjéct (inbox unread)"
+
test_begin_subtest "Unbalanced parens"
# A code 1 indicates the error was handled (a crash will return e.g. 139).
test_expect_code 1 "notmuch search --query=sexp '('"
@@ -90,4 +139,12 @@ unexpected list in field/operation position
EOF
test_expect_equal_file EXPECTED OUTPUT
+test_begin_subtest "illegal nesting"
+notmuch search --query=sexp '(subject (subject foo))' >OUTPUT 2>&1
+cat <<EOF > EXPECTED
+notmuch search: Syntax error in query
+nested field: 'subject' inside 'subject'
+EOF
+test_expect_equal_file EXPECTED OUTPUT
+
test_done
--
2.32.0