From: David Bremner <david@tethera.net>
To: notmuch@notmuchmail.org
Cc: David Bremner <david@tethera.net>
Subject: [PATCH 08/27] lib/parse-sexp: stem unquoted atoms
Date: Fri, 30 Jul 2021 09:55:48 -0300
Message-ID: <20210730125607.2165433-9-david@tethera.net>
In-Reply-To: <20210730125607.2165433-1-david@tethera.net>
Stem only unquoted atoms (basic values); quoted strings are matched
without stemming. This is somewhat less DWIM than the Xapian query
parser, but it has the advantage of simplicity.
---
doc/man7/notmuch-sexp-queries.rst | 10 ++++++++--
lib/parse-sexp.cc | 10 +++++++---
test/T081-sexpr-search.sh | 7 +++++--
3 files changed, 20 insertions(+), 7 deletions(-)
diff --git a/doc/man7/notmuch-sexp-queries.rst b/doc/man7/notmuch-sexp-queries.rst
index e530912c..8a3bcd8b 100644
--- a/doc/man7/notmuch-sexp-queries.rst
+++ b/doc/man7/notmuch-sexp-queries.rst
@@ -41,8 +41,10 @@ subqueries.
The empty list matches all messages
*term*
- Match all messages containing *term*, possibly after stemming
- or phase splitting.
+ Match all messages containing *term*, possibly after
+ stemming or phrase splitting. For discussion of stemming in
+ notmuch see :any:`notmuch-search-terms(7)`. Stemming only applies
+ to unquoted terms (basic values) in s-expression queries.
``(`` *field* |q1| |q2| ... |qn| ``)``
Restrict the queries |q1| to |qn| to *field*, and combine with *and*
@@ -76,6 +78,10 @@ EXAMPLES
``Wizard``
Match all messages containing the word "wizard", ignoring case.
+``added``
+ Match all messages containing "added", but also those containing "add", "additional",
+ "Additional", "adds", etc., via stemming.
+
.. |q1| replace:: :math:`q_1`
.. |q2| replace:: :math:`q_2`
.. |qn| replace:: :math:`q_n`
diff --git a/lib/parse-sexp.cc b/lib/parse-sexp.cc
index 1ce3c9d4..1be5e209 100644
--- a/lib/parse-sexp.cc
+++ b/lib/parse-sexp.cc
@@ -1,5 +1,4 @@
-#include <xapian.h>
-#include "notmuch-private.h"
+#include "database-private.h"
#include "sexp.h"
#if HAVE_SFSEXP
@@ -17,7 +16,12 @@ _sexp_to_xapian_query (notmuch_database_t *notmuch, const sexp_t *sx,
{
if (sx->ty == SEXP_VALUE) {
- output = Xapian::Query (Xapian::Unicode::tolower (sx->val));
+ std::string term = Xapian::Unicode::tolower (sx->val);
+ Xapian::Stem stem = *(notmuch->stemmer);
+ if (sx->aty == SEXP_BASIC)
+ term = "Z" + stem (term);
+
+ output = Xapian::Query (term);
return NOTMUCH_STATUS_SUCCESS;
}
diff --git a/test/T081-sexpr-search.sh b/test/T081-sexpr-search.sh
index 3ee9f71d..c5c3cf6b 100755
--- a/test/T081-sexpr-search.sh
+++ b/test/T081-sexpr-search.sh
@@ -22,18 +22,21 @@ EOF
test_expect_equal_file EXPECTED OUTPUT
test_begin_subtest "single term in body (case insensitive)"
-notmuch search --query-syntax=sexp 'Wizard' | notmuch_search_sanitize>OUTPUT
+notmuch search --query-syntax=sexp '"Wizard"' | notmuch_search_sanitize>OUTPUT
cat <<EOF > EXPECTED
thread:XXX 2009-11-18 [1/3] Carl Worth| Jan Janak; [notmuch] What a great idea! (inbox unread)
EOF
test_expect_equal_file EXPECTED OUTPUT
test_begin_subtest "single term in body, stemmed version"
-test_subtest_known_broken
notmuch search arriv > EXPECTED
notmuch search --query-syntax=sexp arriv > OUTPUT
test_expect_equal_file EXPECTED OUTPUT
+test_begin_subtest "single term in body, unstemmed version"
+notmuch search --query-syntax=sexp '"arriv"' > OUTPUT
+test_expect_equal_file /dev/null OUTPUT
+
test_begin_subtest "Unbalanced parens"
# A code 1 indicates the error was handled (a crash will return e.g. 139).
test_expect_code 1 "notmuch search --query-syntax=sexp '('"
--
2.30.2
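
The term-building rule in the lib/parse-sexp.cc hunk above can be sketched in isolation. This is only an illustration under stated assumptions: `Atom`, `toy_stem` and `term_for_atom` are hypothetical names that do not appear in the patch, `toy_stem` is a crude suffix stripper standing in for the Xapian stemmer, and ASCII `std::tolower` stands in for `Xapian::Unicode::tolower`. The only parts taken from the patch are the order of operations (lowercase first, then stem) and the "Z" prefix applied to stemmed terms.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical stand-in for a parsed atom. The patch itself inspects
// sfsexp's sexp_t and treats aty == SEXP_BASIC as "unquoted".
struct Atom {
    std::string val;
    bool quoted;
};

// Toy suffix stripper for illustration only. This is NOT the algorithm
// used by Xapian::Stem; it merely gives the sketch something to call.
static std::string toy_stem (const std::string &word)
{
    static const char *const suffixes[] = { "ing", "ed", "s" };

    for (const char *s : suffixes) {
        const std::string suffix (s);
        // Only strip when at least three characters would remain.
        if (word.size () > suffix.size () + 2 &&
            word.compare (word.size () - suffix.size (), suffix.size (), suffix) == 0)
            return word.substr (0, word.size () - suffix.size ());
    }
    return word;
}

// Mirrors the shape of the patched branch: lowercase every atom, but
// stem and add the "Z" prefix only when the atom was not quoted.
static std::string term_for_atom (const Atom &atom)
{
    std::string term = atom.val;

    std::transform (term.begin (), term.end (), term.begin (),
                    [] (unsigned char c) { return static_cast<char> (std::tolower (c)); });
    if (! atom.quoted)
        term = "Z" + toy_stem (term);
    return term;
}
```

With this toy stemmer, `term_for_atom ({"Arrived", false})` gives `Zarriv`, while the quoted form `term_for_atom ({"Arrived", true})` gives the unstemmed `arrived`. The real stemmer may produce different stems for other words.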