* Parenthesised query breaks query it is embedded in
@ 2022-02-23 15:20 Sean Whitton
2022-02-25 2:41 ` fix for parsing bracketed expression David Bremner
0 siblings, 1 reply; 6+ messages in thread
From: Sean Whitton @ 2022-02-23 15:20 UTC (permalink / raw)
To: notmuch
Hello,
I have this subquery:
subject:("Cron <dak@" and "sudo -u dak-unpriv /srv/")
as one of many queries in a long one like this: (q1) or (q2) or (q3) ..
Including that particular subquery breaks the whole thing: results which
should be included by other disjuncts are not included. I suspect but
have not confirmed that each disjunct after the problematic one stops
matching anything.
The query works fine on its own. And I can work around the problem by
replacing it with
subject:"Cron <dak@" and subject:"sudo -u dak-unpriv /srv/"
Seems like a parsing bug. Thanks.
--
Sean Whitton
* fix for parsing bracketed expression
2022-02-23 15:20 Parenthesised query breaks query it is embedded in Sean Whitton
@ 2022-02-25 2:41 ` David Bremner
2022-02-25 2:41 ` [PATCH 1/2] test: known broken tests for bracketed terms in subject David Bremner
2022-02-25 2:41 ` [PATCH 2/2] lib: do not phrase parse prefixed bracketed subexpressions David Bremner
0 siblings, 2 replies; 6+ messages in thread
From: David Bremner @ 2022-02-25 2:41 UTC (permalink / raw)
To: Sean Whitton, notmuch
This is not a complete fix; a complete one is hard because of the way we
implement regular expressions. Sean's original examples still won't
work, but hopefully the tests in the second patch show how to make
something similar work. This is probably a good time to mention that
this kind of thing is easier in the sexp query parser.
d
* [PATCH 1/2] test: known broken tests for bracketed terms in subject
2022-02-25 2:41 ` fix for parsing bracketed expression David Bremner
@ 2022-02-25 2:41 ` David Bremner
2022-03-19 10:38 ` David Bremner
2022-02-25 2:41 ` [PATCH 2/2] lib: do not phrase parse prefixed bracketed subexpressions David Bremner
1 sibling, 1 reply; 6+ messages in thread
From: David Bremner @ 2022-02-25 2:41 UTC (permalink / raw)
To: Sean Whitton, notmuch
The heuristics in the field processor currently incorrectly trigger
phrase parsing.
---
test/T650-regexp-query.sh | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/test/T650-regexp-query.sh b/test/T650-regexp-query.sh
index 55dc6c88..4ee6b171 100755
--- a/test/T650-regexp-query.sh
+++ b/test/T650-regexp-query.sh
@@ -65,6 +65,24 @@ thread:XXX 2001-01-05 [1/1] Notmuch Test Suite; - (inbox unread)
EOF
test_expect_equal_file EXPECTED OUTPUT
+test_begin_subtest "bracketed subject search (with dquotes)"
+test_subtest_known_broken
+notmuch search subject:notmuch and subject:show > EXPECTED
+notmuch search 'subject:"(show notmuch)"' > OUTPUT
+test_expect_equal_file_nonempty EXPECTED OUTPUT
+
+test_begin_subtest "bracketed subject search (with dquotes and operator 'or')"
+test_subtest_known_broken
+notmuch search subject:notmuch or subject:show > EXPECTED
+notmuch search 'subject:"(notmuch or show)"' > OUTPUT
+test_expect_equal_file_nonempty EXPECTED OUTPUT
+
+test_begin_subtest "bracketed subject search (with dquotes and operator 'and')"
+test_subtest_known_broken
+notmuch search subject:notmuch and subject:show > EXPECTED
+notmuch search 'subject:"(notmuch and show)"' > OUTPUT
+test_expect_equal_file_nonempty EXPECTED OUTPUT
+
test_begin_subtest "xapian wildcard search for from:"
notmuch search --output=messages 'from:cwo*' > OUTPUT
test_expect_equal_file cworth.msg-ids OUTPUT
--
2.34.1
* [PATCH 2/2] lib: do not phrase parse prefixed bracketed subexpressions
2022-02-25 2:41 ` fix for parsing bracketed expression David Bremner
2022-02-25 2:41 ` [PATCH 1/2] test: known broken tests for bracketed terms in subject David Bremner
@ 2022-02-25 2:41 ` David Bremner
1 sibling, 0 replies; 6+ messages in thread
From: David Bremner @ 2022-02-25 2:41 UTC (permalink / raw)
To: Sean Whitton, notmuch
Since Xapian does not preserve quotes when passing the subquery to a
field processor, we have to make a guess as to what the user
intended. Here the added assumption is that a string surrounded by
parens is not intended to be a phrase.
---
doc/man7/notmuch-search-terms.rst | 6 ++++--
lib/regexp-fields.cc | 3 ++-
test/T650-regexp-query.sh | 13 ++++++++++---
3 files changed, 16 insertions(+), 6 deletions(-)
diff --git a/doc/man7/notmuch-search-terms.rst b/doc/man7/notmuch-search-terms.rst
index e80cc7d0..f8ad1edb 100644
--- a/doc/man7/notmuch-search-terms.rst
+++ b/doc/man7/notmuch-search-terms.rst
@@ -275,11 +275,13 @@ the same phrase.
- a.list.of.words
Both parenthesised lists of terms and quoted phrases are ok with
-probabilistic prefixes such as **to:**, **from:**, and **subject:**. In particular
+probabilistic prefixes such as **to:**, **from:**, and **subject:**.
+For prefixes supporting regex search, the parenthesised list should be
+quoted. In particular
::
- subject:(pizza free)
+ subject:"(pizza free)"
is equivalent to
diff --git a/lib/regexp-fields.cc b/lib/regexp-fields.cc
index 7e9d959c..539915d8 100644
--- a/lib/regexp-fields.cc
+++ b/lib/regexp-fields.cc
@@ -227,7 +227,8 @@ RegexpFieldProcessor::operator() (const std::string & str)
* phrase parsing, when possible */
std::string query_str;
- if (*str.rbegin () != '*' || str.find (' ') != std::string::npos)
+ if ((str.at (0) != '(' || *str.rbegin () != ')') &&
+ (*str.rbegin () != '*' || str.find (' ') != std::string::npos))
query_str = '"' + str + '"';
else
query_str = str;
diff --git a/test/T650-regexp-query.sh b/test/T650-regexp-query.sh
index 4ee6b171..a9844501 100755
--- a/test/T650-regexp-query.sh
+++ b/test/T650-regexp-query.sh
@@ -66,23 +66,30 @@ EOF
test_expect_equal_file EXPECTED OUTPUT
test_begin_subtest "bracketed subject search (with dquotes)"
-test_subtest_known_broken
notmuch search subject:notmuch and subject:show > EXPECTED
notmuch search 'subject:"(show notmuch)"' > OUTPUT
test_expect_equal_file_nonempty EXPECTED OUTPUT
test_begin_subtest "bracketed subject search (with dquotes and operator 'or')"
-test_subtest_known_broken
notmuch search subject:notmuch or subject:show > EXPECTED
notmuch search 'subject:"(notmuch or show)"' > OUTPUT
test_expect_equal_file_nonempty EXPECTED OUTPUT
test_begin_subtest "bracketed subject search (with dquotes and operator 'and')"
-test_subtest_known_broken
notmuch search subject:notmuch and subject:show > EXPECTED
notmuch search 'subject:"(notmuch and show)"' > OUTPUT
test_expect_equal_file_nonempty EXPECTED OUTPUT
+test_begin_subtest "bracketed subject search (with phrase, operator 'or')"
+notmuch search 'subject:"mailing list"' or subject:FreeBSD > EXPECTED
+notmuch search 'subject:"(""mailing list"" or FreeBSD)"' > OUTPUT
+test_expect_equal_file_nonempty EXPECTED OUTPUT
+
+test_begin_subtest "bracketed subject search (with phrase, operator 'and')"
+notmuch search 'subject:"notmuch show"' and subject:commands > EXPECTED
+notmuch search 'subject:"(""notmuch show"" and commands)"' > OUTPUT
+test_expect_equal_file_nonempty EXPECTED OUTPUT
+
test_begin_subtest "xapian wildcard search for from:"
notmuch search --output=messages 'from:cwo*' > OUTPUT
test_expect_equal_file cworth.msg-ids OUTPUT
--
2.34.1
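
For readers following the lib/regexp-fields.cc hunk, the quoting decision can be sketched as a standalone function. This is only an illustration under stated assumptions, not the code in notmuch: the names wants_phrase_quotes and build_query_string are hypothetical, and the empty-string guard is an addition that the patch itself does not contain.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of notmuch): returns true when the
// field processor would wrap the string in double quotes so that it is
// parsed as a phrase. Mirrors the condition in the patch.
bool
wants_phrase_quotes (const std::string &str)
{
    // Assumption: an empty string takes the default (quoted) branch.
    // The patch calls str.at (0) without such a guard.
    if (str.empty ())
	return true;

    // New in the patch: "(...)" is assumed not to be a phrase.
    const bool bracketed = str.front () == '(' && str.back () == ')';

    // Pre-existing rule: a single word ending in '*' is a wildcard.
    const bool bare_wildcard =
	str.back () == '*' && str.find (' ') == std::string::npos;

    return ! bracketed && ! bare_wildcard;
}

// Hypothetical helper: builds the string handed on for further parsing.
std::string
build_query_string (const std::string &str)
{
    return wants_phrase_quotes (str) ? '"' + str + '"' : str;
}
```

With these definitions, build_query_string ("(notmuch or show)") returns its input unchanged, while build_query_string ("notmuch show") returns the same text wrapped in double quotes. A wildcard term such as cwo* is also passed through unquoted, as it was before the patch.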
* Re: [PATCH 1/2] test: known broken tests for bracketed terms in subject
2022-02-25 2:41 ` [PATCH 1/2] test: known broken tests for bracketed terms in subject David Bremner
@ 2022-03-19 10:38 ` David Bremner
2022-03-20 16:29 ` Sean Whitton
0 siblings, 1 reply; 6+ messages in thread
From: David Bremner @ 2022-03-19 10:38 UTC (permalink / raw)
To: Sean Whitton, notmuch
David Bremner <david@tethera.net> writes:
> The heuristics in the field processor currently incorrectly trigger
> phrase parsing.
I have applied this series to master
d
* Re: [PATCH 1/2] test: known broken tests for bracketed terms in subject
2022-03-19 10:38 ` David Bremner
@ 2022-03-20 16:29 ` Sean Whitton
0 siblings, 0 replies; 6+ messages in thread
From: Sean Whitton @ 2022-03-20 16:29 UTC (permalink / raw)
To: David Bremner, notmuch
Hello,
On Sat 19 Mar 2022 at 07:38am -03, David Bremner wrote:
> David Bremner <david@tethera.net> writes:
>
>> The heuristics in the field processor currently incorrectly trigger
>> phrase parsing.
>
> I have applied this series to master
Nice, thanks!
--
Sean Whitton