* Parenthesised query breaks query it is embedded in
From: Sean Whitton @ 2022-02-23 15:20 UTC
To: notmuch

Hello,

I have this subquery:

    subject:("Cron <dak@" and "sudo -u dak-unpriv /srv/")

as one of many queries in a long one like this:

    (q1) or (q2) or (q3) ..

Including that particular subquery breaks the whole thing: results which
should be included by other disjuncts are not included.  I suspect but
have not confirmed that each disjunct after the problematic one stops
matching anything.

The query works fine on its own.  And I can work around the problem by
replacing it with

    subject:"Cron <dak@" and subject:"sudo -u dak-unpriv /srv/"

Seems like a parsing bug.

Thanks.

-- 
Sean Whitton

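The workaround described above amounts to repeating the prefix on each phrase instead of wrapping the phrases in one parenthesised group. A minimal sketch of that rewrite follows; `distribute_prefix` is a hypothetical helper, not part of notmuch, and it assumes the phrases contain no double quotes of their own.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not a notmuch API): builds
//   prefix:"p1" and prefix:"p2" ...
// from a prefix name and a list of phrases.
// Assumption: no phrase contains a double quote.
std::string
distribute_prefix (const std::string &prefix,
		   const std::vector<std::string> &phrases)
{
    std::string query;

    for (const std::string &phrase : phrases) {
	// Join the per-phrase terms with an explicit "and".
	if (! query.empty ())
	    query += " and ";
	query += prefix + ":\"" + phrase + "\"";
    }
    return query;
}
```

Called with the prefix `subject` and the two phrases from the report, this produces the second query quoted in the message.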
* fix for parsing bracketed expression
From: David Bremner @ 2022-02-25 2:41 UTC
To: Sean Whitton, notmuch

This is not a complete fix, which is hard because of the way we
implement regular expressions.  Sean's original examples still won't
work, but hopefully the tests in the second patch show how to make
something similar work.

This is probably a good time to mention that this kind of thing is
easier in the sexp query parser.

d

* [PATCH 1/2] test: known broken tests for bracketed terms in subject
From: David Bremner @ 2022-02-25 2:41 UTC
To: Sean Whitton, notmuch

The heuristics in the field processor currently incorrectly trigger
phrase parsing.
---
 test/T650-regexp-query.sh | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/test/T650-regexp-query.sh b/test/T650-regexp-query.sh
index 55dc6c88..4ee6b171 100755
--- a/test/T650-regexp-query.sh
+++ b/test/T650-regexp-query.sh
@@ -65,6 +65,24 @@ thread:XXX 2001-01-05 [1/1] Notmuch Test Suite; - (inbox unread)
 EOF
 test_expect_equal_file EXPECTED OUTPUT
 
+test_begin_subtest "bracketed subject search (with dquotes)"
+test_subtest_known_broken
+notmuch search subject:notmuch and subject:show > EXPECTED
+notmuch search 'subject:"(show notmuch)"' > OUTPUT
+test_expect_equal_file_nonempty EXPECTED OUTPUT
+
+test_begin_subtest "bracketed subject search (with dquotes and operator 'or')"
+test_subtest_known_broken
+notmuch search subject:notmuch or subject:show > EXPECTED
+notmuch search 'subject:"(notmuch or show)"' > OUTPUT
+test_expect_equal_file_nonempty EXPECTED OUTPUT
+
+test_begin_subtest "bracketed subject search (with dquotes and operator 'and')"
+test_subtest_known_broken
+notmuch search subject:notmuch and subject:show > EXPECTED
+notmuch search 'subject:"(notmuch and show)"' > OUTPUT
+test_expect_equal_file_nonempty EXPECTED OUTPUT
+
 test_begin_subtest "xapian wildcard search for from:"
 notmuch search --output=messages 'from:cwo*' > OUTPUT
 test_expect_equal_file cworth.msg-ids OUTPUT
-- 
2.34.1

* Re: [PATCH 1/2] test: known broken tests for bracketed terms in subject
From: David Bremner @ 2022-03-19 10:38 UTC
To: Sean Whitton, notmuch

David Bremner <david@tethera.net> writes:

> The heuristics in the field processor currently incorrectly trigger
> phrase parsing.

I have applied this series to master

d

* Re: [PATCH 1/2] test: known broken tests for bracketed terms in subject
From: Sean Whitton @ 2022-03-20 16:29 UTC
To: David Bremner, notmuch

Hello,

On Sat 19 Mar 2022 at 07:38am -03, David Bremner wrote:

> David Bremner <david@tethera.net> writes:
>
>> The heuristics in the field processor currently incorrectly trigger
>> phrase parsing.
>
> I have applied this series to master

Nice, thanks!

-- 
Sean Whitton

* [PATCH 2/2] lib: do not phrase parse prefixed bracketed subexpressions
From: David Bremner @ 2022-02-25 2:41 UTC
To: Sean Whitton, notmuch

Since Xapian does not preserve quotes when passing the subquery to a
field processor, we have to make a guess as to what the user
intended. Here the added assumption is that a string surrounded by
parens is not intended to be a phrase.
---
 doc/man7/notmuch-search-terms.rst |  6 ++++--
 lib/regexp-fields.cc              |  3 ++-
 test/T650-regexp-query.sh         | 13 ++++++++++---
 3 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/doc/man7/notmuch-search-terms.rst b/doc/man7/notmuch-search-terms.rst
index e80cc7d0..f8ad1edb 100644
--- a/doc/man7/notmuch-search-terms.rst
+++ b/doc/man7/notmuch-search-terms.rst
@@ -275,11 +275,13 @@ the same phrase.
 - a.list.of.words
 
 Both parenthesised lists of terms and quoted phrases are ok with
-probabilistic prefixes such as **to:**, **from:**, and **subject:**. In particular
+probabilistic prefixes such as **to:**, **from:**, and **subject:**.
+For prefixes supporting regex search, the parenthesised list should be
+quoted.  In particular
 
 ::
 
-   subject:(pizza free)
+   subject:"(pizza free)"
 
 is equivalent to
 
diff --git a/lib/regexp-fields.cc b/lib/regexp-fields.cc
index 7e9d959c..539915d8 100644
--- a/lib/regexp-fields.cc
+++ b/lib/regexp-fields.cc
@@ -227,7 +227,8 @@ RegexpFieldProcessor::operator() (const std::string & str)
 	     * phrase parsing, when possible */
 	    std::string query_str;
 
-	    if (*str.rbegin () != '*' || str.find (' ') != std::string::npos)
+	    if ((str.at (0) != '(' || *str.rbegin () != ')') &&
+		(*str.rbegin () != '*' || str.find (' ') != std::string::npos))
 		query_str = '"' + str + '"';
 	    else
 		query_str = str;
diff --git a/test/T650-regexp-query.sh b/test/T650-regexp-query.sh
index 4ee6b171..a9844501 100755
--- a/test/T650-regexp-query.sh
+++ b/test/T650-regexp-query.sh
@@ -66,23 +66,30 @@ EOF
 test_expect_equal_file EXPECTED OUTPUT
 
 test_begin_subtest "bracketed subject search (with dquotes)"
-test_subtest_known_broken
 notmuch search subject:notmuch and subject:show > EXPECTED
 notmuch search 'subject:"(show notmuch)"' > OUTPUT
 test_expect_equal_file_nonempty EXPECTED OUTPUT
 
 test_begin_subtest "bracketed subject search (with dquotes and operator 'or')"
-test_subtest_known_broken
 notmuch search subject:notmuch or subject:show > EXPECTED
 notmuch search 'subject:"(notmuch or show)"' > OUTPUT
 test_expect_equal_file_nonempty EXPECTED OUTPUT
 
 test_begin_subtest "bracketed subject search (with dquotes and operator 'and')"
-test_subtest_known_broken
 notmuch search subject:notmuch and subject:show > EXPECTED
 notmuch search 'subject:"(notmuch and show)"' > OUTPUT
 test_expect_equal_file_nonempty EXPECTED OUTPUT
 
+test_begin_subtest "bracketed subject search (with phrase, operator 'or')"
+notmuch search 'subject:"mailing list"' or subject:FreeBSD > EXPECTED
+notmuch search 'subject:"(""mailing list"" or FreeBSD)"' > OUTPUT
+test_expect_equal_file_nonempty EXPECTED OUTPUT
+
+test_begin_subtest "bracketed subject search (with phrase, operator 'and')"
+notmuch search 'subject:"notmuch show"' and subject:commands > EXPECTED
+notmuch search 'subject:"(""notmuch show"" and commands)"' > OUTPUT
+test_expect_equal_file_nonempty EXPECTED OUTPUT
+
 test_begin_subtest "xapian wildcard search for from:"
 notmuch search --output=messages 'from:cwo*' > OUTPUT
 test_expect_equal_file cworth.msg-ids OUTPUT
-- 
2.34.1

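For readers who want to try the quoting guess outside of notmuch, the condition from the lib/regexp-fields.cc hunk can be pulled out into a standalone function. This is only a sketch: `maybe_quote_as_phrase` is a hypothetical name, the non-empty precondition is an assumption of the sketch, and the real code goes on to parse the resulting string, which is not shown here.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not a notmuch API): mirrors the condition the patch
// adds to RegexpFieldProcessor::operator().  Returns the string that would
// be handed on for further query parsing.
std::string
maybe_quote_as_phrase (const std::string &str)
{
    // Assumption of this sketch: the field value is non-empty.
    assert (! str.empty ());

    // A value surrounded by parens is taken to be a subexpression.
    const bool bracketed = str.front () == '(' && str.back () == ')';
    // A single word ending in '*' is taken to be a wildcard.
    const bool wildcard = str.back () == '*' &&
			  str.find (' ') == std::string::npos;

    // Both cases are passed through unchanged; anything else is wrapped
    // in double quotes so that it is parsed as a phrase.
    if (bracketed || wildcard)
	return str;
    return '"' + str + '"';
}
```

With this, `(notmuch or show)` and `cwo*` come back unchanged, while `notmuch show` comes back wrapped in double quotes. The condition is the De Morgan form of the one in the patch, written out with named booleans to make the two exemptions easier to see.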
Thread overview: 6+ messages
2022-02-23 15:20 Parenthesised query breaks query it is embedded in Sean Whitton
2022-02-25  2:41 ` fix for parsing bracketed expression David Bremner
2022-02-25  2:41   ` [PATCH 1/2] test: known broken tests for bracketed terms in subject David Bremner
2022-03-19 10:38     ` David Bremner
2022-03-20 16:29       ` Sean Whitton
2022-02-25  2:41   ` [PATCH 2/2] lib: do not phrase parse prefixed bracketed subexpressions David Bremner