unofficial mirror of notmuch@notmuchmail.org
* Parenthesised query breaks query it is embedded in
@ 2022-02-23 15:20 Sean Whitton
  2022-02-25  2:41 ` fix for parsing bracketed expression David Bremner
  0 siblings, 1 reply; 6+ messages in thread
From: Sean Whitton @ 2022-02-23 15:20 UTC (permalink / raw)
  To: notmuch

Hello,

I have this subquery:

    subject:("Cron <dak@" and "sudo -u dak-unpriv /srv/")

as one of many queries in a long one like this: (q1) or (q2) or (q3) ...

Including that particular subquery breaks the whole thing: results which
should be included by other disjuncts are not included.  I suspect but
have not confirmed that each disjunct after the problematic one stops
matching anything.

The query works fine on its own.  And I can work around the problem by
replacing it with

    subject:"Cron <dak@" and subject:"sudo -u dak-unpriv /srv/"

Seems like a parsing bug.  Thanks.

-- 
Sean Whitton


* fix for parsing bracketed expression
  2022-02-23 15:20 Parenthesised query breaks query it is embedded in Sean Whitton
@ 2022-02-25  2:41 ` David Bremner
  2022-02-25  2:41   ` [PATCH 1/2] test: known broken tests for bracketed terms in subject David Bremner
  2022-02-25  2:41   ` [PATCH 2/2] lib: do not phrase parse prefixed bracketed subexpressions David Bremner
  0 siblings, 2 replies; 6+ messages in thread
From: David Bremner @ 2022-02-25  2:41 UTC (permalink / raw)
  To: Sean Whitton, notmuch

This is not a complete fix; a complete one is hard because of the way
we implement regular expressions. Sean's original examples still won't
work, but hopefully the tests in the second patch show how to make
something similar work. This is probably a good time to mention that
this kind of thing is easier in the sexp query parser.

d



* [PATCH 1/2] test: known broken tests for bracketed terms in subject
  2022-02-25  2:41 ` fix for parsing bracketed expression David Bremner
@ 2022-02-25  2:41   ` David Bremner
  2022-03-19 10:38     ` David Bremner
  2022-02-25  2:41   ` [PATCH 2/2] lib: do not phrase parse prefixed bracketed subexpressions David Bremner
  1 sibling, 1 reply; 6+ messages in thread
From: David Bremner @ 2022-02-25  2:41 UTC (permalink / raw)
  To: Sean Whitton, notmuch

The heuristics in the field processor currently incorrectly trigger
phrase parsing.
---
 test/T650-regexp-query.sh | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/test/T650-regexp-query.sh b/test/T650-regexp-query.sh
index 55dc6c88..4ee6b171 100755
--- a/test/T650-regexp-query.sh
+++ b/test/T650-regexp-query.sh
@@ -65,6 +65,24 @@ thread:XXX   2001-01-05 [1/1] Notmuch Test Suite; - (inbox unread)
 EOF
 test_expect_equal_file EXPECTED OUTPUT
 
+test_begin_subtest "bracketed subject search (with dquotes)"
+test_subtest_known_broken
+notmuch search subject:notmuch and subject:show > EXPECTED
+notmuch search 'subject:"(show notmuch)"' > OUTPUT
+test_expect_equal_file_nonempty EXPECTED OUTPUT
+
+test_begin_subtest "bracketed subject search (with dquotes and operator 'or')"
+test_subtest_known_broken
+notmuch search subject:notmuch or subject:show > EXPECTED
+notmuch search 'subject:"(notmuch or show)"' > OUTPUT
+test_expect_equal_file_nonempty EXPECTED OUTPUT
+
+test_begin_subtest "bracketed subject search (with dquotes and operator 'and')"
+test_subtest_known_broken
+notmuch search subject:notmuch and subject:show > EXPECTED
+notmuch search 'subject:"(notmuch and show)"' > OUTPUT
+test_expect_equal_file_nonempty EXPECTED OUTPUT
+
 test_begin_subtest "xapian wildcard search for from:"
 notmuch search --output=messages 'from:cwo*' > OUTPUT
 test_expect_equal_file cworth.msg-ids OUTPUT
-- 
2.34.1
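
To see why the subtests above are marked known broken, here is a
minimal standalone sketch of the quoting condition as it stands in
lib/regexp-fields.cc before the fix. The function name
old_guess_query_string is hypothetical, the empty-string guard is an
added assumption, and the surrounding regex handling in the field
processor is omitted.

```cpp
#include <string>

// Hypothetical helper (not a notmuch function) mirroring the pre-fix
// condition in RegexpFieldProcessor::operator().
std::string
old_guess_query_string (const std::string &str)
{
    // Assumption for this sketch: pass an empty string through unchanged.
    if (str.empty ())
        return str;

    // Wrap in double quotes (forcing phrase parsing) unless the string
    // is a single wildcard term such as "cwo*".
    if (str.back () != '*' || str.find (' ') != std::string::npos)
        return '"' + str + '"';

    return str;
}
```

Under this condition a string such as (notmuch or show) ends in ')'
rather than '*', so it is quoted and then parsed as a phrase, which is
the behaviour the new subtests flag.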


* [PATCH 2/2] lib: do not phrase parse prefixed bracketed subexpressions
  2022-02-25  2:41 ` fix for parsing bracketed expression David Bremner
  2022-02-25  2:41   ` [PATCH 1/2] test: known broken tests for bracketed terms in subject David Bremner
@ 2022-02-25  2:41   ` David Bremner
  1 sibling, 0 replies; 6+ messages in thread
From: David Bremner @ 2022-02-25  2:41 UTC (permalink / raw)
  To: Sean Whitton, notmuch

Since Xapian does not preserve quotes when passing the subquery to a
field processor, we have to make a guess as to what the user
intended. Here the added assumption is that a string surrounded by
parens is not intended to be a phrase.
---
 doc/man7/notmuch-search-terms.rst |  6 ++++--
 lib/regexp-fields.cc              |  3 ++-
 test/T650-regexp-query.sh         | 13 ++++++++++---
 3 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/doc/man7/notmuch-search-terms.rst b/doc/man7/notmuch-search-terms.rst
index e80cc7d0..f8ad1edb 100644
--- a/doc/man7/notmuch-search-terms.rst
+++ b/doc/man7/notmuch-search-terms.rst
@@ -275,11 +275,13 @@ the same phrase.
 - a.list.of.words
 
 Both parenthesised lists of terms and quoted phrases are ok with
-probabilistic prefixes such as **to:**, **from:**, and **subject:**. In particular
+probabilistic prefixes such as **to:**, **from:**, and **subject:**.
+For prefixes supporting regex search, the parenthesised list should be
+quoted.  In particular
 
 ::
 
-   subject:(pizza free)
+   subject:"(pizza free)"
 
 is equivalent to
 
diff --git a/lib/regexp-fields.cc b/lib/regexp-fields.cc
index 7e9d959c..539915d8 100644
--- a/lib/regexp-fields.cc
+++ b/lib/regexp-fields.cc
@@ -227,7 +227,8 @@ RegexpFieldProcessor::operator() (const std::string & str)
 	     * phrase parsing, when possible */
 	    std::string query_str;
 
-	    if (*str.rbegin () != '*' || str.find (' ') != std::string::npos)
+	    if ((str.at (0) != '(' || *str.rbegin () != ')') &&
+		(*str.rbegin () != '*' || str.find (' ') != std::string::npos))
 		query_str = '"' + str + '"';
 	    else
 		query_str = str;
diff --git a/test/T650-regexp-query.sh b/test/T650-regexp-query.sh
index 4ee6b171..a9844501 100755
--- a/test/T650-regexp-query.sh
+++ b/test/T650-regexp-query.sh
@@ -66,23 +66,30 @@ EOF
 test_expect_equal_file EXPECTED OUTPUT
 
 test_begin_subtest "bracketed subject search (with dquotes)"
-test_subtest_known_broken
 notmuch search subject:notmuch and subject:show > EXPECTED
 notmuch search 'subject:"(show notmuch)"' > OUTPUT
 test_expect_equal_file_nonempty EXPECTED OUTPUT
 
 test_begin_subtest "bracketed subject search (with dquotes and operator 'or')"
-test_subtest_known_broken
 notmuch search subject:notmuch or subject:show > EXPECTED
 notmuch search 'subject:"(notmuch or show)"' > OUTPUT
 test_expect_equal_file_nonempty EXPECTED OUTPUT
 
 test_begin_subtest "bracketed subject search (with dquotes and operator 'and')"
-test_subtest_known_broken
 notmuch search subject:notmuch and subject:show > EXPECTED
 notmuch search 'subject:"(notmuch and show)"' > OUTPUT
 test_expect_equal_file_nonempty EXPECTED OUTPUT
 
+test_begin_subtest "bracketed subject search (with phrase, operator 'or')"
+notmuch search 'subject:"mailing list"' or subject:FreeBSD > EXPECTED
+notmuch search  'subject:"(""mailing list"" or FreeBSD)"' > OUTPUT
+test_expect_equal_file_nonempty EXPECTED OUTPUT
+
+test_begin_subtest "bracketed subject search (with phrase, operator 'and')"
+notmuch search 'subject:"notmuch show"' and subject:commands > EXPECTED
+notmuch search  'subject:"(""notmuch show"" and commands)"' > OUTPUT
+test_expect_equal_file_nonempty EXPECTED OUTPUT
+
 test_begin_subtest "xapian wildcard search for from:"
 notmuch search --output=messages 'from:cwo*' > OUTPUT
 test_expect_equal_file cworth.msg-ids OUTPUT
-- 
2.34.1
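
The condition changed in lib/regexp-fields.cc can be read as "quote
the string unless it is bracketed or a single wildcard term". A
minimal standalone sketch of that reading follows. The function name
guess_query_string and the two boolean names are hypothetical, the
empty-string guard is an added assumption (str.at (0) in the patch
would throw on an empty string), and the rest of the field processor
is omitted.

```cpp
#include <string>

// Hypothetical helper (not a notmuch function) mirroring the patched
// condition in RegexpFieldProcessor::operator().
std::string
guess_query_string (const std::string &str)
{
    // Assumption for this sketch: pass an empty string through unchanged.
    if (str.empty ())
        return str;

    // A string surrounded by parens is assumed not to be a phrase.
    const bool bracketed = str.front () == '(' && str.back () == ')';

    // A single term ending in '*' is left alone for wildcard search.
    const bool wildcard = str.back () == '*' &&
                          str.find (' ') == std::string::npos;

    // Everything else is wrapped in double quotes to force phrase parsing.
    if (! bracketed && ! wildcard)
        return '"' + str + '"';

    return str;
}
```

With this sketch, (notmuch or show) and cwo* are returned unchanged,
while notmuch show comes back wrapped in double quotes. By De Morgan's
laws the two negated booleans are equivalent to the two parenthesised
clauses in the patch.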


* Re: [PATCH 1/2] test: known broken tests for bracketed terms in subject
  2022-02-25  2:41   ` [PATCH 1/2] test: known broken tests for bracketed terms in subject David Bremner
@ 2022-03-19 10:38     ` David Bremner
  2022-03-20 16:29       ` Sean Whitton
  0 siblings, 1 reply; 6+ messages in thread
From: David Bremner @ 2022-03-19 10:38 UTC (permalink / raw)
  To: Sean Whitton, notmuch

David Bremner <david@tethera.net> writes:

> The heuristics in the field processor currently incorrectly trigger
> phrase parsing.

I have applied this series to master

d


* Re: [PATCH 1/2] test: known broken tests for bracketed terms in subject
  2022-03-19 10:38     ` David Bremner
@ 2022-03-20 16:29       ` Sean Whitton
  0 siblings, 0 replies; 6+ messages in thread
From: Sean Whitton @ 2022-03-20 16:29 UTC (permalink / raw)
  To: David Bremner, notmuch

Hello,

On Sat 19 Mar 2022 at 07:38am -03, David Bremner wrote:

> David Bremner <david@tethera.net> writes:
>
>> The heuristics in the field processor currently incorrectly trigger
>> phrase parsing.
>
> I have applied this series to master

Nice, thanks!

-- 
Sean Whitton

