unofficial mirror of notmuch@notmuchmail.org
* [WIP 0/2] Thread based searching
@ 2012-03-20 20:11 Mark Walters
  2012-03-20 20:11 ` [WIP 1/2] lib: multithread Mark Walters
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Mark Walters @ 2012-03-20 20:11 UTC (permalink / raw)
  To: notmuch

This is very definitely only a work in progress, but there have been
several requests for this or something similar on IRC.

It implements a very crude form of thread based search: the user can
ask for all the threads that have a message matching the primary query
and have a (possibly different) message matching a secondary query.

For example notmuch search --secondary-search from:A :AND: from:B
returns all threads that have a message from A and a message from B.

Similarly, it allows the user to say "and not" in a thread-based way.

For example notmuch search --secondary-search from:A :AND_NOT:
tag:mute returns all threads that have a message from A and no message
with tag:mute.
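
To make the intended semantics concrete, here is a minimal sketch in
plain C of the thread-level test, using a toy in-memory model rather
than the real notmuch data structures. Every name in it
(toy_message_t, toy_thread_matches and so on) is made up for
illustration and is not part of the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a message: only the fields this example needs. */
typedef struct {
    const char *from;
    bool muted;			/* stands in for tag:mute */
} toy_message_t;

typedef enum { TOY_AND, TOY_AND_NOT } toy_conjunction_t;

/* A "query" in this toy model is just a predicate on one message. */
typedef bool (*toy_predicate_t) (const toy_message_t *msg);

bool toy_from_a (const toy_message_t *msg) { return strcmp (msg->from, "A") == 0; }
bool toy_from_b (const toy_message_t *msg) { return strcmp (msg->from, "B") == 0; }
bool toy_is_muted (const toy_message_t *msg) { return msg->muted; }

/* Does any message of the thread satisfy pred? */
static bool
toy_any (const toy_message_t *msgs, size_t n, toy_predicate_t pred)
{
    for (size_t i = 0; i < n; i++)
	if (pred (&msgs[i]))
	    return true;
    return false;
}

/* A thread is returned if some message matches the primary query and,
 * depending on the conjunction, some (AND) or no (AND_NOT) message
 * matches the secondary query. The two matches may be different
 * messages. */
bool
toy_thread_matches (const toy_message_t *msgs, size_t n,
		    toy_predicate_t primary, toy_predicate_t secondary,
		    toy_conjunction_t conjunction)
{
    bool secondary_hit;

    if (! toy_any (msgs, n, primary))
	return false;
    secondary_hit = toy_any (msgs, n, secondary);
    return conjunction == TOY_AND ? secondary_hit : ! secondary_hit;
}

/* Two sample threads. */
const toy_message_t toy_thread_ab[] = { { "A", false }, { "B", false } };
const toy_message_t toy_thread_a_muted[] = { { "A", false }, { "C", true } };
```

With these two sample threads, from:A :AND: from:B keeps only
toy_thread_ab, and from:A :AND_NOT: tag:mute also keeps only
toy_thread_ab, because the second thread contains a muted message.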

Anything allowing queries of this form has to do some parsing of the
query itself (rather than leaving it all to Xapian). To keep things as
simple as possible, this version only attempts that parsing when
passed --secondary-search. It assumes that the last command line
argument is the entire secondary search (so any complex secondary
search should be quoted as a single argument) and that the penultimate
command line argument is either :AND: or :AND_NOT:, for thread-based
"and" or thread-based "and not" respectively.

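The argument convention described above can be sketched as a small
standalone C function. This is only an illustration of the rule "last
argument is the secondary search, penultimate argument is the
conjunction"; the names ss_split_t and ss_split_args are hypothetical,
and argv here is assumed to hold only the search terms left over after
option parsing.

```c
#include <stddef.h>
#include <string.h>

typedef enum { SS_NONE, SS_AND, SS_AND_NOT } ss_conjunction_t;

typedef struct {
    ss_conjunction_t conjunction;
    const char *secondary_terms;	/* NULL if no secondary search was found */
    int primary_argc;			/* leading arguments forming the primary query */
} ss_split_t;

/* Split the search terms into a primary part and an optional
 * secondary part. If the penultimate argument is not a recognised
 * conjunction, everything is treated as the primary query. */
ss_split_t
ss_split_args (int argc, char *argv[])
{
    ss_split_t out = { SS_NONE, NULL, argc };

    if (argc < 2)
	return out;

    if (strcmp (argv[argc - 2], ":AND:") == 0)
	out.conjunction = SS_AND;
    else if (strcmp (argv[argc - 2], ":AND_NOT:") == 0)
	out.conjunction = SS_AND_NOT;

    if (out.conjunction != SS_NONE) {
	out.secondary_terms = argv[argc - 1];
	out.primary_argc = argc - 2;
    }

    return out;
}
```

For the terms from:A :AND_NOT: tag:mute this yields a primary query of
one argument (from:A), the conjunction SS_AND_NOT and the secondary
search tag:mute. A multi-word secondary search only works if the shell
passes it as one argument, which is why it needs its own quoting.
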
Finally, the two queries play different roles even in the :AND:
case. The threads returned are exactly those that match the primary
query, in the order the primary query would normally give, filtered
down to those containing a message matching the secondary query. Thus
a search for query_a :AND: query_b and a search for query_b :AND:
query_a can return threads in different orders, and one may be much
faster than the other.
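
The asymmetry can be illustrated with a toy model in C, in which each
query is represented simply by the list of thread ids it would return,
in its own result order. The function name and the integer thread ids
are assumptions made for this sketch; the real code filters threads
one at a time while iterating rather than working on arrays.

```c
#include <stdbool.h>
#include <stddef.h>

/* Is id present in ids[0..n)? */
static bool
toy_contains (const int *ids, size_t n, int id)
{
    for (size_t i = 0; i < n; i++)
	if (ids[i] == id)
	    return true;
    return false;
}

/* Copy to out those entries of primary that also occur in secondary,
 * keeping the order of primary. out must have room for n_primary
 * entries. Returns the number of entries written. */
size_t
toy_filter_in_primary_order (const int *primary, size_t n_primary,
			     const int *secondary, size_t n_secondary,
			     int *out)
{
    size_t n_out = 0;

    for (size_t i = 0; i < n_primary; i++)
	if (toy_contains (secondary, n_secondary, primary[i]))
	    out[n_out++] = primary[i];

    return n_out;
}
```

If query_a alone would return threads 3, 1, 2 and query_b alone would
return threads 2, 3, 4, then filtering query_a by query_b gives 3, 2
while filtering query_b by query_a gives 2, 3. The cost also differs:
the secondary test runs once per thread in the primary result, so a
primary query matching few threads is the cheaper way round.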

At the moment this is purely lib and cli (i.e. no emacs interface).

It is also not heavily tested and interactions with excludes or
anything unusual could easily give strange results.

Anyway I am just posting this in case anyone is interested.

Best wishes

Mark

Mark Walters (2):
  lib: multithread
  cli: search: multithread

 lib/notmuch.h    |   12 ++++++++++++
 lib/query.cc     |   46 +++++++++++++++++++++++++++++++++++++++++++++-
 notmuch-search.c |   19 +++++++++++++++++++
 3 files changed, 76 insertions(+), 1 deletions(-)

-- 
1.7.9.1


* [WIP 1/2] lib: multithread
@ 2012-03-20 20:11 ` Mark Walters
From: Mark Walters @ 2012-03-20 20:11 UTC (permalink / raw)
  To: notmuch

---
 lib/notmuch.h |   12 ++++++++++++
 lib/query.cc  |   46 +++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 57 insertions(+), 1 deletions(-)

diff --git a/lib/notmuch.h b/lib/notmuch.h
index babd208..22dd69e 100644
--- a/lib/notmuch.h
+++ b/lib/notmuch.h
@@ -445,6 +445,18 @@ typedef enum {
     NOTMUCH_SORT_UNSORTED
 } notmuch_sort_t;
 
+/* Values for notmuch_query_secondary_search_conjunction */
+typedef enum {
+    NOTMUCH_SECONDARY_SEARCH_NONE,
+    NOTMUCH_SECONDARY_SEARCH_AND,
+    NOTMUCH_SECONDARY_SEARCH_AND_NOT
+} notmuch_ss_conjunction_t;
+
+void
+notmuch_query_set_secondary_search (notmuch_query_t *query,
+				    const char *secondary_search_terms,
+				    notmuch_ss_conjunction_t conjunction);
+
 /* Return the query_string of this query. See notmuch_query_create. */
 const char *
 notmuch_query_get_query_string (notmuch_query_t *query);
diff --git a/lib/query.cc b/lib/query.cc
index 68ac1e4..0faef51 100644
--- a/lib/query.cc
+++ b/lib/query.cc
@@ -26,6 +26,8 @@
 struct _notmuch_query {
     notmuch_database_t *notmuch;
     const char *query_string;
+    const char *secondary_search_terms;
+    notmuch_ss_conjunction_t ss_conjunction;
     notmuch_sort_t sort;
     notmuch_string_list_t *exclude_terms;
     notmuch_bool_t omit_excluded_messages;
@@ -92,6 +94,8 @@ notmuch_query_create (notmuch_database_t *notmuch,
 
     query->exclude_terms = _notmuch_string_list_create (query);
 
+    query->secondary_search_terms = NULL;
+
     query->omit_excluded_messages = FALSE;
 
     return query;
@@ -122,6 +126,15 @@ notmuch_query_get_sort (notmuch_query_t *query)
 }
 
 void
+notmuch_query_set_secondary_search (notmuch_query_t *query,
+				    const char *secondary_search_terms,
+				    notmuch_ss_conjunction_t conjunction)
+{
+    query->secondary_search_terms = talloc_strdup (query, secondary_search_terms);
+    query->ss_conjunction = conjunction;
+}
+
+void
 notmuch_query_add_tag_exclude (notmuch_query_t *query, const char *tag)
 {
     char *term = talloc_asprintf (query, "%s%s", _find_prefix ("tag"), tag);
@@ -454,6 +467,36 @@ notmuch_query_destroy (notmuch_query_t *query)
     talloc_free (query);
 }
 
+/* This function tests whether the thread containing document with id
+ * seed_doc_id satisfies the secondary search terms of query.*/
+notmuch_bool_t
+notmuch_thread_test_secondary_search (notmuch_query_t *query, unsigned int seed_doc_id)
+{
+    int count;
+    notmuch_message_t *seed_message;
+    const char *thread_id;
+    char *thread_id_query_string;
+    notmuch_query_t *thread_id_query;
+
+    if (!query->secondary_search_terms) return TRUE;
+    seed_message = _notmuch_message_create (query, query->notmuch, seed_doc_id, NULL);
+
+    thread_id = notmuch_message_get_thread_id (seed_message);
+    thread_id_query_string = talloc_asprintf (query, "thread:%s and (%s)",
+					      thread_id,
+					      query->secondary_search_terms);
+    thread_id_query = notmuch_query_create (query->notmuch, thread_id_query_string);
+    count = notmuch_query_count_messages (thread_id_query);
+    switch (query->ss_conjunction) {
+    case NOTMUCH_SECONDARY_SEARCH_AND:
+	return (count > 0);
+    case NOTMUCH_SECONDARY_SEARCH_AND_NOT:
+	return (count == 0);
+    default:
+	return TRUE;
+    }
+}
+
 notmuch_bool_t
 notmuch_threads_valid (notmuch_threads_t *threads)
 {
@@ -462,7 +505,8 @@ notmuch_threads_valid (notmuch_threads_t *threads)
     while (threads->doc_id_pos < threads->doc_ids->len) {
 	doc_id = g_array_index (threads->doc_ids, unsigned int,
 				threads->doc_id_pos);
-	if (_notmuch_doc_id_set_contains (&threads->match_set, doc_id))
+	if (_notmuch_doc_id_set_contains (&threads->match_set, doc_id) &&
+	    notmuch_thread_test_secondary_search (threads->query, doc_id))
 	    break;
 
 	threads->doc_id_pos++;
-- 
1.7.9.1


* [WIP 2/2] cli: search: multithread
@ 2012-03-20 20:12 ` Mark Walters
From: Mark Walters @ 2012-03-20 20:12 UTC (permalink / raw)
  To: notmuch

---
 notmuch-search.c |   19 +++++++++++++++++++
 1 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/notmuch-search.c b/notmuch-search.c
index f6061e4..2fa4231 100644
--- a/notmuch-search.c
+++ b/notmuch-search.c
@@ -436,6 +436,9 @@ notmuch_search_command (void *ctx, int argc, char *argv[])
     int offset = 0;
     int limit = -1; /* unlimited */
     notmuch_bool_t no_exclude = FALSE;
+    notmuch_bool_t secondary_search = FALSE;
+    char *secondary_search_terms = NULL;
+    notmuch_ss_conjunction_t ss_conjunction = NOTMUCH_SECONDARY_SEARCH_NONE;
     unsigned int i;
 
     enum { NOTMUCH_FORMAT_JSON, NOTMUCH_FORMAT_TEXT }
@@ -458,6 +461,7 @@ notmuch_search_command (void *ctx, int argc, char *argv[])
 				  { "tags", OUTPUT_TAGS },
 				  { 0, 0 } } },
 	{ NOTMUCH_OPT_BOOLEAN, &no_exclude, "no-exclude", 'd', 0 },
+	{ NOTMUCH_OPT_BOOLEAN, &secondary_search, "secondary-search", 'd', 0 },
 	{ NOTMUCH_OPT_INT, &offset, "offset", 'O', 0 },
 	{ NOTMUCH_OPT_INT, &limit, "limit", 'L', 0  },
 	{ 0, 0, 0, 0, 0 }
@@ -478,6 +482,18 @@ notmuch_search_command (void *ctx, int argc, char *argv[])
 	break;
     }
 
+    if (secondary_search && argc-opt_index >= 2 ) {
+	if (strcmp (argv[argc - 2], ":AND:") == 0)
+	    ss_conjunction = NOTMUCH_SECONDARY_SEARCH_AND;
+	if (strcmp (argv[argc - 2], ":AND_NOT:") == 0)
+	    ss_conjunction = NOTMUCH_SECONDARY_SEARCH_AND_NOT;
+
+	if (ss_conjunction != NOTMUCH_SECONDARY_SEARCH_NONE) {
+	    secondary_search_terms = argv[argc - 1];
+	    argc -= 2;
+	}
+    }
+
     config = notmuch_config_open (ctx, NULL, NULL);
     if (config == NULL)
 	return 1;
@@ -505,6 +521,9 @@ notmuch_search_command (void *ctx, int argc, char *argv[])
 
     notmuch_query_set_sort (query, sort);
 
+    if (secondary_search_terms)
+	notmuch_query_set_secondary_search (query, secondary_search_terms, ss_conjunction);
+
     if (!no_exclude) {
 	const char **search_exclude_tags;
 	size_t search_exclude_tags_length;
-- 
1.7.9.1


* Re: [WIP 0/2] Thread based searching
@ 2012-04-14 19:15 ` Jameson Graef Rollins
From: Jameson Graef Rollins @ 2012-04-14 19:15 UTC (permalink / raw)
  To: Mark Walters, notmuch


On Tue, Mar 20 2012, Mark Walters <markwalters1009@gmail.com> wrote:
> It implements a very crude form of thread based search: the user can
> ask for all the threads that have a message matching the primary query
> and have a (possibly different) message matching a secondary query.
>
> For example notmuch search --secondary-search from:A :AND: from:B
> returns all threads that have a message from A and a message from B.

Hey, Mark.  I think this is a useful idea, but I think the
implementation is more complicated than it needs to be.  I would rather
just have a switch that allows me to search by thread, instead of by
message.  Something like just --by-thread would work for me.

I guess it might be nice to have such fine-grained control to specify
independent search terms for both messages and threads simultaneously,
but I really don't think I would ever use such capabilities, and the
additional syntax is a little confusing.  But I certainly would use the
ability to search entire threads instead of just messages.

jamie.


