unofficial mirror of notmuch@notmuchmail.org
 help / color / mirror / code / Atom feed
From: Jani Nikula <jani@nikula.org>
To: David Bremner <david@tethera.net>, notmuch@notmuchmail.org
Subject: Re: [PATCH 1/4] lib: add thread subqueries.
Date: Sun, 06 May 2018 20:59:24 +0300	[thread overview]
Message-ID: <87d0y8k6mb.fsf@nikula.org> (raw)
In-Reply-To: <20180505160523.9749-2-david@tethera.net>

On Sat, 05 May 2018, David Bremner <david@tethera.net> wrote:
> This change allows queries of the form
>
>  thread:{from:me} and thread:{from:jian} and not thread:{from:dave}
>
> This is still somewhat brute-force, but it's a big improvement over
> both the shell script solution and the previous proposal [1], because it
> does not build the whole thread structure just generate a
> query. A further potential optimization is to replace the calls to
> notmuch with more specialized Xapian code; in particular it's not
> likely that reading all of the message metadata is a win here.
>
> [1]: id:20170820213240.20526-1-david@tethera.net
> ---
>  lib/Makefile.local           |  3 +-
>  lib/database.cc              |  6 +++-
>  lib/thread-fp.cc             | 67 ++++++++++++++++++++++++++++++++++++
>  lib/thread-fp.h              | 42 ++++++++++++++++++++++
>  test/T585-thread-subquery.sh | 46 +++++++++++++++++++++++++
>  5 files changed, 162 insertions(+), 2 deletions(-)
>  create mode 100644 lib/thread-fp.cc
>  create mode 100644 lib/thread-fp.h
>  create mode 100755 test/T585-thread-subquery.sh
>
> diff --git a/lib/Makefile.local b/lib/Makefile.local
> index 8aa03891..5dc057c0 100644
> --- a/lib/Makefile.local
> +++ b/lib/Makefile.local
> @@ -58,7 +58,8 @@ libnotmuch_cxx_srcs =		\
>  	$(dir)/query-fp.cc      \
>  	$(dir)/config.cc	\
>  	$(dir)/regexp-fields.cc	\
> -	$(dir)/thread.cc
> +	$(dir)/thread.cc \
> +	$(dir)/thread-fp.cc
>  
>  libnotmuch_modules := $(libnotmuch_c_srcs:.c=.o) $(libnotmuch_cxx_srcs:.cc=.o)
>  
> diff --git a/lib/database.cc b/lib/database.cc
> index 02444e09..9cf8062c 100644
> --- a/lib/database.cc
> +++ b/lib/database.cc
> @@ -21,6 +21,7 @@
>  #include "database-private.h"
>  #include "parse-time-vrp.h"
>  #include "query-fp.h"
> +#include "thread-fp.h"
>  #include "regexp-fields.h"
>  #include "string-util.h"
>  
> @@ -258,7 +259,8 @@ prefix_t prefix_table[] = {
>      { "directory",		"XDIRECTORY",	NOTMUCH_FIELD_NO_FLAGS },
>      { "file-direntry",		"XFDIRENTRY",	NOTMUCH_FIELD_NO_FLAGS },
>      { "directory-direntry",	"XDDIRENTRY",	NOTMUCH_FIELD_NO_FLAGS },
> -    { "thread",			"G",		NOTMUCH_FIELD_EXTERNAL },
> +    { "thread",			"G",		NOTMUCH_FIELD_EXTERNAL |
> +						NOTMUCH_FIELD_PROCESSOR },
>      { "tag",			"K",		NOTMUCH_FIELD_EXTERNAL |
>  						NOTMUCH_FIELD_PROCESSOR },
>      { "is",			"K",		NOTMUCH_FIELD_EXTERNAL |
> @@ -317,6 +319,8 @@ _setup_query_field (const prefix_t *prefix, notmuch_database_t *notmuch)
>  	    fp = (new DateFieldProcessor())->release ();
>  	else if (STRNCMP_LITERAL(prefix->name, "query") == 0)
>  	    fp = (new QueryFieldProcessor (*notmuch->query_parser, notmuch))->release ();
> +	else if (STRNCMP_LITERAL(prefix->name, "thread") == 0)
> +	    fp = (new ThreadFieldProcessor (*notmuch->query_parser, notmuch))->release ();
>  	else
>  	    fp = (new RegexpFieldProcessor (prefix->name, prefix->flags,
>  					    *notmuch->query_parser, notmuch))->release ();
> diff --git a/lib/thread-fp.cc b/lib/thread-fp.cc
> new file mode 100644
> index 00000000..dd292bf6
> --- /dev/null
> +++ b/lib/thread-fp.cc
> @@ -0,0 +1,67 @@
> +/* thread-fp.cc - "thread:" field processor glue
> + *
> + * This file is part of notmuch.
> + *
> + * Copyright © 2018 David Bremner
> + *
> + * This program is free software: you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation, either version 3 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see https://www.gnu.org/licenses/ .
> + *
> + * Author: David Bremner <david@tethera.net>
> + */
> +
> +#include "database-private.h"
> +#include "thread-fp.h"
> +#include <iostream>
> +
> +#if HAVE_XAPIAN_FIELD_PROCESSOR
> +
> +Xapian::Query
> +ThreadFieldProcessor::operator() (const std::string & str)
> +{
> +    notmuch_status_t status;
> +    const char *thread_prefix = _find_prefix ("thread");
> +
> +    if (str.at (0) == '{') {
> +	if (str.length () > 1 && str.at (str.size () - 1) == '}') {

IIUC .length() and .size() are the same thing, but it's confusing to see
them both used on the same line.

Nitpick, I always favor dealing with error cases first, so you can do
the happy day scenario with less indent. So I'd check the opposite,
throw the error, and continue without the else. YMMV.

Otherwise, LGTM.

> +	    std::string subquery_str = str.substr (1, str.size () - 2);
> +	    notmuch_query_t *subquery = notmuch_query_create (notmuch, subquery_str.c_str ());
> +	    notmuch_messages_t *messages;
> +	    std::set<std::string> terms;
> +
> +	    if (! subquery)
> +		throw Xapian::QueryParserError ("failed to create subquery for '" + subquery_str + "'");
> +
> +	    status = notmuch_query_search_messages (subquery, &messages);
> +	    if (status)
> +		throw Xapian::QueryParserError ("failed to search messages for '" + subquery_str + "'");
> +
> +	    for (; notmuch_messages_valid (messages); notmuch_messages_move_to_next (messages)) {
> +		std::string term = thread_prefix;
> +		notmuch_message_t *message;
> +		message = notmuch_messages_get (messages);
> +		term += notmuch_message_get_thread_id (message);
> +		terms.insert (term);
> +	    }
> +	    return Xapian::Query (Xapian::Query::OP_OR, terms.begin (), terms.end ());
> +	} else {
> +	    throw Xapian::QueryParserError ("missing } in '" + str + "'");
> +	}
> +    } else {
> +	/* literal thread id */
> +	std::string term = thread_prefix + str;
> +	return Xapian::Query (term);
> +    }
> +
> +}
> +#endif
> diff --git a/lib/thread-fp.h b/lib/thread-fp.h
> new file mode 100644
> index 00000000..13725978
> --- /dev/null
> +++ b/lib/thread-fp.h
> @@ -0,0 +1,42 @@
> +/* thread-fp.h - thread field processor glue
> + *
> + * This file is part of notmuch.
> + *
> + * Copyright © 2017 David Bremner
> + *
> + * This program is free software: you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation, either version 3 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see https://www.gnu.org/licenses/ .
> + *
> + * Author: David Bremner <david@tethera.net>
> + */
> +
> +#ifndef NOTMUCH_THREAD_FP_H
> +#define NOTMUCH_THREAD_FP_H
> +
> +#include <xapian.h>
> +#include "notmuch.h"
> +
> +#if HAVE_XAPIAN_FIELD_PROCESSOR
> +class ThreadFieldProcessor : public Xapian::FieldProcessor {
> + protected:
> +    Xapian::QueryParser &parser;
> +    notmuch_database_t *notmuch;
> +
> + public:
> +    ThreadFieldProcessor (Xapian::QueryParser &parser_, notmuch_database_t *notmuch_)
> +	: parser(parser_), notmuch(notmuch_) { };
> +
> +    Xapian::Query operator()(const std::string & str);
> +};
> +#endif
> +#endif /* NOTMUCH_THREAD_FP_H */
> diff --git a/test/T585-thread-subquery.sh b/test/T585-thread-subquery.sh
> new file mode 100755
> index 00000000..71ced149
> --- /dev/null
> +++ b/test/T585-thread-subquery.sh
> @@ -0,0 +1,46 @@
> +#!/usr/bin/env bash
> +#
> +# Copyright (c) 2018 David Bremner
> +#
> +
> +test_description='test of searching by using thread subqueries'
> +
> +. $(dirname "$0")/test-lib.sh || exit 1
> +
> +add_email_corpus
> +
> +test_begin_subtest "Basic query that matches no messages"
> +count=$(notmuch count from:keithp and to:keithp)
> +test_expect_equal 0 "$count"
> +
> +test_begin_subtest "Same query against threads"
> +notmuch search thread:{from:keithp} and thread:{to:keithp} | notmuch_search_sanitize > OUTPUT
> +cat<<EOF > EXPECTED
> +thread:XXX   2009-11-18 [7/7] Lars Kellogg-Stedman, Mikhail Gusarov, Keith Packard, Carl Worth; [notmuch] Working with Maildir storage? (inbox signed unread)
> +EOF
> +test_expect_equal_file EXPECTED OUTPUT
> +
> +test_begin_subtest "Mix thread and non-threads query"
> +notmuch search thread:{from:keithp} and to:keithp | notmuch_search_sanitize > OUTPUT
> +cat<<EOF > EXPECTED
> +thread:XXX   2009-11-18 [1/7] Lars Kellogg-Stedman| Mikhail Gusarov, Keith Packard, Carl Worth; [notmuch] Working with Maildir storage? (inbox signed unread)
> +EOF
> +test_expect_equal_file EXPECTED OUTPUT
> +
> +test_begin_subtest "Compound subquery"
> +notmuch search 'thread:"{from:keithp and date:2009}" and thread:{to:keithp}' | notmuch_search_sanitize > OUTPUT
> +cat<<EOF > EXPECTED
> +thread:XXX   2009-11-18 [7/7] Lars Kellogg-Stedman, Mikhail Gusarov, Keith Packard, Carl Worth; [notmuch] Working with Maildir storage? (inbox signed unread)
> +EOF
> +test_expect_equal_file EXPECTED OUTPUT
> +
> +test_begin_subtest "Syntax/quoting error in subquery"
> +notmuch search 'thread:{from:keithp and date:2009} and thread:{to:keithp}' 1>OUTPUT 2>&1
> +cat<<EOF > EXPECTED
> +notmuch search: A Xapian exception occurred
> +A Xapian exception occurred parsing query: missing } in '{from:keithp'
> +Query string was: thread:{from:keithp and date:2009} and thread:{to:keithp}
> +EOF
> +test_expect_equal_file EXPECTED OUTPUT
> +
> +test_done
> -- 
> 2.17.0
>
> _______________________________________________
> notmuch mailing list
> notmuch@notmuchmail.org
> https://notmuchmail.org/mailman/listinfo/notmuch

  reply	other threads:[~2018-05-06 17:59 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-05 16:05 Thread subqueries David Bremner
2018-05-05 16:05 ` [PATCH 1/4] lib: add thread subqueries David Bremner
2018-05-06 17:59   ` Jani Nikula [this message]
2018-05-05 16:05 ` [PATCH 2/4] perf-test: add simple test for " David Bremner
2018-05-05 16:05 ` [PATCH 3/4] lib: define specialized get_thread_id for use in thread subquery David Bremner
2018-05-06 18:03   ` Jani Nikula
2018-05-05 16:05 ` [PATCH 4/4] doc: document thread subqueries David Bremner
2018-05-06 18:05   ` Jani Nikula
2018-05-07 12:09 ` Thread subqueries David Bremner
2018-05-07 12:39   ` Gaute Hope
2018-05-11  8:43   ` Daniel Kahn Gillmor
2018-05-11 10:15     ` David Bremner
2018-05-11 15:41       ` Daniel Kahn Gillmor
2018-05-12 13:24         ` Tomi Ollila
2018-05-12 14:01           ` David Bremner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://notmuchmail.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87d0y8k6mb.fsf@nikula.org \
    --to=jani@nikula.org \
    --cc=david@tethera.net \
    --cc=notmuch@notmuchmail.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://yhetil.org/notmuch.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).