From: David Bremner <david@tethera.net>
To: notmuch@notmuchmail.org
Cc: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
Subject: [PATCH 10/10] add "notmuch reindex" subcommand
Date: Fri, 14 Apr 2017 03:14:40 -0000
Message-ID: <20170414025004.5334-11-david@tethera.net>
In-Reply-To: <20170414025004.5334-1-david@tethera.net>
From: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
This new subcommand takes a set of search terms, and re-indexes the
list of matching messages.
---
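The SIGINT handling in notmuch-reindex.c follows the usual flag-and-poll
pattern: the handler only sets a sig_atomic_t flag, and the message loop
checks that flag between iterations. Below is a minimal stand-alone sketch
of that pattern, not the code from this patch. The names
install_sigint_handler and process_items are hypothetical, and the raise()
call only simulates a Ctrl-C arriving while the third item is processed.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t interrupted;

static void
handle_sigint (int sig)
{
    static const char msg[] = "Stopping...\n";
    ssize_t unused_result;

    (void) sig;
    /* write(2) is async-signal-safe; a failed or short write is
     * deliberately ignored so the handler returns quickly. */
    unused_result = write (2, msg, sizeof (msg) - 1);
    (void) unused_result;
    interrupted = 1;
}

/* Hypothetical helper: install the handler the same way the patch does. */
static int
install_sigint_handler (void)
{
    struct sigaction action;

    memset (&action, 0, sizeof (action));
    action.sa_handler = handle_sigint;
    sigemptyset (&action.sa_mask);
    action.sa_flags = SA_RESTART;
    return sigaction (SIGINT, &action, NULL);
}

/* Hypothetical stand-in for the per-message loop: stops as soon as the
 * flag is set and returns how many items were fully processed. */
static int
process_items (int n_items)
{
    int done = 0;

    for (int i = 0; i < n_items && ! interrupted; i++) {
	if (i == 2)
	    raise (SIGINT);	/* simulate Ctrl-C during the third item */
	done++;
    }
    return done;
}
```

A caller would run install_sigint_handler () once and then treat a set
'interrupted' flag after process_items () as a failure exit, as the command
in this patch does.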
Makefile.local | 1 +
doc/conf.py | 4 ++
doc/index.rst | 1 +
doc/man1/notmuch-reindex.rst | 29 +++++++++
doc/man1/notmuch.rst | 4 +-
doc/man7/notmuch-search-terms.rst | 7 +-
notmuch-client.h | 3 +
notmuch-reindex.c | 134 ++++++++++++++++++++++++++++++++++++++
notmuch.c | 2 +
performance-test/M04-reindex.sh | 11 ++++
performance-test/T03-reindex.sh | 13 ++++
test/T670-duplicate-mid.sh | 7 ++
test/T700-reindex.sh | 60 +++++++++++++++++
13 files changed, 272 insertions(+), 4 deletions(-)
create mode 100644 doc/man1/notmuch-reindex.rst
create mode 100644 notmuch-reindex.c
create mode 100755 performance-test/M04-reindex.sh
create mode 100755 performance-test/T03-reindex.sh
create mode 100755 test/T700-reindex.sh
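reindex_query wraps the per-message work in a begin/end atomic pair and
stops at the first failing message. Below is a stand-alone sketch of that
control flow with every status checked. All names here (status_t,
begin_atomic, end_atomic, reindex_one, reindex_all) are hypothetical stubs
for illustration, not libnotmuch API.

```c
typedef enum { STATUS_SUCCESS = 0, STATUS_FAILURE = 1 } status_t;

/* Hypothetical stubs standing in for the database calls. */
static int atomic_depth;

static status_t
begin_atomic (void)
{
    atomic_depth++;
    return STATUS_SUCCESS;
}

static status_t
end_atomic (void)
{
    atomic_depth--;
    return STATUS_SUCCESS;
}

/* Pretend that the item numbered 'fail_at' cannot be reindexed. */
static status_t
reindex_one (int id, int fail_at)
{
    return id == fail_at ? STATUS_FAILURE : STATUS_SUCCESS;
}

/* Process items 0..n-1 inside one atomic section. The loop is skipped
 * if the section could not be opened, and the section is only closed
 * when every step succeeded. */
static status_t
reindex_all (int n, int fail_at, int *n_done)
{
    status_t ret = begin_atomic ();

    *n_done = 0;
    for (int id = 0; ret == STATUS_SUCCESS && id < n; id++) {
	ret = reindex_one (id, fail_at);
	if (ret == STATUS_SUCCESS)
	    (*n_done)++;
    }

    if (ret == STATUS_SUCCESS)
	ret = end_atomic ();

    return ret;
}
```

In this sketch a failure leaves the atomic section open; whether that is the
right behaviour for the real database is a design choice the sketch does not
settle.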
diff --git a/Makefile.local b/Makefile.local
index 03eafaaa..c6e272bc 100644
--- a/Makefile.local
+++ b/Makefile.local
@@ -222,6 +222,7 @@ notmuch_client_srcs = \
notmuch-dump.c \
notmuch-insert.c \
notmuch-new.c \
+ notmuch-reindex.c \
notmuch-reply.c \
notmuch-restore.c \
notmuch-search.c \
diff --git a/doc/conf.py b/doc/conf.py
index a3d82696..aa864b3c 100644
--- a/doc/conf.py
+++ b/doc/conf.py
@@ -95,6 +95,10 @@ man_pages = [
u'incorporate new mail into the notmuch database',
[notmuch_authors], 1),
+ ('man1/notmuch-reindex', 'notmuch-reindex',
+ u're-index matching messages',
+ [notmuch_authors], 1),
+
('man1/notmuch-reply', 'notmuch-reply',
u'constructs a reply template for a set of messages',
[notmuch_authors], 1),
diff --git a/doc/index.rst b/doc/index.rst
index 344606d9..aa6c9f40 100644
--- a/doc/index.rst
+++ b/doc/index.rst
@@ -18,6 +18,7 @@ Contents:
man5/notmuch-hooks
man1/notmuch-insert
man1/notmuch-new
+ man1/notmuch-reindex
man1/notmuch-reply
man1/notmuch-restore
man1/notmuch-search
diff --git a/doc/man1/notmuch-reindex.rst b/doc/man1/notmuch-reindex.rst
new file mode 100644
index 00000000..e39cc4ee
--- /dev/null
+++ b/doc/man1/notmuch-reindex.rst
@@ -0,0 +1,29 @@
+===============
+notmuch-reindex
+===============
+
+SYNOPSIS
+========
+
+**notmuch** **reindex** [*option* ...] <*search-term*> ...
+
+DESCRIPTION
+===========
+
+Re-index all messages matching the search terms.
+
+See **notmuch-search-terms(7)** for details of the supported syntax for
+<*search-term*\ >.
+
+The **reindex** command searches for all messages matching the
+supplied search terms, and re-creates the full-text index on these
+messages using the supplied options.
+
+SEE ALSO
+========
+
+**notmuch(1)**, **notmuch-config(1)**, **notmuch-count(1)**,
+**notmuch-dump(1)**, **notmuch-hooks(5)**, **notmuch-insert(1)**,
+**notmuch-new(1)**,
+**notmuch-reply(1)**, **notmuch-restore(1)**, **notmuch-search(1)**,
+**notmuch-search-terms(7)**, **notmuch-show(1)**, **notmuch-tag(1)**
diff --git a/doc/man1/notmuch.rst b/doc/man1/notmuch.rst
index fbd7f381..b2a8376e 100644
--- a/doc/man1/notmuch.rst
+++ b/doc/man1/notmuch.rst
@@ -149,8 +149,8 @@ SEE ALSO
**notmuch-address(1)**, **notmuch-compact(1)**, **notmuch-config(1)**,
**notmuch-count(1)**, **notmuch-dump(1)**, **notmuch-hooks(5)**,
-**notmuch-insert(1)**, **notmuch-new(1)**, **notmuch-reply(1)**,
-**notmuch-restore(1)**, **notmuch-search(1)**,
+**notmuch-insert(1)**, **notmuch-new(1)**, **notmuch-reindex(1)**,
+**notmuch-reply(1)**, **notmuch-restore(1)**, **notmuch-search(1)**,
**notmuch-search-terms(7)**, **notmuch-show(1)**, **notmuch-tag(1)**
The notmuch website: **https://notmuchmail.org**
diff --git a/doc/man7/notmuch-search-terms.rst b/doc/man7/notmuch-search-terms.rst
index 47cab48d..dd76972e 100644
--- a/doc/man7/notmuch-search-terms.rst
+++ b/doc/man7/notmuch-search-terms.rst
@@ -9,6 +9,8 @@ SYNOPSIS
**notmuch** **dump** [--format=(batch-tag|sup)] [--] [--output=<*file*>] [--] [<*search-term*> ...]
+**notmuch** **reindex** [option ...] <*search-term*> ...
+
**notmuch** **search** [option ...] <*search-term*> ...
**notmuch** **show** [option ...] <*search-term*> ...
@@ -421,5 +423,6 @@ SEE ALSO
**notmuch(1)**, **notmuch-config(1)**, **notmuch-count(1)**,
**notmuch-dump(1)**, **notmuch-hooks(5)**, **notmuch-insert(1)**,
-**notmuch-new(1)**, **notmuch-reply(1)**, **notmuch-restore(1)**,
-**notmuch-search(1)**, **notmuch-show(1)**, **notmuch-tag(1)**
+**notmuch-new(1)**, **notmuch-reindex(1)**, **notmuch-reply(1)**,
+**notmuch-restore(1)**, **notmuch-search(1)**, **notmuch-show(1)**,
+**notmuch-tag(1)**
diff --git a/notmuch-client.h b/notmuch-client.h
index a6f70eae..ab7138c6 100644
--- a/notmuch-client.h
+++ b/notmuch-client.h
@@ -196,6 +196,9 @@ int
notmuch_insert_command (notmuch_config_t *config, int argc, char *argv[]);
int
+notmuch_reindex_command (notmuch_config_t *config, int argc, char *argv[]);
+
+int
notmuch_reply_command (notmuch_config_t *config, int argc, char *argv[]);
int
diff --git a/notmuch-reindex.c b/notmuch-reindex.c
new file mode 100644
index 00000000..44223042
--- /dev/null
+++ b/notmuch-reindex.c
@@ -0,0 +1,134 @@
+/* notmuch - Not much of an email program, (just index and search)
+ *
+ * Copyright © 2016 Daniel Kahn Gillmor
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 3 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see http://www.gnu.org/licenses/ .
+ *
+ * Author: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
+ */
+
+#include "notmuch-client.h"
+#include "string-util.h"
+
+static volatile sig_atomic_t interrupted;
+
+static void
+handle_sigint (unused (int sig))
+{
+ static char msg[] = "Stopping... \n";
+
+ /* This write is "opportunistic", so it's okay to ignore the
+ * result. It is not required for correctness, and if it does
+ * fail or produce a short write, we want to get out of the signal
+ * handler as quickly as possible, not retry it. */
+ IGNORE_RESULT (write (2, msg, sizeof (msg) - 1));
+ interrupted = 1;
+}
+
+/* reindex all messages matching 'query_string' using the passed-in indexopts
+ */
+static int
+reindex_query (notmuch_database_t *notmuch, const char *query_string,
+ notmuch_param_t *indexopts)
+{
+ notmuch_query_t *query;
+ notmuch_messages_t *messages;
+ notmuch_message_t *message;
+ notmuch_status_t status;
+
+ notmuch_status_t ret = NOTMUCH_STATUS_SUCCESS;
+
+ query = notmuch_query_create (notmuch, query_string);
+ if (query == NULL) {
+ fprintf (stderr, "Out of memory.\n");
+ return 1;
+ }
+
+ /* reindexing is not interested in any special sort order */
+ notmuch_query_set_sort (query, NOTMUCH_SORT_UNSORTED);
+
+ status = notmuch_query_search_messages (query, &messages);
+ if (print_status_query ("notmuch reindex", query, status))
+ return status;
+
+ ret = notmuch_database_begin_atomic (notmuch);
+ for (;
+ ! ret && notmuch_messages_valid (messages) && ! interrupted;
+ notmuch_messages_move_to_next (messages)) {
+ message = notmuch_messages_get (messages);
+
+ ret = notmuch_message_reindex (message, indexopts);
+ if (ret != NOTMUCH_STATUS_SUCCESS)
+ break;
+ }
+
+ if (! ret)
+ ret = notmuch_database_end_atomic (notmuch);
+
+ notmuch_query_destroy (query);
+
+ return ret || interrupted;
+}
+
+int
+notmuch_reindex_command (notmuch_config_t *config, int argc, char *argv[])
+{
+ char *query_string = NULL;
+ notmuch_database_t *notmuch;
+ struct sigaction action;
+ int opt_index;
+ int ret;
+ notmuch_param_t *indexopts = NULL;
+
+ /* Set up our handler for SIGINT */
+ memset (&action, 0, sizeof (struct sigaction));
+ action.sa_handler = handle_sigint;
+ sigemptyset (&action.sa_mask);
+ action.sa_flags = SA_RESTART;
+ sigaction (SIGINT, &action, NULL);
+
+ notmuch_opt_desc_t options[] = {
+ { NOTMUCH_OPT_INHERIT, (void *) &notmuch_shared_options, NULL, 0, 0 },
+ { 0, 0, 0, 0, 0 }
+ };
+
+ opt_index = parse_arguments (argc, argv, options, 1);
+ if (opt_index < 0)
+ return EXIT_FAILURE;
+
+ notmuch_process_shared_options (argv[0]);
+
+ if (notmuch_database_open (notmuch_config_get_database_path (config),
+ NOTMUCH_DATABASE_MODE_READ_WRITE, &notmuch))
+ return EXIT_FAILURE;
+
+ notmuch_exit_if_unmatched_db_uuid (notmuch);
+
+ query_string = query_string_from_args (config, argc-opt_index, argv+opt_index);
+ if (query_string == NULL) {
+ fprintf (stderr, "Out of memory\n");
+ return EXIT_FAILURE;
+ }
+
+ if (*query_string == '\0') {
+ fprintf (stderr, "Error: notmuch reindex requires at least one search term.\n");
+ return EXIT_FAILURE;
+ }
+
+ ret = reindex_query (notmuch, query_string, indexopts);
+
+ notmuch_database_destroy (notmuch);
+
+ return ret || interrupted ? EXIT_FAILURE : EXIT_SUCCESS;
+}
diff --git a/notmuch.c b/notmuch.c
index 8e332ce6..201c7454 100644
--- a/notmuch.c
+++ b/notmuch.c
@@ -123,6 +123,8 @@ static command_t commands[] = {
"Restore the tags from the given dump file (see 'dump')." },
{ "compact", notmuch_compact_command, NOTMUCH_CONFIG_OPEN,
"Compact the notmuch database." },
+ { "reindex", notmuch_reindex_command, NOTMUCH_CONFIG_OPEN,
+ "Re-index all messages matching the search terms." },
{ "config", notmuch_config_command, NOTMUCH_CONFIG_OPEN,
"Get or set settings in the notmuch configuration file." },
{ "help", notmuch_help_command, NOTMUCH_CONFIG_CREATE, /* create but don't save config */
diff --git a/performance-test/M04-reindex.sh b/performance-test/M04-reindex.sh
new file mode 100755
index 00000000..d36e061b
--- /dev/null
+++ b/performance-test/M04-reindex.sh
@@ -0,0 +1,11 @@
+#!/bin/bash
+
+test_description='reindex'
+
+. ./perf-test-lib.sh || exit 1
+
+memory_start
+
+memory_run 'reindex *' "notmuch reindex '*'"
+
+memory_done
diff --git a/performance-test/T03-reindex.sh b/performance-test/T03-reindex.sh
new file mode 100755
index 00000000..7af2d22d
--- /dev/null
+++ b/performance-test/T03-reindex.sh
@@ -0,0 +1,13 @@
+#!/bin/bash
+
+test_description='reindex'
+
+. ./perf-test-lib.sh || exit 1
+
+time_start
+
+time_run 'reindex *' "notmuch reindex '*'"
+time_run 'reindex *' "notmuch reindex '*'"
+time_run 'reindex *' "notmuch reindex '*'"
+
+time_done
diff --git a/test/T670-duplicate-mid.sh b/test/T670-duplicate-mid.sh
index f1952555..da5ce5d4 100755
--- a/test/T670-duplicate-mid.sh
+++ b/test/T670-duplicate-mid.sh
@@ -23,4 +23,11 @@ EOF
notmuch search --output=files "sekrit" | notmuch_dir_sanitize > OUTPUT
test_expect_equal_file EXPECTED OUTPUT
+rm ${MAIL_DIR}/copy3
+test_begin_subtest 'reindex drops terms in duplicate file'
+cp /dev/null EXPECTED
+notmuch reindex '*'
+notmuch search --output=files "sekrit" | notmuch_dir_sanitize > OUTPUT
+test_expect_equal_file EXPECTED OUTPUT
+
test_done
diff --git a/test/T700-reindex.sh b/test/T700-reindex.sh
new file mode 100755
index 00000000..7f2af9c2
--- /dev/null
+++ b/test/T700-reindex.sh
@@ -0,0 +1,60 @@
+#!/usr/bin/env bash
+test_description='reindexing messages'
+. ./test-lib.sh || exit 1
+
+add_email_corpus
+
+notmuch tag +usertag1 '*'
+
+notmuch search '*' | notmuch_search_sanitize > initial-threads
+notmuch search --output=messages '*' > initial-message-ids
+notmuch dump > initial-dump
+
+test_begin_subtest 'reindex preserves threads'
+notmuch reindex '*'
+notmuch search '*' | notmuch_search_sanitize > OUTPUT
+test_expect_equal_file initial-threads OUTPUT
+
+test_begin_subtest 'reindex after removing duplicate file preserves threads'
+# remove one copy
+sed 's,3/3(4),3/3,' < initial-threads > EXPECTED
+mv $MAIL_DIR/bar/18:2, duplicate-msg-1.eml
+notmuch reindex '*'
+notmuch search '*' | notmuch_search_sanitize > OUTPUT
+test_expect_equal_file EXPECTED OUTPUT
+
+test_begin_subtest 'reindex preserves message-ids'
+notmuch reindex '*'
+notmuch search --output=messages '*' > OUTPUT
+test_expect_equal_file initial-message-ids OUTPUT
+
+test_begin_subtest 'reindex preserves tags'
+notmuch reindex '*'
+notmuch dump > OUTPUT
+test_expect_equal_file initial-dump OUTPUT
+
+test_begin_subtest 'reindex moves a message between threads'
+notmuch search --output=threads id:87iqd9rn3l.fsf@vertex.dottedmag > EXPECTED
+# re-parent
+sed -i 's/1258471718-6781-1-git-send-email-dottedmag@dottedmag.net/87iqd9rn3l.fsf@vertex.dottedmag/' $MAIL_DIR/02:2,*
+notmuch reindex id:1258471718-6781-2-git-send-email-dottedmag@dottedmag.net
+notmuch search --output=threads id:1258471718-6781-2-git-send-email-dottedmag@dottedmag.net > OUTPUT
+test_expect_equal_file EXPECTED OUTPUT
+
+test_begin_subtest 'reindex detects removal of all files'
+notmuch search --output=messages not id:20091117232137.GA7669@griffis1.net > EXPECTED
+# remove both copies
+mv $MAIL_DIR/cur/51:2,* duplicate-message-2.eml
+notmuch reindex id:20091117232137.GA7669@griffis1.net
+notmuch search --output=messages '*' > OUTPUT
+test_expect_equal_file EXPECTED OUTPUT
+
+add_email_corpus lkml
+
+test_begin_subtest "reindex of lkml corpus preserves threads"
+notmuch search '*' | notmuch_search_sanitize > EXPECTED
+notmuch reindex '*'
+notmuch search '*' | notmuch_search_sanitize > OUTPUT
+test_expect_equal_file EXPECTED OUTPUT
+
+test_done
--
2.11.0