unofficial mirror of notmuch@notmuchmail.org
 help / color / mirror / code / Atom feed
From: David Bremner <david@tethera.net>
To: notmuch@notmuchmail.org
Cc: David Bremner <david@tethera.net>
Subject: [PATCH 4/5] lib: autocommit after some number of completed transactions
Date: Fri, 21 May 2021 22:20:39 -0300	[thread overview]
Message-ID: <20210522012040.1412467-5-david@tethera.net> (raw)
In-Reply-To: <20210522012040.1412467-1-david@tethera.net>

This change addresses two known issues with large sets of changes to
the database.  The first is that as reported by Steven Allen [1],
notmuch commits are not "flushed" when they complete, which means that
if there is an open transaction when the database closes (or e.g. the
program crashes) then all changes since the last commit will be
discarded (nothing is irrecoverably lost for "notmuch new", as the
indexing process just restarts next time it is run).  This does not
really "fix" the issue reported in [1]; that seems rather difficult
given how transactions work in Xapian. On the other hand, with the
default settings, this should mean one only loses less than a minutes
worth of work.  The second issue is the occasionally reported "storm"
of disk writes when notmuch finishes. I don't yet have a test for
this, but I think committing as we go should reduce the amount of work
when finalizing the database.

[1]: id:20151025210215.GA3754@stebalien.com
---
 lib/database-private.h    |  5 +++++
 lib/database.cc           | 18 +++++++++++++-----
 lib/open.cc               | 12 ++++++++++++
 test/T385-transactions.sh |  1 -
 4 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/lib/database-private.h b/lib/database-private.h
index 1a73dacc..9706c17e 100644
--- a/lib/database-private.h
+++ b/lib/database-private.h
@@ -212,6 +212,11 @@ struct _notmuch_database {
     char thread_id_str[17];
     uint64_t last_thread_id;
 
+    /* How many transactions have successfully completed since we last committed */
+    int transaction_count;
+    /* when to commit and reset the counter */
+    int transaction_threshold;
+
     /* error reporting; this value persists only until the
      * next library call. May be NULL */
     char *status_string;
diff --git a/lib/database.cc b/lib/database.cc
index db7feca2..dafea4ce 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -1125,13 +1125,21 @@ notmuch_database_end_atomic (notmuch_database_t *notmuch)
     db = notmuch->writable_xapian_db;
     try {
 	db->commit_transaction ();
-
-	/* This is a hack for testing.  Xapian never flushes on a
-	 * non-flushed commit, even if the flush threshold is 1.
-	 * However, we rely on flushing to test atomicity. */
+	notmuch->transaction_count++;
+
+	/* Xapian never flushes on a non-flushed commit, even if the
+	 * flush threshold is 1.  However, we rely on flushing to test
+	 * atomicity. On the other hand, we can't straight replace
+	 * XAPIAN_FLUSH_THRESHOLD with our autocommit counter, because
+	 * the former also applies outside notmuch atomic
+	 * commits. Hence the follow complicated  test */
 	const char *thresh = getenv ("XAPIAN_FLUSH_THRESHOLD");
-	if (thresh && atoi (thresh) == 1)
+	if ((notmuch->transaction_threshold > 0 &&
+	     notmuch->transaction_count >= notmuch->transaction_threshold) ||
+	    (thresh && atoi (thresh) == 1)) {
 	    db->commit ();
+	    notmuch->transaction_count = 0;
+	}
     } catch (const Xapian::Error &error) {
 	_notmuch_database_log (notmuch, "A Xapian exception occurred committing transaction: %s.\n",
 			       error.get_msg ().c_str ());
diff --git a/lib/open.cc b/lib/open.cc
index 1ca69665..706a23e1 100644
--- a/lib/open.cc
+++ b/lib/open.cc
@@ -256,6 +256,8 @@ _alloc_notmuch ()
     notmuch->writable_xapian_db = NULL;
     notmuch->config_path = NULL;
     notmuch->atomic_nesting = 0;
+    notmuch->transaction_count = 0;
+    notmuch->transaction_threshold = 0;
     notmuch->view = 1;
     return notmuch;
 }
@@ -365,6 +367,8 @@ _finish_open (notmuch_database_t *notmuch,
     notmuch_status_t status = NOTMUCH_STATUS_SUCCESS;
     char *incompat_features;
     char *message = NULL;
+    const char *autocommit_str;
+    char *autocommit_end;
     unsigned int version;
     const char *database_path = notmuch_database_get_path (notmuch);
 
@@ -461,6 +465,14 @@ _finish_open (notmuch_database_t *notmuch,
 	if (status)
 	    goto DONE;
 
+	autocommit_str = notmuch_config_get (notmuch, NOTMUCH_CONFIG_AUTOCOMMIT);
+	if (unlikely (!autocommit_str)) {
+	    INTERNAL_ERROR ("missing configuration for autocommit");
+	}
+	notmuch->transaction_threshold = strtoul (autocommit_str, &autocommit_end, 10);
+	if (*autocommit_end != '\0')
+	    INTERNAL_ERROR ("Malformed database database.autocommit value: %s", autocommit_str);
+
 	status = _notmuch_database_setup_standard_query_fields (notmuch);
 	if (status)
 	    goto DONE;
diff --git a/test/T385-transactions.sh b/test/T385-transactions.sh
index ebfec2ed..d8bb502d 100755
--- a/test/T385-transactions.sh
+++ b/test/T385-transactions.sh
@@ -26,7 +26,6 @@ EOF
 test_expect_equal_file EXPECTED OUTPUT
 
 test_begin_subtest "Some changes saved with open transaction"
-test_subtest_known_broken
 notmuch config set database.autocommit 1000
 rm -r ${MAIL_DIR}/.notmuch
 notmuch_with_shim no-close new
-- 
2.30.2

  parent reply	other threads:[~2021-05-22  1:21 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-22  1:20 Commit after some number of transactions David Bremner
2021-05-22  1:20 ` [PATCH 1/5] database/close: remove misleading code / comment David Bremner
2021-05-22  1:20 ` [PATCH 2/5] lib/config: add NOTMUCH_CONFIG_AUTOCOMMIT David Bremner
2021-05-22  1:20 ` [PATCH 3/5] test: add known broken test for closing with open transaction David Bremner
2021-05-22  1:20 ` David Bremner [this message]
2021-05-22  1:20 ` [PATCH 5/5] doc: document database.autocommit variable David Bremner
2021-05-22 11:10   ` [PATCH] lib: update transaction documentation David Bremner
2021-05-24 17:14 ` Commit after some number of transactions David Bremner
2021-06-27 17:11 ` David Bremner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://notmuchmail.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210522012040.1412467-5-david@tethera.net \
    --to=david@tethera.net \
    --cc=notmuch@notmuchmail.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://yhetil.org/notmuch.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).