* [PATCH 1/5] database/close: remove misleading code / comment
2021-05-22 1:20 Commit after some number of transactions David Bremner
@ 2021-05-22 1:20 ` David Bremner
2021-05-22 1:20 ` [PATCH 2/5] lib/config: add NOTMUCH_CONFIG_AUTOCOMMIT David Bremner
` (5 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: David Bremner @ 2021-05-22 1:20 UTC (permalink / raw)
To: notmuch; +Cc: David Bremner
Unfortunately, it doesn't make a difference if we call
cancel_transaction or not, all uncommited changes are discarded if
there is an open (unflushed) transaction.
---
lib/database.cc | 12 ++----------
1 file changed, 2 insertions(+), 10 deletions(-)
diff --git a/lib/database.cc b/lib/database.cc
index 96458f6f..db7feca2 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -502,17 +502,9 @@ notmuch_database_close (notmuch_database_t *notmuch)
* close it. Thus, we explicitly close it here. */
if (notmuch->open) {
try {
- /* If there's an outstanding transaction, it's unclear if
- * closing the Xapian database commits everything up to
- * that transaction, or may discard committed (but
- * unflushed) transactions. To be certain, explicitly
- * cancel any outstanding transaction before closing. */
- if (_notmuch_database_mode (notmuch) == NOTMUCH_DATABASE_MODE_READ_WRITE &&
- notmuch->atomic_nesting)
- notmuch->writable_xapian_db->cancel_transaction ();
-
/* Close the database. This implicitly flushes
- * outstanding changes. */
+ * outstanding changes. If there is an open (non-flushed)
+ * transaction, ALL pending changes will be discarded */
notmuch->xapian_db->close ();
} catch (const Xapian::Error &error) {
status = NOTMUCH_STATUS_XAPIAN_EXCEPTION;
--
2.30.2
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 2/5] lib/config: add NOTMUCH_CONFIG_AUTOCOMMIT
2021-05-22 1:20 Commit after some number of transactions David Bremner
2021-05-22 1:20 ` [PATCH 1/5] database/close: remove misleading code / comment David Bremner
@ 2021-05-22 1:20 ` David Bremner
2021-05-22 1:20 ` [PATCH 3/5] test: add known broken test for closing with open transaction David Bremner
` (4 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: David Bremner @ 2021-05-22 1:20 UTC (permalink / raw)
To: notmuch; +Cc: David Bremner
This will be used to control how often atomic transactions are
committed.
---
lib/config.cc | 4 ++++
lib/notmuch.h | 1 +
test/T030-config.sh | 1 +
test/T055-path-config.sh | 1 +
test/T590-libconfig.sh | 4 ++++
5 files changed, 11 insertions(+)
diff --git a/lib/config.cc b/lib/config.cc
index 0ec66372..13eab5f1 100644
--- a/lib/config.cc
+++ b/lib/config.cc
@@ -593,6 +593,8 @@ _notmuch_config_key_to_string (notmuch_config_key_t key)
return "user.other_email";
case NOTMUCH_CONFIG_USER_NAME:
return "user.name";
+ case NOTMUCH_CONFIG_AUTOCOMMIT:
+ return "database.autocommit";
default:
return NULL;
}
@@ -638,6 +640,8 @@ _notmuch_config_default (notmuch_database_t *notmuch, notmuch_config_key_t key)
return email;
case NOTMUCH_CONFIG_NEW_IGNORE:
return "";
+ case NOTMUCH_CONFIG_AUTOCOMMIT:
+ return "8000";
case NOTMUCH_CONFIG_HOOK_DIR:
case NOTMUCH_CONFIG_BACKUP_DIR:
case NOTMUCH_CONFIG_OTHER_EMAIL:
diff --git a/lib/notmuch.h b/lib/notmuch.h
index 4b053932..5c3be342 100644
--- a/lib/notmuch.h
+++ b/lib/notmuch.h
@@ -2520,6 +2520,7 @@ typedef enum _notmuch_config_key {
NOTMUCH_CONFIG_PRIMARY_EMAIL,
NOTMUCH_CONFIG_OTHER_EMAIL,
NOTMUCH_CONFIG_USER_NAME,
+ NOTMUCH_CONFIG_AUTOCOMMIT,
NOTMUCH_CONFIG_LAST
} notmuch_config_key_t;
diff --git a/test/T030-config.sh b/test/T030-config.sh
index 7a1660e9..751feaf3 100755
--- a/test/T030-config.sh
+++ b/test/T030-config.sh
@@ -51,6 +51,7 @@ cat <<EOF > EXPECTED
built_with.compact=something
built_with.field_processor=something
built_with.retry_lock=something
+database.autocommit=8000
database.mail_root=MAIL_DIR
database.path=MAIL_DIR
foo.list=this;is another;list value;
diff --git a/test/T055-path-config.sh b/test/T055-path-config.sh
index 8ef76aed..bb3bf665 100755
--- a/test/T055-path-config.sh
+++ b/test/T055-path-config.sh
@@ -259,6 +259,7 @@ EOF
built_with.compact=true
built_with.field_processor=true
built_with.retry_lock=true
+database.autocommit=8000
database.backup_dir
database.hook_dir
database.mail_root=MAIL_DIR
diff --git a/test/T590-libconfig.sh b/test/T590-libconfig.sh
index 745e1bb4..d922c3af 100755
--- a/test/T590-libconfig.sh
+++ b/test/T590-libconfig.sh
@@ -400,6 +400,7 @@ true
USERNAME@FQDN
NULL
USER_FULL_NAME
+8000
== stderr ==
EOF
unset MAILDIR
@@ -711,6 +712,7 @@ true
test_suite@notmuchmail.org
test_suite_other@notmuchmail.org;test_suite@otherdomain.org
Notmuch Test Suite
+8000
== stderr ==
EOF
test_expect_equal_file EXPECTED OUTPUT
@@ -742,6 +744,7 @@ true
USERNAME@FQDN
NULL
USER_FULL_NAME
+8000
== stderr ==
EOF
test_expect_equal_file EXPECTED OUTPUT.clean
@@ -808,6 +811,7 @@ EOF
cat <<'EOF' >EXPECTED
== stdout ==
aaabefore beforeval
+database.autocommit 8000
database.backup_dir MAIL_DIR/.notmuch/backups
database.hook_dir MAIL_DIR/.notmuch/hooks
database.mail_root MAIL_DIR
--
2.30.2
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 3/5] test: add known broken test for closing with open transaction
2021-05-22 1:20 Commit after some number of transactions David Bremner
2021-05-22 1:20 ` [PATCH 1/5] database/close: remove misleading code / comment David Bremner
2021-05-22 1:20 ` [PATCH 2/5] lib/config: add NOTMUCH_CONFIG_AUTOCOMMIT David Bremner
@ 2021-05-22 1:20 ` David Bremner
2021-05-22 1:20 ` [PATCH 4/5] lib: autocommit after some number of completed transactions David Bremner
` (3 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: David Bremner @ 2021-05-22 1:20 UTC (permalink / raw)
To: notmuch; +Cc: David Bremner
The expected output may need adjusting, but what is clear is that
saving none of the changes is not desirable.
---
test/T385-transactions.sh | 36 ++++++++++++++++++++++++++++++++++++
1 file changed, 36 insertions(+)
create mode 100755 test/T385-transactions.sh
diff --git a/test/T385-transactions.sh b/test/T385-transactions.sh
new file mode 100755
index 00000000..ebfec2ed
--- /dev/null
+++ b/test/T385-transactions.sh
@@ -0,0 +1,36 @@
+#!/usr/bin/env bash
+test_description='transactions'
+. $(dirname "$0")/test-lib.sh || exit 1
+
+make_shim no-close <<EOF
+#include <notmuch.h>
+#include <stdio.h>
+notmuch_status_t
+notmuch_database_close (notmuch_database_t *notmuch)
+{
+ return notmuch_database_begin_atomic (notmuch);
+}
+EOF
+
+for i in `seq 1 1024`
+do
+ generate_message '[subject]="'"subject $i"'"' \
+ '[body]="'"body $i"'"'
+done
+
+test_begin_subtest "initial new"
+NOTMUCH_NEW > OUTPUT
+cat <<EOF > EXPECTED
+Added 1024 new messages to the database.
+EOF
+test_expect_equal_file EXPECTED OUTPUT
+
+test_begin_subtest "Some changes saved with open transaction"
+test_subtest_known_broken
+notmuch config set database.autocommit 1000
+rm -r ${MAIL_DIR}/.notmuch
+notmuch_with_shim no-close new
+output=$(notmuch count '*')
+test_expect_equal "$output" "1000"
+
+test_done
--
2.30.2
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 4/5] lib: autocommit after some number of completed transactions
2021-05-22 1:20 Commit after some number of transactions David Bremner
` (2 preceding siblings ...)
2021-05-22 1:20 ` [PATCH 3/5] test: add known broken test for closing with open transaction David Bremner
@ 2021-05-22 1:20 ` David Bremner
2021-05-22 1:20 ` [PATCH 5/5] doc: document database.autocommit variable David Bremner
` (2 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: David Bremner @ 2021-05-22 1:20 UTC (permalink / raw)
To: notmuch; +Cc: David Bremner
This change addresses two known issues with large sets of changes to
the database. The first is that as reported by Steven Allen [1],
notmuch commits are not "flushed" when they complete, which means that
if there is an open transaction when the database closes (or e.g. the
program crashes) then all changes since the last commit will be
discarded (nothing is irrecoverably lost for "notmuch new", as the
indexing process just restarts next time it is run). This does not
really "fix" the issue reported in [1]; that seems rather difficult
given how transactions work in Xapian. On the other hand, with the
default settings, this should mean one only loses less than a minutes
worth of work. The second issue is the occasionally reported "storm"
of disk writes when notmuch finishes. I don't yet have a test for
this, but I think committing as we go should reduce the amount of work
when finalizing the database.
[1]: id:20151025210215.GA3754@stebalien.com
---
lib/database-private.h | 5 +++++
lib/database.cc | 18 +++++++++++++-----
lib/open.cc | 12 ++++++++++++
test/T385-transactions.sh | 1 -
4 files changed, 30 insertions(+), 6 deletions(-)
diff --git a/lib/database-private.h b/lib/database-private.h
index 1a73dacc..9706c17e 100644
--- a/lib/database-private.h
+++ b/lib/database-private.h
@@ -212,6 +212,11 @@ struct _notmuch_database {
char thread_id_str[17];
uint64_t last_thread_id;
+ /* How many transactions have successfully completed since we last committed */
+ int transaction_count;
+ /* when to commit and reset the counter */
+ int transaction_threshold;
+
/* error reporting; this value persists only until the
* next library call. May be NULL */
char *status_string;
diff --git a/lib/database.cc b/lib/database.cc
index db7feca2..dafea4ce 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -1125,13 +1125,21 @@ notmuch_database_end_atomic (notmuch_database_t *notmuch)
db = notmuch->writable_xapian_db;
try {
db->commit_transaction ();
-
- /* This is a hack for testing. Xapian never flushes on a
- * non-flushed commit, even if the flush threshold is 1.
- * However, we rely on flushing to test atomicity. */
+ notmuch->transaction_count++;
+
+ /* Xapian never flushes on a non-flushed commit, even if the
+ * flush threshold is 1. However, we rely on flushing to test
+ * atomicity. On the other hand, we can't straight replace
+ * XAPIAN_FLUSH_THRESHOLD with our autocommit counter, because
+ * the former also applies outside notmuch atomic
+ * commits. Hence the follow complicated test */
const char *thresh = getenv ("XAPIAN_FLUSH_THRESHOLD");
- if (thresh && atoi (thresh) == 1)
+ if ((notmuch->transaction_threshold > 0 &&
+ notmuch->transaction_count >= notmuch->transaction_threshold) ||
+ (thresh && atoi (thresh) == 1)) {
db->commit ();
+ notmuch->transaction_count = 0;
+ }
} catch (const Xapian::Error &error) {
_notmuch_database_log (notmuch, "A Xapian exception occurred committing transaction: %s.\n",
error.get_msg ().c_str ());
diff --git a/lib/open.cc b/lib/open.cc
index 1ca69665..706a23e1 100644
--- a/lib/open.cc
+++ b/lib/open.cc
@@ -256,6 +256,8 @@ _alloc_notmuch ()
notmuch->writable_xapian_db = NULL;
notmuch->config_path = NULL;
notmuch->atomic_nesting = 0;
+ notmuch->transaction_count = 0;
+ notmuch->transaction_threshold = 0;
notmuch->view = 1;
return notmuch;
}
@@ -365,6 +367,8 @@ _finish_open (notmuch_database_t *notmuch,
notmuch_status_t status = NOTMUCH_STATUS_SUCCESS;
char *incompat_features;
char *message = NULL;
+ const char *autocommit_str;
+ char *autocommit_end;
unsigned int version;
const char *database_path = notmuch_database_get_path (notmuch);
@@ -461,6 +465,14 @@ _finish_open (notmuch_database_t *notmuch,
if (status)
goto DONE;
+ autocommit_str = notmuch_config_get (notmuch, NOTMUCH_CONFIG_AUTOCOMMIT);
+ if (unlikely (!autocommit_str)) {
+ INTERNAL_ERROR ("missing configuration for autocommit");
+ }
+ notmuch->transaction_threshold = strtoul (autocommit_str, &autocommit_end, 10);
+ if (*autocommit_end != '\0')
+ INTERNAL_ERROR ("Malformed database database.autocommit value: %s", autocommit_str);
+
status = _notmuch_database_setup_standard_query_fields (notmuch);
if (status)
goto DONE;
diff --git a/test/T385-transactions.sh b/test/T385-transactions.sh
index ebfec2ed..d8bb502d 100755
--- a/test/T385-transactions.sh
+++ b/test/T385-transactions.sh
@@ -26,7 +26,6 @@ EOF
test_expect_equal_file EXPECTED OUTPUT
test_begin_subtest "Some changes saved with open transaction"
-test_subtest_known_broken
notmuch config set database.autocommit 1000
rm -r ${MAIL_DIR}/.notmuch
notmuch_with_shim no-close new
--
2.30.2
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 5/5] doc: document database.autocommit variable
2021-05-22 1:20 Commit after some number of transactions David Bremner
` (3 preceding siblings ...)
2021-05-22 1:20 ` [PATCH 4/5] lib: autocommit after some number of completed transactions David Bremner
@ 2021-05-22 1:20 ` David Bremner
2021-05-22 11:10 ` [PATCH] lib: update transaction documentation David Bremner
2021-05-24 17:14 ` Commit after some number of transactions David Bremner
2021-06-27 17:11 ` David Bremner
6 siblings, 1 reply; 9+ messages in thread
From: David Bremner @ 2021-05-22 1:20 UTC (permalink / raw)
To: notmuch; +Cc: David Bremner
This exposes some database internals that most users will probably not
understand.
---
doc/man1/notmuch-config.rst | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/doc/man1/notmuch-config.rst b/doc/man1/notmuch-config.rst
index 75c59ff9..7d709bfb 100644
--- a/doc/man1/notmuch-config.rst
+++ b/doc/man1/notmuch-config.rst
@@ -78,6 +78,16 @@ paths are presumed relative to `$HOME` for items in section
Directory containing hooks run by notmuch commands. See
**notmuch-hooks(5)**.
+ History: this configuration value was introduced in notmuch 0.32.
+
+**database.autocommit**
+
+ How often to commit transactions to disk. `0` means wait until
+ command completes, otherwise an integer `n` specifies to commit to
+ disk after every `n` completed transactions.
+
+ History: this configuration value was introduced in notmuch 0.33.
+
**user.name**
Your full name.
--
2.30.2
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH] lib: update transaction documentation
2021-05-22 1:20 ` [PATCH 5/5] doc: document database.autocommit variable David Bremner
@ 2021-05-22 11:10 ` David Bremner
0 siblings, 0 replies; 9+ messages in thread
From: David Bremner @ 2021-05-22 11:10 UTC (permalink / raw)
To: David Bremner, notmuch
Partly this is to recognize the semantics we inherit from Xapian,
partly to mention the new autocommit feature.
---
lib/notmuch.h | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/lib/notmuch.h b/lib/notmuch.h
index 5c3be342..3b28bea3 100644
--- a/lib/notmuch.h
+++ b/lib/notmuch.h
@@ -529,11 +529,11 @@ notmuch_database_status_string (const notmuch_database_t *notmuch);
* have no effect.
*
* For writable databases, notmuch_database_close commits all changes
- * to disk before closing the database. If the caller is currently in
- * an atomic section (there was a notmuch_database_begin_atomic
- * without a matching notmuch_database_end_atomic), this will discard
- * changes made in that atomic section (but still commit changes made
- * prior to entering the atomic section).
+ * to disk before closing the database, unless the caller is currently
+ * in an atomic section (there was a notmuch_database_begin_atomic
+ * without a matching notmuch_database_end_atomic). In this case
+ * changes since the last commit are discarded. @see
+ * notmuch_database_end_atomic for more information.
*
* Return value:
*
@@ -670,7 +670,10 @@ notmuch_status_t
notmuch_database_begin_atomic (notmuch_database_t *notmuch);
/**
- * Indicate the end of an atomic database operation.
+ * Indicate the end of an atomic database operation. If repeated
+ * (with matching notmuch_database_begin_atomic) "database.autocommit"
+ * times, commit the the transaction and all previous (non-cancelled)
+ * transactions to the database.
*
* Return value:
*
--
2.30.2
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: Commit after some number of transactions
2021-05-22 1:20 Commit after some number of transactions David Bremner
` (4 preceding siblings ...)
2021-05-22 1:20 ` [PATCH 5/5] doc: document database.autocommit variable David Bremner
@ 2021-05-24 17:14 ` David Bremner
2021-06-27 17:11 ` David Bremner
6 siblings, 0 replies; 9+ messages in thread
From: David Bremner @ 2021-05-24 17:14 UTC (permalink / raw)
To: notmuch
David Bremner <david@tethera.net> writes:
> The main rational is explained in the commit message to
>
> [PATCH 4/5] lib: autocommit after some number of completed
>
> I'm not super-happy with the documentation in [5/5], as it explains
> things in terms of database concepts the user shouldn't really need to
> understand.
>
> [PATCH 5/5] doc: document database.autocommit variable
>
> The default value of 8000 was chose not to cause any noticable
> slowdown when indexing the "large" corpus of about 200k messages. The
> test machine is a recent Xeon with fast spinning rust drives; the
> whole index takes about 8.5 minutes on this machine. I'd be curious if
> other people notice a performance impact. On the same machine the
> threshold of 8000 means that less than 30 seconds worth of work would
> be discarded.
An alternate approach would be to expose Xapian's commit() method as
notmuch_database_commit(). That would not really conflict with this
series, but arguably makes it unnecessary, if the clients (e.g. the
notmuch CLI) add appropriate calls to notmuch_database_commit.
I'm not 100% sure, but I think I may want to add notmuch_database_commit
in any case, for some other cases where automatic committing doesn't
work so well.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Commit after some number of transactions
2021-05-22 1:20 Commit after some number of transactions David Bremner
` (5 preceding siblings ...)
2021-05-24 17:14 ` Commit after some number of transactions David Bremner
@ 2021-06-27 17:11 ` David Bremner
6 siblings, 0 replies; 9+ messages in thread
From: David Bremner @ 2021-06-27 17:11 UTC (permalink / raw)
To: notmuch
David Bremner <david@tethera.net> writes:
> The main rational is explained in the commit message to
>
> [PATCH 4/5] lib: autocommit after some number of completed
>
> I'm not super-happy with the documentation in [5/5], as it explains
> things in terms of database concepts the user shouldn't really need to
> understand.
>
I have applied the series to master. Documentation improvements welcome.
d
^ permalink raw reply [flat|nested] 9+ messages in thread