unofficial mirror of notmuch@notmuchmail.org
* [PATCH] test thread breakage when messages are removed and re-added
From: Daniel Kahn Gillmor @ 2016-03-31 17:34 UTC
  To: Notmuch Mail

This test (T590-thread-breakage.sh) currently fails!

If you have a two-message thread where message "B" is in-reply-to "A",
notmuch rightly sees this as a single thread.

But if you:

 * remove "A" from the message store
 * run "notmuch new"
 * add "A" back into the message store
 * re-run "notmuch new"

Then notmuch sees the messages as distinct threads.

I think this happens because if you insert "B" initially (before
anything is known about "A"), then a "ghost message" standing in for
"A" gets added to the database in the same thread as "B", and "A"
takes over that ghost when it appears.

But if "A" is indexed and then subsequently removed, no ghost message
is retained, so when "A" reappears, it is treated as a new thread.
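
To make the failure mode concrete, here is a toy shell model of the
behaviour described above.  It is only a sketch under stated
assumptions: the "DB" text table and the add_message, remove_message
and count_threads helpers are made-up names for illustration, not
notmuch's schema or API.

```sh
# Toy model only, not notmuch's schema: DB holds lines of
# "<message-id> <thread-id> <kind>", where kind is "mail" or "ghost".
DB=""
NEXT_THREAD=1

# Print the thread id recorded for message $1 (mail or ghost), if any.
lookup_thread() {
    printf '%s\n' "$DB" | awk -v id="$1" '$1 == id { print $2; exit }'
}

# Delete every line for message $1 from DB (also drops blank lines).
forget() {
    DB=$(printf '%s\n' "$DB" | awk -v id="$1" '$1 != id && NF')
}

# add_message <id> [<parent-id>]
add_message() {
    am_id=$1
    am_parent=${2:-}
    am_thread=$(lookup_thread "$am_id")          # take over our own ghost
    if [ -z "$am_thread" ] && [ -n "$am_parent" ]; then
        am_thread=$(lookup_thread "$am_parent")  # join the parent's thread
    fi
    if [ -z "$am_thread" ]; then
        am_thread=$NEXT_THREAD                   # nothing known: new thread
        NEXT_THREAD=$((NEXT_THREAD + 1))
    fi
    forget "$am_id"
    DB=$(printf '%s\n%s %s mail\n' "$DB" "$am_id" "$am_thread")
    # An unseen parent gets a ghost so it can join this thread later.
    if [ -n "$am_parent" ] && [ -z "$(lookup_thread "$am_parent")" ]; then
        DB=$(printf '%s\n%s %s ghost\n' "$DB" "$am_parent" "$am_thread")
    fi
}

# Removal as described above: the entry vanishes and no ghost is kept.
remove_message() {
    forget "$1"
}

# Count distinct thread ids among non-ghost messages.
count_threads() {
    printf '%s\n' "$DB" | awk '$3 == "mail" { print $2 }' |
        sort -u | wc -l | tr -d ' '
}

add_message a
add_message b a
remove_message a
add_message a
count_threads    # prints 2 in this model: "a" no longer finds its thread
```

If "b" is added before "a" in the same model, the ghost for "a" is
created up front and the count stays at 1, which matches the case
that already works.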

I don't know how to easily fix this, but I see a few options:

ghost-on-removal
----------------

We could unilaterally add a ghost upon message removal.  This has a
few disadvantages: the message index would leak information about what
messages the user has ever been exposed to, and we also create a
perpetually-growing dataset -- the ghosts can never be removed.

ghost-on-removal-when-shared-thread-exists
------------------------------------------

We could add a ghost upon message removal iff there are other
non-ghost messages with the same thread ID.

We'd also need to remove all ghost messages that share a thread when
the last non-ghost message in that thread is removed.

This still has a bit of information leakage, though: the message index
would reveal that I've seen a newer message in a thread, even if I had
deleted it from my message store.
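
The two rules above (leave a ghost when other real messages share the
thread, and clean up all ghosts when the last real message goes) can
be sketched in shell against a toy table.  Everything here is an
assumption for illustration: the "DB" layout and the remove_message
helper are hypothetical and say nothing about how notmuch stores
documents.

```sh
# Toy model only: DB holds "<message-id> <thread-id> <kind>" lines,
# where kind is "mail" or "ghost". Thread 1 has two messages, thread 2 one.
DB='a 1 mail
b 1 mail
c 2 mail'

# remove_message <id>: apply the shared-thread rule to DB.
remove_message() {
    rm_id=$1
    rm_thread=$(printf '%s\n' "$DB" |
        awk -v id="$rm_id" '$1 == id && $3 == "mail" { print $2; exit }')
    [ -n "$rm_thread" ] || return 0   # not an indexed message: nothing to do

    # Count the other non-ghost messages in the same thread.
    rm_others=$(printf '%s\n' "$DB" |
        awk -v id="$rm_id" -v t="$rm_thread" \
            '$1 != id && $2 == t && $3 == "mail"' | wc -l | tr -d ' ')

    if [ "$rm_others" -gt 0 ]; then
        # Someone else is still in the thread: leave a ghost behind.
        DB=$(printf '%s\n' "$DB" |
            awk -v id="$rm_id" '$1 == id { $3 = "ghost" } { print }')
    else
        # Last real message: drop it and every ghost in its thread.
        DB=$(printf '%s\n' "$DB" |
            awk -v id="$rm_id" -v t="$rm_thread" \
                '$1 != id && !($2 == t && $3 == "ghost")')
    fi
}

remove_message a    # "b" is still there, so "a" becomes a ghost
remove_message b    # thread 1 has no mail left: "b" and ghost "a" both go
printf '%s\n' "$DB" # only "c 2 mail" remains
```

Between the two removals the table still contains "a 1 ghost", which
is exactly the residual information the leakage concern is about.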

track-dependencies
------------------

Rather than a simple "ghost message", we could store all the (A,B)
message-reference pairs internally, where each pair records that
message A references message B.

Then removal of message X would require deleting all message-reference
pairs (X,B), and only deleting a ghost message if no (A,X) reference
pair exists.

This requires modifying the database by adding a new and fairly weird
table that would need to be indexed by both columns.  I don't know
whether Xapian has nice ways to do that.
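
As a rough sketch of the bookkeeping, assuming the pairs live in a
plain two-column text table (the "REFS" variable, the remove_message
helper and the "DECISION" result are hypothetical names, not anything
in notmuch or Xapian):

```sh
# Toy model only: REFS holds one "<A> <B>" pair per line, meaning that
# message A references message B. Here b replies to a, and c cites both.
REFS='b a
c a
c b'

# remove_message <X>: sets DECISION to "ghost" or "drop".
remove_message() {
    rm_x=$1
    # Delete every (X,B) pair: X's own outgoing references go away.
    REFS=$(printf '%s\n' "$REFS" | awk -v x="$rm_x" '$1 != x && NF')
    # Keep a ghost for X only while some (A,X) pair still points at it.
    if printf '%s\n' "$REFS" |
        awk -v x="$rm_x" '$2 == x { found = 1 } END { exit !found }'; then
        DECISION=ghost
    else
        DECISION=drop
    fi
}

remove_message c; echo "c: $DECISION"   # nothing references c
remove_message a; echo "a: $DECISION"   # (b,a) still points at a
```

The first awk filter looks pairs up by their first column and the
second by their second column, which is why the table would need to
be indexed both ways.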

scan-dependencies
-----------------

Without modifying the database, we could do something less efficient.

Upon removal of message X, we could scan the headers of all non-ghost
messages that share a thread with X.  If any of those messages refers
to X, we would add a ghost message.  If none of them do, then we would
just drop X entirely from the table.
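
A minimal sketch of the header scan, assuming the caller already has
the file names of the non-ghost messages that share X's thread.  The
refers_to helper is a hypothetical name; it reads only the header
block of each file and follows folded continuation lines.

```sh
# refers_to <message-id> <file>...
# Succeeds if any file's In-Reply-To or References header mentions the id.
refers_to() {
    rt_target=$1
    shift
    for rt_file in "$@"; do
        awk -v id="<$rt_target>" '
            /^$/      { exit }   # blank line: end of the header block
            /^[^ \t]/ { inref = (tolower($0) ~ /^(in-reply-to|references):/) }
            inref && index($0, id) { found = 1; exit }
            END       { exit !found }
        ' "$rt_file" && return 0
    done
    return 1
}

# Example: "b" replies to "a", so removing "a" should leave a ghost.
tmp=$(mktemp -d)
printf '%s\n' 'Message-ID: <b@example.net>' \
    'In-Reply-To: <a@example.net>' '' 'body text' > "$tmp/b"
if refers_to a@example.net "$tmp/b"; then
    echo "keep a ghost for a@example.net"
else
    echo "drop a@example.net entirely"
fi
rm -rf "$tmp"
```

The cost is one pass over the headers of every remaining message in
the thread on each removal, which is the inefficiency mentioned above.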
---
 test/T590-thread-breakage.sh | 63 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 63 insertions(+)
 create mode 100755 test/T590-thread-breakage.sh

diff --git a/test/T590-thread-breakage.sh b/test/T590-thread-breakage.sh
new file mode 100755
index 0000000..704f504
--- /dev/null
+++ b/test/T590-thread-breakage.sh
@@ -0,0 +1,63 @@
+#!/usr/bin/env bash
+#
+# Copyright (c) 2016 Daniel Kahn Gillmor
+#
+
+test_description='thread breakage by reindexing (currently broken)'
+
+. ./test-lib.sh || exit 1
+
+message_a() {
+    mkdir -p "${MAIL_DIR}/cur"
+    cat > "${MAIL_DIR}/cur/a" <<EOF
+Subject: First message
+Message-ID: <a@example.net>
+From: Alice <alice@example.net>
+To: Bob <bob@example.net>
+Date: Thu, 31 Mar 2016 20:10:00 -0400
+
+This is the first message in the thread.
+EOF
+}
+
+message_b() {
+    mkdir -p "${MAIL_DIR}/cur"
+    cat > "${MAIL_DIR}/cur/b" <<EOF
+Subject: Second message
+Message-ID: <b@example.net>
+In-Reply-To: <a@example.net>
+References: <a@example.net>
+From: Bob <bob@example.net>
+To: Alice <alice@example.net>
+Date: Thu, 31 Mar 2016 20:15:00 -0400
+
+This is the second message in the thread.
+EOF
+}
+
+
+test_thread_count() {
+    notmuch new >/dev/null
+    test_begin_subtest "${2:-Expecting $1 thread(s)}"
+    count=$(notmuch count --output=threads)
+    test_expect_equal "$count" "$1"
+}
+
+test_thread_count 0 'There should be no threads initially'
+
+message_a
+test_thread_count 1 'One message in: one thread'
+
+message_b
+test_thread_count 1 'Second message in the same thread: one thread'
+
+rm -f "${MAIL_DIR}/cur/a"
+test_thread_count 1 'First message removed: still only one thread'
+
+message_a
+# this is known to fail (it shows 2 threads) because no "ghost
+# message" was created for message A when it was removed from the
+# index, despite message B still pointing to it.
+test_thread_count 1 'First message reappears: should return to the same thread'
+
+test_done
-- 
2.8.0.rc3


Thread overview: 46+ messages
2016-03-31 17:34 [PATCH] test thread breakage when messages are removed and re-added Daniel Kahn Gillmor
2016-04-01 22:27 ` Daniel Kahn Gillmor
2016-04-01 23:31 ` [PATCH 1/2] verify during thread-breakage that messages are removed as well Daniel Kahn Gillmor
2016-04-01 23:31   ` [PATCH 2/2] fix thread breakage via ghost-on-removal Daniel Kahn Gillmor
2016-04-02 14:15 ` [PATCH v2 1/7] test thread breakage when messages are removed and re-added Daniel Kahn Gillmor
2016-04-02 14:15   ` [PATCH v2 2/7] verify during thread-breakage that messages are removed as well Daniel Kahn Gillmor
2016-04-06  1:20     ` David Bremner
2016-04-09  1:54       ` Daniel Kahn Gillmor
2016-04-02 14:15   ` [PATCH v2 3/7] fix thread breakage via ghost-on-removal Daniel Kahn Gillmor
2016-04-05  6:53     ` Tomi Ollila
2016-04-05 20:05       ` Daniel Kahn Gillmor
2016-04-05 23:33         ` David Bremner
2016-04-06  1:39     ` David Bremner
2016-04-02 14:15   ` [PATCH v2 4/7] Add internal functions to search for alternate doc types Daniel Kahn Gillmor
2016-04-06  1:52     ` David Bremner
2016-04-02 14:15   ` [PATCH v2 5/7] Introduce _notmuch_message_has_term() Daniel Kahn Gillmor
2016-04-06  2:04     ` David Bremner
2016-04-02 14:15   ` [PATCH v2 6/7] On deletion, replace with ghost when other active messages in thread Daniel Kahn Gillmor
2016-04-02 14:15   ` [PATCH v2 7/7] complete ghost-on-removal-when-shared-thread-exists Daniel Kahn Gillmor
2016-04-02 16:19   ` [PATCH 1/2] test thread breakage when messages are removed and re-added David Bremner
2016-04-02 16:19     ` [PATCH 2/2] test: add test-binary to print the number of ghost messages David Bremner
2016-04-09  1:02 ` [PATCH v3 1/7] " Daniel Kahn Gillmor
2016-04-09  1:02   ` [PATCH v3 2/7] test thread breakage when messages are removed and re-added Daniel Kahn Gillmor
2016-04-09  1:02   ` [PATCH v3 3/7] fix thread breakage via ghost-on-removal Daniel Kahn Gillmor
2016-04-09  1:02   ` [PATCH v3 4/7] Add internal functions to search for alternate doc types Daniel Kahn Gillmor
2016-04-09  1:02   ` [PATCH v3 5/7] Introduce _notmuch_message_has_term() Daniel Kahn Gillmor
2016-04-09  1:02   ` [PATCH v3 6/7] On deletion, replace with ghost when other active messages in thread Daniel Kahn Gillmor
2016-04-09  1:02   ` [PATCH v3 7/7] complete ghost-on-removal-when-shared-thread-exists Daniel Kahn Gillmor
2016-04-09  1:54 ` [PATCH v4 1/7] test: add test-binary to print the number of ghost messages Daniel Kahn Gillmor
2016-04-09  1:54   ` [PATCH v4 2/7] test thread breakage when messages are removed and re-added Daniel Kahn Gillmor
2016-04-11 13:59     ` [PATCH] remove debugging spew from T590 Daniel Kahn Gillmor
2016-04-09  1:54   ` [PATCH v4 3/7] fix thread breakage via ghost-on-removal Daniel Kahn Gillmor
2016-04-09  1:54   ` [PATCH v4 4/7] Add internal functions to search for alternate doc types Daniel Kahn Gillmor
2016-04-09  1:54   ` [PATCH v4 5/7] Introduce _notmuch_message_has_term() Daniel Kahn Gillmor
2016-04-09  1:54   ` [PATCH v4 6/7] On deletion, replace with ghost when other active messages in thread Daniel Kahn Gillmor
2016-04-09  1:54   ` [PATCH v4 7/7] complete ghost-on-removal-when-shared-thread-exists Daniel Kahn Gillmor
2016-04-09 11:31     ` David Bremner
2016-04-09 18:55       ` Daniel Kahn Gillmor
2016-04-09 19:15         ` David Bremner
2016-04-10  8:35     ` Tomi Ollila
2016-04-11  0:33     ` David Bremner
2016-04-11 19:18       ` Daniel Kahn Gillmor
2016-04-12  1:28         ` David Bremner
2016-04-15 10:29           ` David Bremner
2016-04-20  3:36         ` Austin Clements
2016-04-09 11:02   ` [PATCH v4 1/7] test: add test-binary to print the number of ghost messages David Bremner
