* [PATCH 1/2] Add test for incorrect threading of messages
2014-05-26 14:05 [PATCH 0/2] Improve message threading Michal Sojka
@ 2014-05-26 14:05 ` Michal Sojka
2014-05-26 14:05 ` [PATCH 2/2] Make parsing of References and In-Reply-To header less error prone Michal Sojka
` (2 subsequent siblings)
3 siblings, 0 replies; 6+ messages in thread
From: Michal Sojka @ 2014-05-26 14:05 UTC (permalink / raw)
To: notmuch
This happens when there is some garbage after the last Message-ID in
the References header. See for example
https://lkml.org/lkml/headers/2014/5/19/864.
---
test/T510-thread-replies.sh | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)
diff --git a/test/T510-thread-replies.sh b/test/T510-thread-replies.sh
index eeb70d0..d818b89 100755
--- a/test/T510-thread-replies.sh
+++ b/test/T510-thread-replies.sh
@@ -137,5 +137,33 @@ expected='[[[{"id": "foo@four.com", "match": true, "excluded": false,
expected=`echo "$expected" | notmuch_json_show_sanitize`
test_expect_equal_json "$output" "$expected"
+test_begin_subtest "Ignore garbage at the end of References"
+test_subtest_known_broken
+add_message '[id]="foo@five.com"' \
+ '[subject]="five"'
+add_message '[id]="bar@five.com"' \
+ '[references]="<foo@five.com> (garbage)"' \
+ '[subject]="not-five"'
+output=$(notmuch show --format=json 'subject:five' | notmuch_json_show_sanitize)
+expected='[[[{"id": "XXXXX", "match": true, "excluded": false,
+ "filename": "YYYYY", "timestamp": 42, "date_relative": "2001-01-05",
+ "tags": ["inbox", "unread"], "headers": {"Subject": "five",
+ "From": "Notmuch Test Suite <test_suite@notmuchmail.org>",
+ "To": "Notmuch Test Suite <test_suite@notmuchmail.org>",
+ "Date": "GENERATED_DATE"}, "body": [{"id": 1,
+ "content-type": "text/plain",
+ "content": "This is just a test message (#10)\n"}]},
+ [[{"id": "XXXXX", "match": true, "excluded": false,
+ "filename": "YYYYY", "timestamp": 42, "date_relative": "2001-01-05",
+ "tags": ["inbox", "unread"],
+ "headers": {"Subject": "not-five",
+ "From": "Notmuch Test Suite <test_suite@notmuchmail.org>",
+ "To": "Notmuch Test Suite <test_suite@notmuchmail.org>",
+ "Date": "GENERATED_DATE"},
+ "body": [{"id": 1, "content-type": "text/plain",
+ "content": "This is just a test message (#11)\n"}]}, []]]]]]'
+expected=`echo "$expected" | notmuch_json_show_sanitize`
+test_expect_equal_json "$output" "$expected"
+
test_done
--
2.0.0.rc2
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 2/2] Make parsing of References and In-Reply-To header less error prone
2014-05-26 14:05 [PATCH 0/2] Improve message threading Michal Sojka
2014-05-26 14:05 ` [PATCH 1/2] Add test for incorrect threading of messages Michal Sojka
@ 2014-05-26 14:05 ` Michal Sojka
2014-07-13 14:54 ` David Bremner
2014-08-10 19:49 ` [PATCH 0/2] Improve message threading Tomi Ollila
2014-08-17 0:49 ` David Bremner
3 siblings, 1 reply; 6+ messages in thread
From: Michal Sojka @ 2014-05-26 14:05 UTC (permalink / raw)
To: notmuch
According to RFC2822 References and In-Reply-To headers are supposed
to contain one or more Message-IDs, however older RFC822 allowed
almost any content. When both References and In-Reply-To headers ends
with something else that a Message-ID (see e.g. [1]), the thread
structure presented by notmuch is incorrect. The reason is that
notmuch treats this case as if the email contained no "replyto"
information (see _notmuch_database_link_message_to_parents).
This patch changes the parse_references() function to return the last
valid Message-ID encountered rather than NULL resulting from the last
hunk of text not being the Message-ID.
[1] https://lkml.org/lkml/headers/2014/5/19/864
---
lib/database.cc | 15 ++++++---------
test/T510-thread-replies.sh | 1 -
2 files changed, 6 insertions(+), 10 deletions(-)
diff --git a/lib/database.cc b/lib/database.cc
index 1efb14d..26f15fd 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -521,7 +521,7 @@ parse_references (void *ctx,
GHashTable *hash,
const char *refs)
{
- char *ref;
+ char *ref, *last_ref = NULL;
if (refs == NULL || *refs == '\0')
return NULL;
@@ -529,20 +529,17 @@ parse_references (void *ctx,
while (*refs) {
ref = _parse_message_id (ctx, refs, &refs);
- if (ref && strcmp (ref, message_id))
+ if (ref && strcmp (ref, message_id)) {
g_hash_table_insert (hash, ref, NULL);
+ last_ref = ref;
+ }
}
/* The return value of this function is used to add a parent
* reference to the database. We should avoid making a message
- * its own parent, thus the following check.
+ * its own parent, thus the above check.
*/
-
- if (ref && strcmp(ref, message_id)) {
- return ref;
- } else {
- return NULL;
- }
+ return last_ref;
}
notmuch_status_t
diff --git a/test/T510-thread-replies.sh b/test/T510-thread-replies.sh
index d818b89..1392fbe 100755
--- a/test/T510-thread-replies.sh
+++ b/test/T510-thread-replies.sh
@@ -138,7 +138,6 @@ expected=`echo "$expected" | notmuch_json_show_sanitize`
test_expect_equal_json "$output" "$expected"
test_begin_subtest "Ignore garbage at the end of References"
-test_subtest_known_broken
add_message '[id]="foo@five.com"' \
'[subject]="five"'
add_message '[id]="bar@five.com"' \
--
2.0.0.rc2
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] Make parsing of References and In-Reply-To header less error prone
2014-05-26 14:05 ` [PATCH 2/2] Make parsing of References and In-Reply-To header less error prone Michal Sojka
@ 2014-07-13 14:54 ` David Bremner
0 siblings, 0 replies; 6+ messages in thread
From: David Bremner @ 2014-07-13 14:54 UTC (permalink / raw)
To: Michal Sojka, notmuch
Michal Sojka <sojkam1@fel.cvut.cz> writes:
> According to RFC2822 References and In-Reply-To headers are supposed
> to contain one or more Message-IDs, however older RFC822 allowed
> almost any content. When both References and In-Reply-To headers ends
> with something else that a Message-ID (see e.g. [1]), the thread
> structure presented by notmuch is incorrect. The reason is that
> notmuch treats this case as if the email contained no "replyto"
> information (see _notmuch_database_link_message_to_parents).
>
> This patch changes the parse_references() function to return the last
> valid Message-ID encountered rather than NULL resulting from the last
> hunk of text not being the Message-ID.
This series looks OK to me. It does touch something pretty fundamental,
so I'd appreciate a second set of eyes on it.
d
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 0/2] Improve message threading
2014-05-26 14:05 [PATCH 0/2] Improve message threading Michal Sojka
2014-05-26 14:05 ` [PATCH 1/2] Add test for incorrect threading of messages Michal Sojka
2014-05-26 14:05 ` [PATCH 2/2] Make parsing of References and In-Reply-To header less error prone Michal Sojka
@ 2014-08-10 19:49 ` Tomi Ollila
2014-08-17 0:49 ` David Bremner
3 siblings, 0 replies; 6+ messages in thread
From: Tomi Ollila @ 2014-08-10 19:49 UTC (permalink / raw)
To: Michal Sojka, notmuch
On Mon, May 26 2014, Michal Sojka <sojkam1@fel.cvut.cz> wrote:
> Hi all,
>
> I noticed that in certain situations notmuch does not reconstruct
> thread structure correctly. The fix was quite trivial - see the
> patches.
>
> The patch fixes my problem, however, as I learnt from the thread
> around id:87oazwjq1e.fsf@yoom.home.cworth.org, there are more problems
> in this code. I think that my problem is orthogonal to what was
> discussed there.
>
> The patch does not break any existing text (except two in
> T460-emacs-tree.sh, which fail even with 0.18 on my system).
>
> Cheers,
> -Michal
This series looks good to me, and patch 2/2 fixes the borken test
introduced in patch 1/2 -- i.e. the fix is clear, test is somewhat
tedious to be parsed by a human ;D
Tomi
>
> Michal Sojka (2):
> Add test for incorrect threading of messages
> Make parsing of References and In-Reply-To header less error prone
>
> lib/database.cc | 15 ++++++---------
> test/T510-thread-replies.sh | 27 +++++++++++++++++++++++++++
> 2 files changed, 33 insertions(+), 9 deletions(-)
>
> --
> 2.0.0.rc2
>
> _______________________________________________
> notmuch mailing list
> notmuch@notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 0/2] Improve message threading
2014-05-26 14:05 [PATCH 0/2] Improve message threading Michal Sojka
` (2 preceding siblings ...)
2014-08-10 19:49 ` [PATCH 0/2] Improve message threading Tomi Ollila
@ 2014-08-17 0:49 ` David Bremner
3 siblings, 0 replies; 6+ messages in thread
From: David Bremner @ 2014-08-17 0:49 UTC (permalink / raw)
To: Michal Sojka, notmuch
Michal Sojka <sojkam1@fel.cvut.cz> writes:
> Hi all,
>
> I noticed that in certain situations notmuch does not reconstruct
> thread structure correctly. The fix was quite trivial - see the
> patches.
>
> The patch fixes my problem, however, as I learnt from the thread
> around id:87oazwjq1e.fsf@yoom.home.cworth.org, there are more problems
> in this code. I think that my problem is orthogonal to what was
> discussed there.
>
> The patch does not break any existing text (except two in
> T460-emacs-tree.sh, which fail even with 0.18 on my system).
Pushed,
d
^ permalink raw reply [flat|nested] 6+ messages in thread