unofficial mirror of notmuch@notmuchmail.org
 help / color / mirror / code / Atom feed
* [PATCH 0/2] Improve message threading
@ 2014-05-26 14:05 Michal Sojka
  2014-05-26 14:05 ` [PATCH 1/2] Add test for incorrect threading of messages Michal Sojka
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Michal Sojka @ 2014-05-26 14:05 UTC (permalink / raw)
  To: notmuch

Hi all,

I noticed that in certain situations notmuch does not reconstruct
thread structure correctly. The fix was quite trivial - see the
patches.

The patch fixes my problem, however, as I learnt from the thread
around id:87oazwjq1e.fsf@yoom.home.cworth.org, there are more problems
in this code. I think that my problem is orthogonal to what was
discussed there.

The patch does not break any existing text (except two in
T460-emacs-tree.sh, which fail even with 0.18 on my system).

Cheers,
-Michal

Michal Sojka (2):
  Add test for incorrect threading of messages
  Make parsing of References and In-Reply-To header less error prone

 lib/database.cc             | 15 ++++++---------
 test/T510-thread-replies.sh | 27 +++++++++++++++++++++++++++
 2 files changed, 33 insertions(+), 9 deletions(-)

-- 
2.0.0.rc2

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/2] Add test for incorrect threading of messages
  2014-05-26 14:05 [PATCH 0/2] Improve message threading Michal Sojka
@ 2014-05-26 14:05 ` Michal Sojka
  2014-05-26 14:05 ` [PATCH 2/2] Make parsing of References and In-Reply-To header less error prone Michal Sojka
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: Michal Sojka @ 2014-05-26 14:05 UTC (permalink / raw)
  To: notmuch

This happens when there is some garbage after the last Message-ID in
the References header. See for example
https://lkml.org/lkml/headers/2014/5/19/864.
---
 test/T510-thread-replies.sh | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/test/T510-thread-replies.sh b/test/T510-thread-replies.sh
index eeb70d0..d818b89 100755
--- a/test/T510-thread-replies.sh
+++ b/test/T510-thread-replies.sh
@@ -137,5 +137,33 @@ expected='[[[{"id": "foo@four.com", "match": true, "excluded": false,
 expected=`echo "$expected" | notmuch_json_show_sanitize`
 test_expect_equal_json "$output" "$expected"
 
+test_begin_subtest "Ignore garbage at the end of References"
+test_subtest_known_broken
+add_message '[id]="foo@five.com"' \
+    '[subject]="five"'
+add_message '[id]="bar@five.com"' \
+    '[references]="<foo@five.com> (garbage)"' \
+    '[subject]="not-five"'
+output=$(notmuch show --format=json 'subject:five' | notmuch_json_show_sanitize)
+expected='[[[{"id": "XXXXX", "match": true, "excluded": false,
+ "filename": "YYYYY", "timestamp": 42, "date_relative": "2001-01-05",
+ "tags": ["inbox", "unread"], "headers": {"Subject": "five",
+ "From": "Notmuch Test Suite <test_suite@notmuchmail.org>",
+ "To": "Notmuch Test Suite <test_suite@notmuchmail.org>",
+ "Date": "GENERATED_DATE"}, "body": [{"id": 1,
+ "content-type": "text/plain",
+ "content": "This is just a test message (#10)\n"}]},
+ [[{"id": "XXXXX", "match": true, "excluded": false,
+ "filename": "YYYYY", "timestamp": 42, "date_relative": "2001-01-05",
+ "tags": ["inbox", "unread"],
+ "headers": {"Subject": "not-five",
+ "From": "Notmuch Test Suite <test_suite@notmuchmail.org>",
+ "To": "Notmuch Test Suite <test_suite@notmuchmail.org>",
+ "Date": "GENERATED_DATE"},
+ "body": [{"id": 1, "content-type": "text/plain",
+ "content": "This is just a test message (#11)\n"}]}, []]]]]]'
+expected=`echo "$expected" | notmuch_json_show_sanitize`
+test_expect_equal_json "$output" "$expected"
+
 
 test_done
-- 
2.0.0.rc2

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/2] Make parsing of References and In-Reply-To header less error prone
  2014-05-26 14:05 [PATCH 0/2] Improve message threading Michal Sojka
  2014-05-26 14:05 ` [PATCH 1/2] Add test for incorrect threading of messages Michal Sojka
@ 2014-05-26 14:05 ` Michal Sojka
  2014-07-13 14:54   ` David Bremner
  2014-08-10 19:49 ` [PATCH 0/2] Improve message threading Tomi Ollila
  2014-08-17  0:49 ` David Bremner
  3 siblings, 1 reply; 6+ messages in thread
From: Michal Sojka @ 2014-05-26 14:05 UTC (permalink / raw)
  To: notmuch

According to RFC2822 References and In-Reply-To headers are supposed
to contain one or more Message-IDs, however older RFC822 allowed
almost any content. When both References and In-Reply-To headers ends
with something else that a Message-ID (see e.g. [1]), the thread
structure presented by notmuch is incorrect. The reason is that
notmuch treats this case as if the email contained no "replyto"
information (see _notmuch_database_link_message_to_parents).

This patch changes the parse_references() function to return the last
valid Message-ID encountered rather than NULL resulting from the last
hunk of text not being the Message-ID.

[1] https://lkml.org/lkml/headers/2014/5/19/864
---
 lib/database.cc             | 15 ++++++---------
 test/T510-thread-replies.sh |  1 -
 2 files changed, 6 insertions(+), 10 deletions(-)

diff --git a/lib/database.cc b/lib/database.cc
index 1efb14d..26f15fd 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -521,7 +521,7 @@ parse_references (void *ctx,
 		  GHashTable *hash,
 		  const char *refs)
 {
-    char *ref;
+    char *ref, *last_ref = NULL;
 
     if (refs == NULL || *refs == '\0')
 	return NULL;
@@ -529,20 +529,17 @@ parse_references (void *ctx,
     while (*refs) {
 	ref = _parse_message_id (ctx, refs, &refs);
 
-	if (ref && strcmp (ref, message_id))
+	if (ref && strcmp (ref, message_id)) {
 	    g_hash_table_insert (hash, ref, NULL);
+	    last_ref = ref;
+	}
     }
 
     /* The return value of this function is used to add a parent
      * reference to the database.  We should avoid making a message
-     * its own parent, thus the following check.
+     * its own parent, thus the above check.
      */
-
-    if (ref && strcmp(ref, message_id)) {
-	return ref;
-    } else {
-	return NULL;
-    }
+    return last_ref;
 }
 
 notmuch_status_t
diff --git a/test/T510-thread-replies.sh b/test/T510-thread-replies.sh
index d818b89..1392fbe 100755
--- a/test/T510-thread-replies.sh
+++ b/test/T510-thread-replies.sh
@@ -138,7 +138,6 @@ expected=`echo "$expected" | notmuch_json_show_sanitize`
 test_expect_equal_json "$output" "$expected"
 
 test_begin_subtest "Ignore garbage at the end of References"
-test_subtest_known_broken
 add_message '[id]="foo@five.com"' \
     '[subject]="five"'
 add_message '[id]="bar@five.com"' \
-- 
2.0.0.rc2

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] Make parsing of References and In-Reply-To header less error prone
  2014-05-26 14:05 ` [PATCH 2/2] Make parsing of References and In-Reply-To header less error prone Michal Sojka
@ 2014-07-13 14:54   ` David Bremner
  0 siblings, 0 replies; 6+ messages in thread
From: David Bremner @ 2014-07-13 14:54 UTC (permalink / raw)
  To: Michal Sojka, notmuch

Michal Sojka <sojkam1@fel.cvut.cz> writes:

> According to RFC2822 References and In-Reply-To headers are supposed
> to contain one or more Message-IDs, however older RFC822 allowed
> almost any content. When both References and In-Reply-To headers ends
> with something else that a Message-ID (see e.g. [1]), the thread
> structure presented by notmuch is incorrect. The reason is that
> notmuch treats this case as if the email contained no "replyto"
> information (see _notmuch_database_link_message_to_parents).
>
> This patch changes the parse_references() function to return the last
> valid Message-ID encountered rather than NULL resulting from the last
> hunk of text not being the Message-ID.

This series looks OK to me.  It does touch something pretty fundamental,
so I'd appreciate a second set of eyes on it.

d

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 0/2] Improve message threading
  2014-05-26 14:05 [PATCH 0/2] Improve message threading Michal Sojka
  2014-05-26 14:05 ` [PATCH 1/2] Add test for incorrect threading of messages Michal Sojka
  2014-05-26 14:05 ` [PATCH 2/2] Make parsing of References and In-Reply-To header less error prone Michal Sojka
@ 2014-08-10 19:49 ` Tomi Ollila
  2014-08-17  0:49 ` David Bremner
  3 siblings, 0 replies; 6+ messages in thread
From: Tomi Ollila @ 2014-08-10 19:49 UTC (permalink / raw)
  To: Michal Sojka, notmuch

On Mon, May 26 2014, Michal Sojka <sojkam1@fel.cvut.cz> wrote:

> Hi all,
>
> I noticed that in certain situations notmuch does not reconstruct
> thread structure correctly. The fix was quite trivial - see the
> patches.
>
> The patch fixes my problem, however, as I learnt from the thread
> around id:87oazwjq1e.fsf@yoom.home.cworth.org, there are more problems
> in this code. I think that my problem is orthogonal to what was
> discussed there.
>
> The patch does not break any existing text (except two in
> T460-emacs-tree.sh, which fail even with 0.18 on my system).
>
> Cheers,
> -Michal

This series looks good to me, and patch 2/2 fixes the borken test
introduced in patch 1/2 -- i.e. the fix is clear, test is somewhat
tedious to be parsed by a human ;D

Tomi


>
> Michal Sojka (2):
>   Add test for incorrect threading of messages
>   Make parsing of References and In-Reply-To header less error prone
>
>  lib/database.cc             | 15 ++++++---------
>  test/T510-thread-replies.sh | 27 +++++++++++++++++++++++++++
>  2 files changed, 33 insertions(+), 9 deletions(-)
>
> -- 
> 2.0.0.rc2
>
> _______________________________________________
> notmuch mailing list
> notmuch@notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 0/2] Improve message threading
  2014-05-26 14:05 [PATCH 0/2] Improve message threading Michal Sojka
                   ` (2 preceding siblings ...)
  2014-08-10 19:49 ` [PATCH 0/2] Improve message threading Tomi Ollila
@ 2014-08-17  0:49 ` David Bremner
  3 siblings, 0 replies; 6+ messages in thread
From: David Bremner @ 2014-08-17  0:49 UTC (permalink / raw)
  To: Michal Sojka, notmuch

Michal Sojka <sojkam1@fel.cvut.cz> writes:

> Hi all,
>
> I noticed that in certain situations notmuch does not reconstruct
> thread structure correctly. The fix was quite trivial - see the
> patches.
>
> The patch fixes my problem, however, as I learnt from the thread
> around id:87oazwjq1e.fsf@yoom.home.cworth.org, there are more problems
> in this code. I think that my problem is orthogonal to what was
> discussed there.
>
> The patch does not break any existing text (except two in
> T460-emacs-tree.sh, which fail even with 0.18 on my system).

Pushed, 

d

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2014-08-17  0:49 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-05-26 14:05 [PATCH 0/2] Improve message threading Michal Sojka
2014-05-26 14:05 ` [PATCH 1/2] Add test for incorrect threading of messages Michal Sojka
2014-05-26 14:05 ` [PATCH 2/2] Make parsing of References and In-Reply-To header less error prone Michal Sojka
2014-07-13 14:54   ` David Bremner
2014-08-10 19:49 ` [PATCH 0/2] Improve message threading Tomi Ollila
2014-08-17  0:49 ` David Bremner

Code repositories for project(s) associated with this public inbox

	https://yhetil.org/notmuch.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).