unofficial mirror of notmuch@notmuchmail.org
* Possible some threads are not complete due to bug?
@ 2015-09-13  4:03 Xu Wang
  2015-09-13  6:19 ` Suvayu Ali
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Xu Wang @ 2015-09-13  4:03 UTC (permalink / raw)
  To: notmuch

Hi,

Sometimes I need to do:

$ notmuch search --output=threads "id:MYMSGID"
thread:000000000000a125
$ notmuch search --output=messages "thread:000000000000a125"

In theory, this should output the message that responded to message ID
"MYMSGID". Sometimes it works. But sometimes it does not work. That
is, there exists an email where I am sure (I checked the raw email)
that there is a header
In-Reply-To: <MYMSGID>
but that email does not show when I do the two commands above.
Indeed, that mail belongs to a different thread ID.

I am just curious if the above is due to:

1. My lack of understanding of how notmuch deals with threads
2. A bug or missing feature in notmuch that causes some threads to be incomplete

By the way, my goal is to use Suvayu's script to check if a message
was responded to.

Kind regards,

Xu

* Re: Possible some threads are not complete due to bug?
  2015-09-13  4:03 Possible some threads are not complete due to bug? Xu Wang
@ 2015-09-13  6:19 ` Suvayu Ali
  2015-09-13 15:10   ` Xu Wang
  2015-09-19 15:45 ` [PATCH] test: add sanity tests for threading David Bremner
  2015-09-19 16:02 ` Possible some threads are not complete due to bug? David Bremner
  2 siblings, 1 reply; 11+ messages in thread
From: Suvayu Ali @ 2015-09-13  6:19 UTC (permalink / raw)
  To: notmuch

Hi,

You should include a reference to the original message; not everyone
will remember the thread.

  id:20150614082258.GD17381@chitra.no-ip.org or
  <http://mid.gmane.org/20150614082258.GD17381@chitra.no-ip.org>

On Sun, Sep 13, 2015 at 12:03:20AM -0400, Xu Wang wrote:
> 
> Sometimes I need to do:
> 
> $ notmuch search --output=threads "id:MYMSGID"
> thread:000000000000a125
> $ notmuch search --output=messages "thread:000000000000a125"

Looking at the script again, I see I assumed a message will belong to a
single thread.  You can remove that assumption by applying the following
change.

-----8<--------------------8<-----
diff -u nm-ack nm-ack
--- nm-ack	2015-06-15 01:30:40.327556510 +0200
+++ nm-ack	2015-09-13 07:58:30.734096931 +0200
@@ -10,8 +10,9 @@
 # debug
 # set -o xtrace
 
-declare query="$1" thread=$(notmuch search --output=threads -- "$1")
-declare -a msgs=$(notmuch search --output=messages -- "$thread") responses
+declare query="$1"
+declare -a thread=$(notmuch search --output=threads -- "$1")
+declare -a msgs=$(notmuch search --output=messages -- "${thread[@]}") responses
 
 function strip_mid() {
     sed -e 's/ \+//g' -e 's/<\([^ <>]\+\)>/\1/g'
----->8-------------------->8-----
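
A note on the array handling in the change above: as far as I can tell,
bash assigns the output of $(...) in "declare -a thread=$(...)" to a
single array element instead of one element per line. The following is
only a minimal sketch that avoids arrays altogether by joining the
thread ids into one "or" query. The helper names or_query and
thread_messages are hypothetical and not part of nm-ack.

```shell
# or_query: read thread ids on stdin, one per line, and join them
# with " or " into a single notmuch query string.
or_query() {
    q=""
    while IFS= read -r t; do
        [ -n "$t" ] || continue          # skip empty lines
        q="${q:+$q or }$t"               # prepend " or " after the first id
    done
    printf '%s\n' "$q"
}

# thread_messages QUERY: list the messages of every thread matching QUERY.
thread_messages() {
    q=$(notmuch search --output=threads -- "$1" | or_query)
    [ -n "$q" ] || return 1              # nothing matched
    notmuch search --output=messages -- "$q"
}
```

For example, thread_messages "id:MYMSGID" would list the messages of
each thread that contains MYMSGID.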

> In theory, this should output the message that responded to message ID
> "MYMSGID". Sometimes it works. But sometimes it does not work. That
> is, there exists an email where I am sure (I checked the raw email)
> that there is a header
> In-Reply-To: <MYMSGID>
> but that email does not show when I do the two commands above.
> Indeed, that mail belongs to a different thread ID.
> 
> I am just curious if the above is due to:
> 
> 1. My missing of understanding of how notmuch deals with threads
> 2. A bug or missing feature in notmuch causes some threads to be incomplete

Interesting issue.  I can think of one case: if a message is cross-posted
to multiple lists, it might give you more than one thread id.  Is
this the case for your message?  If you are up for it, look in
lib/thread.cc.  I think the relevant methods are:
_resolve_thread_relationships and _notmuch_thread_create, but I could be
wrong.  I'm not familiar with the notmuch source.

As I recall, you are using mutt-kz; does <entire-thread> work from
mutt-kz?  I would expect that to fail too.  It gets the thread id like
this:

  id = notmuch_message_get_thread_id(msg);

Hope this helps,

-- 
Suvayu

Open source is the future. It sets us free.

* Re: Possible some threads are not complete due to bug?
  2015-09-13  6:19 ` Suvayu Ali
@ 2015-09-13 15:10   ` Xu Wang
  0 siblings, 0 replies; 11+ messages in thread
From: Xu Wang @ 2015-09-13 15:10 UTC (permalink / raw)
  To: notmuch

On Sun, Sep 13, 2015 at 2:19 AM, Suvayu Ali <fatkasuvayu+linux@gmail.com> wrote:
> Hi,
>
> You should include a reference to the original message, not everyone
> will remember the thread.
>
>   id:20150614082258.GD17381@chitra.no-ip.org or
>   <http://mid.gmane.org/20150614082258.GD17381@chitra.no-ip.org>

Ah yes thank you.

> On Sun, Sep 13, 2015 at 12:03:20AM -0400, Xu Wang wrote:
>>
>> Sometimes I need to do:
>>
>> $ notmuch search --output=threads "id:MYMSGID"
>> thread:000000000000a125
>> $ notmuch search --output=messages "thread:000000000000a125"
>
> Looking at the script again, I see I assumed a message will belong to a
> single thread.  You can remove that assumption by applying the following
> change.
>
> -----8<--------------------8<-----
> diff -u nm-ack nm-ack
> --- nm-ack      2015-06-15 01:30:40.327556510 +0200
> +++ nm-ack      2015-09-13 07:58:30.734096931 +0200
> @@ -10,8 +10,9 @@
>  # debug
>  # set -o xtrace
>
> -declare query="$1" thread=$(notmuch search --output=threads -- "$1")
> -declare -a msgs=$(notmuch search --output=messages -- "$thread") responses
> +declare query="$1"
> +declare -a thread=$(notmuch search --output=threads -- "$1")
> +declare -a msgs=$(notmuch search --output=messages -- "${thread[@]}") responses
>
>  function strip_mid() {
>      sed -e 's/ \+//g' -e 's/<\([^ <>]\+\)>/\1/g'
> ----->8-------------------->8-----
>
>> In theory, this should output the message that responded to message ID
>> "MYMSGID". Sometimes it works. But sometimes it does not work. That
>> is, there exists an email where I am sure (I checked the raw email)
>> that there is a header
>> In-Reply-To: <MYMSGID>
>> but that email does not show when I do the two commands above.
>> Indeed, that mail belongs to a different thread ID.
>>
>> I am just curious if the above is due to:
>>
>> 1. My missing of understanding of how notmuch deals with threads
>> 2. A bug or missing feature in notmuch causes some threads to be incomplete
>
> Interesting issue.  I can think of a case, say a message is cross-posted
> to multiple lists, it might then give you more than one thread ids.  Is
> this the case for your message?  If you are up for it, look in
> lib/thread.cc.  I think the relevant methods are:
> _resolve_thread_relationships and _notmuch_thread_create, but I could be
> wrong.  I'm not familiar with the notmuch source.
>
> As I recall, you are using mutt-kz; does <entire-thread> work from
> mutt-kz?  I would expect that to fail too.  It gets the thread id like
> this:
>
>   id = notmuch_message_get_thread_id(msg);
>
> Hope this helps,

Thanks so very much as always, Suvayu. I learn a great deal from you.
I tried the fixes and nothing changed: the two message IDs still
belonged to different threads, and <entire-thread> did not work in
mutt-kz. However, when I viewed [all mail] (from Gmail) so that the
messages were in the same mailbox, mutt recognized them as a thread (in
the sense of the down-right arrow signifying a reply).

In the end, I removed the .notmuch folder and ran 'notmuch new' to
regenerate everything. This worked! However, I am wondering why I had
to do that. Is there any case where a complete refresh should be done
instead of a plain "notmuch new" (which I run all the time)?
It appears that thread detection works differently when adding messages
than it does during a complete refresh?
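
For anyone repeating the full rebuild described above: tags are stored
in the notmuch database, so discarding .notmuch also discards them
unless they are dumped first. The following is only a minimal sketch,
assuming "notmuch dump" and "notmuch restore" are available and that
MAILROOT is the configured database path. The helper name rebuild_index
and both of its arguments are hypothetical, not part of notmuch.

```shell
# rebuild_index MAILROOT BACKUP: hypothetical helper, not a notmuch command.
# MAILROOT is assumed to be the directory that contains .notmuch.
rebuild_index() {
    root=$1
    backup=$2
    # Tags are stored in the database, so save them before touching it.
    notmuch dump --output="$backup" || return 1
    # Move the old database aside rather than deleting it outright.
    mv "$root/.notmuch" "$root/.notmuch.old" || return 1
    # Re-index every mail file from scratch, then put the tags back.
    notmuch new || return 1
    notmuch restore --input="$backup"
}
```

Anything else kept under .notmuch moves aside with it, so the old
directory is worth keeping until the new index has been checked.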

Kind regards,

Xu


* [PATCH] test: add sanity tests for threading
  2015-09-13  4:03 Possible some threads are not complete due to bug? Xu Wang
  2015-09-13  6:19 ` Suvayu Ali
@ 2015-09-19 15:45 ` David Bremner
  2015-11-23 12:54   ` David Bremner
  2015-09-19 16:02 ` Possible some threads are not complete due to bug? David Bremner
  2 siblings, 1 reply; 11+ messages in thread
From: David Bremner @ 2015-09-19 15:45 UTC (permalink / raw)
  To: notmuch

These tests are inspired by a problem report

      id:CAJhTkNh7_hXDLsAGyD7nwkXV4ca6ymkLtFG945USvfqK4ZJEdQ@mail.gmail.com

Of course I can't duplicate the mentioned problem; it probably depends
on specific message data.
---
 test/T580-thread-search.sh | 42 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)
 create mode 100755 test/T580-thread-search.sh

diff --git a/test/T580-thread-search.sh b/test/T580-thread-search.sh
new file mode 100755
index 0000000..6f7106d
--- /dev/null
+++ b/test/T580-thread-search.sh
@@ -0,0 +1,42 @@
+#!/usr/bin/env bash
+#
+# Copyright (c) 2015 David Bremner
+#
+
+test_description='test of searching by thread-id'
+
+. ./test-lib.sh || exit 1
+
+add_email_corpus
+
+test_begin_subtest "Every message is found in exactly one thread"
+
+count=0
+success=0
+for id in $(notmuch search --output=messages '*'); do
+    count=$((count +1))
+    matches=$(notmuch search --output=threads "$id" | wc -l)
+    if [ "$matches" = 1 ]; then
+	success=$((success + 1))
+    fi
+done
+
+test_expect_equal "$count" "$success"
+
+test_begin_subtest "roundtripping message-ids via thread-ids"
+
+count=0
+success=0
+for id in $(notmuch search --output=messages '*'); do
+    count=$((count +1))
+    thread=$(notmuch search --output=threads "$id")
+    matched=$(notmuch search --output=messages "$thread" | grep "$id")
+    if [ "$matched" = "$id" ]; then
+	success=$((success + 1))
+    fi
+done
+
+test_expect_equal "$count" "$success"
+
+
+test_done
-- 
2.5.1

* Re: Possible some threads are not complete due to bug?
  2015-09-13  4:03 Possible some threads are not complete due to bug? Xu Wang
  2015-09-13  6:19 ` Suvayu Ali
  2015-09-19 15:45 ` [PATCH] test: add sanity tests for threading David Bremner
@ 2015-09-19 16:02 ` David Bremner
  2015-09-20 14:33   ` Xu Wang
  2 siblings, 1 reply; 11+ messages in thread
From: David Bremner @ 2015-09-19 16:02 UTC (permalink / raw)
  To: Xu Wang, notmuch

Xu Wang <xuwang762@gmail.com> writes:

> Sometimes I need to do:
>
> $ notmuch search --output=threads "id:MYMSGID"
> thread:000000000000a125
> $ notmuch search --output=messages "thread:000000000000a125"
>
> In theory, this should output the message that responded to message ID
> "MYMSGID". Sometimes it works. But sometimes it does not work. That
> is, there exists an email where I am sure (I checked the raw email)
> that there is a header
> In-Reply-To: <MYMSGID>
> but that email does not show when I do the two commands above.

I'm not 100% sure what you mean by "does not show". Do you mean that the
output from the second command is empty? or does not contain MYMSGID?

> Indeed, that mail belongs to a different thread ID.

How can you tell that the mail belongs to a different thread?
Isn't that what your first command tells you?

d

* Re: Possible some threads are not complete due to bug?
  2015-09-19 16:02 ` Possible some threads are not complete due to bug? David Bremner
@ 2015-09-20 14:33   ` Xu Wang
  2015-09-23 22:44     ` David Bremner
  0 siblings, 1 reply; 11+ messages in thread
From: Xu Wang @ 2015-09-20 14:33 UTC (permalink / raw)
  To: David Bremner; +Cc: notmuch

On Sat, Sep 19, 2015 at 12:02 PM, David Bremner <david@tethera.net> wrote:
> Xu Wang <xuwang762@gmail.com> writes:
>
>> Sometimes I need to do:
>>
>> $ notmuch search --output=threads "id:MYMSGID"
>> thread:000000000000a125
>> $ notmuch search --output=messages "thread:000000000000a125"
>>
>> In theory, this should output the message that responded to message ID
>> "MYMSGID". Sometimes it works. But sometimes it does not work. That
>> is, there exists an email where I am sure (I checked the raw email)
>> that there is a header
>> In-Reply-To: <MYMSGID>
>> but that email does not show when I do the two commands above.
>
> I'm not 100% sure what you mean by "does not show". Do you mean that the
> output from the second command is empty? or does not contain MYMSGID?

That was poor formatting on my part. It was only outputting the id I
put into the first command, so it showed only
id:MYMSGID

>> Indeed, that mail belongs to a different thread ID.
>
> How can you tell that the mail belongs to a different thread?

Because when I do a notmuch search for that message by message ID with
--output=threads, it gives a different thread ID.

> Isn't that what your first command tells you?
Well yes, but I wanted to confirm that that is indeed the reason. I
have heard someone say that debugging is the process of confirming
things that should be true.

Kind regards,

Xu

* Re: Possible some threads are not complete due to bug?
  2015-09-20 14:33   ` Xu Wang
@ 2015-09-23 22:44     ` David Bremner
  2015-09-24  0:55       ` Xu Wang
  0 siblings, 1 reply; 11+ messages in thread
From: David Bremner @ 2015-09-23 22:44 UTC (permalink / raw)
  To: Xu Wang; +Cc: notmuch

Xu Wang <xuwang762@gmail.com> writes:

> Because when I do a notmuch search for that message by message ID with
> --output=threads, it gives a different thread ID.
>
>> Isn't that what your first command tells you?
> Well yes, but I wanted to confirm that that is indeed the reason. I
> have heard in someone saying that debugging is process of confirming
> things that should be true.

I'm still confused. It sounds like you gave the same command twice and
got different answers. Can you maybe show the complete sequence of
commands and output (assuming it's not too large)?

d

* Re: Possible some threads are not complete due to bug?
  2015-09-23 22:44     ` David Bremner
@ 2015-09-24  0:55       ` Xu Wang
  2015-10-04 10:57         ` David Bremner
  0 siblings, 1 reply; 11+ messages in thread
From: Xu Wang @ 2015-09-24  0:55 UTC (permalink / raw)
  To: David Bremner; +Cc: notmuch

On Wed, Sep 23, 2015 at 6:44 PM, David Bremner <david@tethera.net> wrote:
> Xu Wang <xuwang762@gmail.com> writes:
>
>> Because when I do a notmuch search for that message by message ID with
>> --output=threads, it gives a different thread ID.
>>
>>> Isn't that what your first command tells you?
>> Well yes, but I wanted to confirm that that is indeed the reason. I
>> have heard in someone saying that debugging is process of confirming
>> things that should be true.
>
> I'm still confused. It sounds like you gave the same command twice and
> got different answers. Can you maybe show the complete sequence of
> commands and output (assuming it's not too large)?

Dear David, thank you for your kind persistence. I apologize for the
lack of clarity in my writing. Below is the full sequence of commands
(I have just substituted MYMSGID and MYMSGIDREPLY for the two message
IDs) as well as my thought process.

$ notmuch search --output=threads "id:MYMSGID"
thread:000000000000a125
$ notmuch search --output=messages "thread:000000000000a125"
id:MYMSGID
$
# I know that MYMSGIDREPLY did respond to that message. I have it in
# my mutt mailbox and it shows the down-right arrow signifying this. I
# inspect the headers and there is indeed a header in MYMSGIDREPLY that
# says "In-Reply-To: <MYMSGID>". I then do...
$ notmuch search --output=threads "id:MYMSGIDREPLY"
thread:000000000000c125
$ notmuch search --output=messages "thread:000000000000c125"
id:MYMSGIDREPLY
$

# What I expected (and did not get) was the following output:
$ notmuch search --output=threads "id:MYMSGID"
thread:000000000000a125
$ notmuch search --output=messages "thread:000000000000a125"
id:MYMSGID
id:MYMSGIDREPLY
$

# After regenerating the notmuch database, I did indeed get the output
# that I expected.
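
The comparison in the transcript above can be wrapped in a small check.
This is only a minimal sketch, assuming each message id resolves to
exactly one thread. The helper names thread_of and same_thread are
hypothetical.

```shell
# thread_of MSGID: print the thread id notmuch reports for one message id.
thread_of() {
    notmuch search --output=threads -- "id:$1"
}

# same_thread MSGID1 MSGID2: succeed only if both ids are indexed and
# notmuch reports the same thread for each of them.
same_thread() {
    t1=$(thread_of "$1")
    t2=$(thread_of "$2")
    [ -n "$t1" ] && [ "$t1" = "$t2" ]
}
```

With the database in the state shown above, same_thread MYMSGID
MYMSGIDREPLY would fail, because the two ids map to threads
000000000000a125 and 000000000000c125.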

Kind regards,

Xu

* Re: Possible some threads are not complete due to bug?
  2015-09-24  0:55       ` Xu Wang
@ 2015-10-04 10:57         ` David Bremner
  2015-10-05  3:52           ` Xu Wang
  0 siblings, 1 reply; 11+ messages in thread
From: David Bremner @ 2015-10-04 10:57 UTC (permalink / raw)
  To: Xu Wang; +Cc: notmuch

Xu Wang <xuwang762@gmail.com> writes:

>
> $ notmuch search --output=threads "id:MYMSGID"
> thread:000000000000a125
> $ notmuch search --output=messages "thread:000000000000a125"
> id:MYMSGID
> $
> # I know that MYMSGIDREPLY did respond to that message. I have it in
> my mutt mailbox and it shows the down-right arrow signifying this. I
> inspect the headers and there is indeed a header in MYMSGIDREPLY that
> says "In-Reply-To: <MYMSGID>". I then do...
> $ notmuch search --output=threads "id:MYMSGIDREPLY"
> thread:000000000000c125
> $ notmuch search --output=messages "thread:000000000000c125"
> id:MYMSGIDREPLY
> $

If the thread-ids are accurate, then it looks like the two messages are
not in the same thread according to notmuch (it's easy to be fooled
because the thread-ids are so similar).  I can't really explain how
those messages might have ended up in different threads. 

  - One potential issue is that if message ids are extra long or badly
    formed, then notmuch might make up a new message id. In that case
    your thread-id search wouldn't work at all.

  - If there are actually multiple (unrelated) files with message-id
    MYMSGIDREPLY, then the indexed one might not have the in-reply-to
    header. But in this case you could tell by

    notmuch show id:MYMSGIDREPLY

    and/or

    notmuch search --output=files id:MYMSGIDREPLY
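
The file check above can be extended to print the threading headers of
every file indexed under a message id. This is only a minimal sketch:
the helper name show_reply_headers is hypothetical, and folded
(multi-line) header values are shown only up to their first line.

```shell
# show_reply_headers QUERY: for each file matching QUERY, print the file
# name followed by its In-Reply-To and References header lines.
show_reply_headers() {
    notmuch search --output=files -- "$1" |
    while IFS= read -r file; do
        printf '== %s\n' "$file"
        # Headers end at the first blank line, so stop reading there.
        sed -n -e '/^$/q' \
               -e '/^[Ii]n-[Rr]eply-[Tt]o:/p' \
               -e '/^[Rr]eferences:/p' "$file"
    done
}
```

If show_reply_headers id:MYMSGIDREPLY lists more than one file, or a
file without an In-Reply-To line, that would point to the
duplicate-file case described above.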

In order for the thread-ids to change when you run "notmuch new", I
_think_ that a third message in the thread has to be discovered.

So it's a mystery. If it happens again with public messages, it would be
worth sharing the messages (as attachments) with the list, just in case
there is something in the headers that explains it.

* Re: Possible some threads are not complete due to bug?
  2015-10-04 10:57         ` David Bremner
@ 2015-10-05  3:52           ` Xu Wang
  0 siblings, 0 replies; 11+ messages in thread
From: Xu Wang @ 2015-10-05  3:52 UTC (permalink / raw)
  To: David Bremner; +Cc: notmuch

On Sun, Oct 4, 2015 at 6:57 AM, David Bremner <david@tethera.net> wrote:
> Xu Wang <xuwang762@gmail.com> writes:
>
>>
>> $ notmuch search --output=threads "id:MYMSGID"
>> thread:000000000000a125
>> $ notmuch search --output=messages "thread:000000000000a125"
>> id:MYMSGID
>> $
>> # I know that MYMSGIDREPLY did respond to that message. I have it in
>> my mutt mailbox and it shows the down-right arrow signifying this. I
>> inspect the headers and there is indeed a header in MYMSGIDREPLY that
>> says "In-Reply-To: <MYMSGID>". I then do...
>> $ notmuch search --output=threads "id:MYMSGIDREPLY"
>> thread:000000000000c125
>> $ notmuch search --output=messages "thread:000000000000c125"
>> id:MYMSGIDREPLY
>> $
>
> If the thread-id's are accurate, then it looks like the two messages are
> not in the same thread according to notmuch (it's easy to be fooled
> because the thread-ids are so similar).  I can't really explain how
> those messages might have ended up in different threads.
>
>   - One potential issue is that if message ids are extra long or badly
>   formed, then notmuch might make up a new message id. In that case your
>   thread-id search wouldn't work at all.
>
>   - If there are actually multiple (unrelated) files with message-id
>     MYMSGIDREPLY, then the indexed one might not have the in-reply-to
>     header. But in this case you could tell by
>
>     notmuch show id:MYMSGIDREPLY
>
>     and/or
>
>     notmuch search --output=files id:MYMSGIDREPLY
>
> In order for the thread-ids to change when you run "notmuch new", I
> _think_ that a third message in the thread has to be discovered.
>
> So it's a mystery. If it happens again with public messages, it would be
> worth sharing the messages (as attachments) with the list, just in case
> there is something in the headers that explains it.
>

OK, I will be careful to document any repeatable example I find and
share it with the list. I would like to help in any way I can.

Kind regards,

Xu

* Re: [PATCH] test: add sanity tests for threading
  2015-09-19 15:45 ` [PATCH] test: add sanity tests for threading David Bremner
@ 2015-11-23 12:54   ` David Bremner
  0 siblings, 0 replies; 11+ messages in thread
From: David Bremner @ 2015-11-23 12:54 UTC (permalink / raw)
  To: notmuch

David Bremner <david@tethera.net> writes:

> These tests are inspired by a problem report
>
>       id:CAJhTkNh7_hXDLsAGyD7nwkXV4ca6ymkLtFG945USvfqK4ZJEdQ@mail.gmail.com
>
> Of course I can't duplicate the mentioned problem, it probably depends
> on specific message data.

applied to master

d
