unofficial mirror of notmuch@notmuchmail.org
* loss of duplicate messages
@ 2010-02-05 16:31 Jameson Rollins
  2010-02-05 17:25 ` Marten Veldthuis
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Jameson Rollins @ 2010-02-05 16:31 UTC (permalink / raw)
  To: Notmuch Mail

[-- Attachment #1: Type: text/plain, Size: 2196 bytes --]

Hey, folks.  I'm noticing a somewhat problematic behavior of notmuch
that I was wondering if anyone could comment on.

I'm noticing that notmuch is either not syncing, or not returning in
searches, duplicate messages that have identical bodies but different
headers.  This comes up when I send messages to lists to which I am
subscribed.  I have copies of my sent mail saved locally, so I generally
have two copies of emails that I send to such lists: the one that I
sent, and the one that I receive back from the list.  Here's an example:

servo:~ 0$ notmuch search subject:"emacs paned UI"
thread:533da424197bb6ba61a42b667d5d8d8f   Wed. 14:12 [2/2] Tad Fisher, Jameson Rollins; [notmuch] Emacs paned UI ()
servo:~ 0$ notmuch count subject:"emacs paned UI"
2
servo:~ 0$ grep -r -i "emacs paned UI" .mail-notmuch/inbox .mail-notmuch/sent
.mail-notmuch/inbox/new/1265078715_3.20270.servo,U=249614,FMD5=7e33429f656f1e6e9d79b29c3f82c57e:2,:Subject: [notmuch] Emacs paned UI
.mail-notmuch/inbox/new/1265224417_0.1998.servo,U=250039,FMD5=7e33429f656f1e6e9d79b29c3f82c57e:2,:Subject: Re: [notmuch] Emacs paned UI
.mail-notmuch/sent/cur/1265224340.M66544P992Q1.servo:2,S:Subject: Re: [notmuch] Emacs paned UI
servo:~ 0$ 

As you can see, notmuch returns two messages matching the search term,
whereas a simple grep turns up three: the original message, my
response, and my response from the list, the latter two being exact
duplicates except for the headers.  The message that notmuch is
returning is the one in my sent mail, presumably because it showed up in
the index first, and the second identical one was just dropped.
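
A sent copy and the copy that comes back from a list normally carry the
same Message-ID header, and notmuch keys each indexed message on that
header, which is why the two files collapse into one result.  The
following is a minimal sketch, using only the Python standard library,
for listing which files on disk share a Message-ID.  The function names
group_by_message_id and duplicates are made up for this illustration and
are not part of notmuch.

```python
import os
from collections import defaultdict
from email.parser import BytesParser


def group_by_message_id(maildir_roots):
    """Map each Message-ID to the list of files that carry it.

    Hypothetical helper for illustration; it walks plain directories and
    does not consult the notmuch database at all.
    """
    groups = defaultdict(list)
    parser = BytesParser()
    for root in maildir_roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                with open(path, "rb") as fh:
                    # Only the headers are needed, so skip the body.
                    msg = parser.parse(fh, headersonly=True)
                message_id = msg.get("Message-ID")
                if message_id is None:
                    continue  # not a mail file, or the header is missing
                groups[str(message_id).strip()].append(path)
    return dict(groups)


def duplicates(groups):
    """Keep only the Message-IDs that appear in more than one file."""
    return {mid: sorted(paths) for mid, paths in groups.items() if len(paths) > 1}


# Example use (the paths are the ones from the grep above):
#   groups = group_by_message_id([".mail-notmuch/inbox", ".mail-notmuch/sent"])
#   for mid, paths in duplicates(groups).items():
#       print(mid, *paths, sep="\n  ")
```

Any Message-ID reported by duplicates() corresponds to a set of files
that a search would show as a single message.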

I'm not exactly sure what the correct behavior is here, but I would
actually like to see my messages sent to the list returned to me.  It's
a way of verifying that they did go to the list, as well as getting a
feeling for the round trip time.  I personally wouldn't mind just seeing
both copies of the message returned by notmuch, as I can just delete one
of them if I don't want it to turn up again.  Would this behavior be
problematic in any way?  Do folks have suggestions of other behaviors
that might get around this problem?

jamie.

[-- Attachment #2: Type: application/pgp-signature, Size: 835 bytes --]


* Re: loss of duplicate messages
  2010-02-05 16:31 loss of duplicate messages Jameson Rollins
@ 2010-02-05 17:25 ` Marten Veldthuis
  2010-02-05 17:49   ` Jameson Graef Rollins
  2010-02-24 21:10 ` Jameson Rollins
  2010-09-16  0:53 ` Rob Browning
  2 siblings, 1 reply; 9+ messages in thread
From: Marten Veldthuis @ 2010-02-05 17:25 UTC (permalink / raw)
  To: Jameson Rollins, Notmuch Mail

On Fri, 05 Feb 2010 11:31:34 -0500, Jameson Rollins <jrollins@finestructure.net> wrote:
> I'm noticing that notmuch is either not syncing, or not returning in
> searches, duplicate messages that have identical bodies but different
> headers.

This is indeed the correct behaviour of notmuch. There has been some
discussion on it in the past, I believe with proposals to track both
messages and show only one; but I don't think I've seen proponents of
showing both duplicate messages.

Personally I'd find it rather annoying if I'd see messages twice. But I
do see the value in being sure that your mail gets delivered through the
list. I believe the solution I've seen discussed was for notmuch to
somehow determine which of the duplicates holds the most information
(which would be the one through the list, not the one directly to you).

On the other hand, enough mailing lists destroy this behaviour, see:
the thread [notmuch] [PATCH (rebased)] Handle message renames in mail
spool, around id:87y6lir37f.fsf@vertex.dottedmag


-- 
- Marten


* Re: loss of duplicate messages
  2010-02-05 17:25 ` Marten Veldthuis
@ 2010-02-05 17:49   ` Jameson Graef Rollins
  2010-02-05 19:39     ` micah anderson
  0 siblings, 1 reply; 9+ messages in thread
From: Jameson Graef Rollins @ 2010-02-05 17:49 UTC (permalink / raw)
  To: Marten Veldthuis, Notmuch Mail

[-- Attachment #1: Type: text/plain, Size: 1382 bytes --]

On Fri, 05 Feb 2010 18:25:59 +0100, Marten Veldthuis <marten@veldthuis.com> wrote:
> This is indeed the correct behaviour of notmuch. There has been some
> discussion on it in the past, I believe with proposals to track both
> messages and show only one; but I don't think I've seen proponents of
> showing both duplicate messages.
> 
> Personally I'd find it rather annoying if I'd see messages twice. But I
> do see the value in being sure that your mail gets delivered through the
> list. I believe the solution I've seen discussed was for notmuch to
> somehow determine which of the duplicates holds the most information
> (which would be the one through the list, not the one directly to you).

Hey, Marten.  Thanks for the reply.

The problem I have with only returning one of the redundant messages is
that I don't think anyone could ever really convince me that notmuch has
the ability to decide which of the redundant messages is the *right* one
to return.  I think notmuch is currently just returning the first one it
indexes, but why is that better than returning the one most recently
indexed?

A policy of only returning one is going to be problematic for folks who
want or expect to see the other.  And in fact I think I want to see both.
I have both, and I've asked notmuch to index both, so why shouldn't it
return both in a search?

jamie.

[-- Attachment #2: Type: application/pgp-signature, Size: 835 bytes --]


* Re: loss of duplicate messages
  2010-02-05 17:49   ` Jameson Graef Rollins
@ 2010-02-05 19:39     ` micah anderson
  2010-02-05 19:47       ` Jameson Rollins
  0 siblings, 1 reply; 9+ messages in thread
From: micah anderson @ 2010-02-05 19:39 UTC (permalink / raw)
  To: Jameson Graef Rollins, Marten Veldthuis, Notmuch Mail

[-- Attachment #1: Type: text/plain, Size: 965 bytes --]

On Fri, 05 Feb 2010 12:49:21 -0500, Jameson Graef Rollins <jrollins@finestructure.net> wrote: 
> A policy of only returning one is going to be problematic for folks who
> want or expect to see the other.  And in fact I think I want to see both.
> I have both, and I've asked notmuch to index both, so why shouldn't it
> return both in a search?

Welcome to how gmail does it. When they first hit the scene, as an
operator of a large mailing list service, I was *constantly* being
bugged with support issues from people who were expecting this very
behavior, "I sent a message to the list, but I never got it, did it get
posted to the list?!". Soon I found out that gmail did exactly what
you are reporting notmuch as doing.

The frightening thing is that over the last few years of gmail's
existence, those complaints and support issues have totally gone
away. Does that mean that gmail has trained people to no longer expect
this behavior?

micah

[-- Attachment #2: Type: application/pgp-signature, Size: 835 bytes --]


* Re: loss of duplicate messages
  2010-02-05 19:39     ` micah anderson
@ 2010-02-05 19:47       ` Jameson Rollins
  2010-02-05 21:26         ` micah anderson
  0 siblings, 1 reply; 9+ messages in thread
From: Jameson Rollins @ 2010-02-05 19:47 UTC (permalink / raw)
  To: micah anderson, Marten Veldthuis, Notmuch Mail

[-- Attachment #1: Type: text/plain, Size: 179 bytes --]

On Fri, 05 Feb 2010 14:39:22 -0500, micah anderson <micah@riseup.net> wrote:
> Welcome to how gmail does it.

Why welcome me to "how gmail does it"?!  I never wanted to go there!

[-- Attachment #2: Type: application/pgp-signature, Size: 835 bytes --]


* Re: loss of duplicate messages
  2010-02-05 19:47       ` Jameson Rollins
@ 2010-02-05 21:26         ` micah anderson
  0 siblings, 0 replies; 9+ messages in thread
From: micah anderson @ 2010-02-05 21:26 UTC (permalink / raw)
  To: Jameson Rollins, Marten Veldthuis, Notmuch Mail

[-- Attachment #1: Type: text/plain, Size: 1057 bytes --]

On Fri, 05 Feb 2010 14:47:03 -0500, Jameson Rollins <jrollins@finestructure.net> wrote:
> On Fri, 05 Feb 2010 14:39:22 -0500, micah anderson <micah@riseup.net> wrote:
> > Welcome to how gmail does it.
> 
> Why welcome me to "how gmail does it"?!  I never wanted to go there!

Because the perception of the internet by the third largest mail provider
on the internet, with over 146 million users[0], is not something trivial
that can be ignored.

Maybe I should have said, "welcome to the place where you do not wish to
tread, the place where 1/3rd of the people using the internet have their
perceptions shaped.  Be careful, there is a lot of stuff here you do not
want to step in!"

micah


0. Arrington, Michael (2009-07-09). "Bing Comes to Hotmail". Techcrunch. http://www.techcrunch.com/2009/07/09/bing-comes-to-hotmail/. Retrieved 2009-07-11. "Hotmail is still by far the largest web mail provider on the Internet, with 343 million monthly users according to Comscore. Second and third are Yahoo (285 million) and Gmail (146 million)." 

[-- Attachment #2: Type: application/pgp-signature, Size: 835 bytes --]


* Re: loss of duplicate messages
  2010-02-05 16:31 loss of duplicate messages Jameson Rollins
  2010-02-05 17:25 ` Marten Veldthuis
@ 2010-02-24 21:10 ` Jameson Rollins
  2010-09-16  0:53 ` Rob Browning
  2 siblings, 0 replies; 9+ messages in thread
From: Jameson Rollins @ 2010-02-24 21:10 UTC (permalink / raw)
  To: Notmuch Mail

[-- Attachment #1: Type: text/plain, Size: 838 bytes --]

Hi, folks.  I'm continuing to find it problematic that duplicate
messages appear to be indexed only once.  I'm wondering what the
possible solutions to this issue are, if any.

For instance, I sent a message that was bcc'd to a long list of people,
including myself.  All of my sent mail is fcc'd to a local directory.
The index of my original sent message was almost immediately trumped by
the index of the message I then received back through the mail.  Since
the received message trumped the fcc'd message, I had a difficult time
finding the bcc list, since I had to go outside of notmuch.

Is there no way to have notmuch return all copies of messages matching
a search?  I don't think this happens frequently,
so I don't think it would get in the way, but it's certainly useful for
me to have it.

jamie.

[-- Attachment #2: Type: application/pgp-signature, Size: 835 bytes --]


* Re: loss of duplicate messages
  2010-02-05 16:31 loss of duplicate messages Jameson Rollins
  2010-02-05 17:25 ` Marten Veldthuis
  2010-02-24 21:10 ` Jameson Rollins
@ 2010-09-16  0:53 ` Rob Browning
  2010-11-12  1:31   ` Carl Worth
  2 siblings, 1 reply; 9+ messages in thread
From: Rob Browning @ 2010-09-16  0:53 UTC (permalink / raw)
  To: notmuch

jrollins at finestructure.net (Jameson Rollins) writes:

> I'm not exactly sure what the correct behavior is here, but I would
> actually like to see my messages sent to the list returned to me.
> It's a way of verifying that they did go to the list, as well as
> getting a feeling for the round trip time.  I personally wouldn't mind
> just seeing both copies of the message returned by notmuch, as I can
> just delete one of them if I don't want it to turn up again.  Would
> this behavior be problematic in any way?  Do folks have suggestions of
> other behaviors that might get around this problem?

I'm not sure what the current plan is, but please consider this a
belated agreement.  It doesn't necessarily need to be the default (and
perhaps shouldn't be), but I'd like to have some way to ask notmuch for
*all* matching messages (regardless of message id) -- perhaps via an
--include-duplicates argument to search/show/count, etc.

Thanks
-- 
Rob Browning
rlb @defaultvalue.org and @debian.org
GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4


* Re: loss of duplicate messages
  2010-09-16  0:53 ` Rob Browning
@ 2010-11-12  1:31   ` Carl Worth
  0 siblings, 0 replies; 9+ messages in thread
From: Carl Worth @ 2010-11-12  1:31 UTC (permalink / raw)
  To: Rob Browning, notmuch

[-- Attachment #1: Type: text/plain, Size: 629 bytes --]

On Wed, 15 Sep 2010 19:53:41 -0500, Rob Browning <rlb@defaultvalue.org> wrote:
> I'm not sure what the current plan is, but please consider this a
> belated agreement.  It doesn't necessarily need to be the default (and
> perhaps shouldn't be), but I'd like to have some way to ask notmuch for
> *all* matching messages (regardless of message id) -- perhaps via an
> --include-duplicates argument to search/show/count, etc.

As mentioned recently, the new notmuch_message_get_filenames function
will make it quite easy to implement the above.
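
As a rough illustration of the idea (not notmuch's actual
implementation, and not a call into libnotmuch), the sketch below models
an index that stores one entry per Message-ID but remembers every
filename that carried it, so a search can optionally expand each match
into all of its files.  ToyIndex, add_file, and the include_duplicates
parameter are hypothetical names chosen to mirror the proposal above.
Returning the first indexed file by default is also an assumption,
based on the guess made earlier in this thread.

```python
from dataclasses import dataclass, field


@dataclass
class IndexedMessage:
    """One logical message, possibly backed by several files."""
    message_id: str
    subject: str
    filenames: list = field(default_factory=list)


class ToyIndex:
    """Hypothetical in-memory stand-in for a Message-ID keyed index."""

    def __init__(self):
        self._by_id = {}

    def add_file(self, message_id, subject, filename):
        # A second file with a known Message-ID adds no new message;
        # it only adds another filename to the existing entry.
        msg = self._by_id.get(message_id)
        if msg is None:
            msg = IndexedMessage(message_id, subject)
            self._by_id[message_id] = msg
        msg.filenames.append(filename)

    def search(self, term, include_duplicates=False):
        """Return filenames of messages whose subject contains term."""
        results = []
        for msg in self._by_id.values():
            if term.lower() not in msg.subject.lower():
                continue
            if include_duplicates:
                results.extend(msg.filenames)      # every copy on disk
            else:
                results.append(msg.filenames[0])   # first indexed copy only
        return results
```

With the three files from the start of this thread loaded in the order
inbox original, sent reply, list reply, the default search gives two
filenames and the include_duplicates=True search gives three.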

Any contribution will be welcome.

-Carl

-- 
carl.d.worth@intel.com

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]


