From: Austin Clements <amdragon@MIT.EDU>
To: Jani Nikula <jani@nikula.org>
Cc: notmuch@notmuchmail.org
Subject: Re: Automatic suppression of non-duplicate messages
Date: Sun, 4 Nov 2012 23:28:12 -0500
Message-ID: <20121105042749.GT15377@mit.edu>
In-Reply-To: <87390pf14v.fsf@nikula.org>

Quoth Jani Nikula on Nov 05 at 12:34 am:
> On Sat, 03 Nov 2012, David Bremner <david@tethera.net> wrote:
> > Eirik Byrkjeflot Anonsen <eirik@eirikba.org> writes:
> >
> >> That's not what I see.  If I search for a term that only appears in
> >> one of the "copies", none of the copies are included in the search
> >> result.
> >
> > The offending code is at line 1813 of lib/database.cc; the message is
> > only indexed if the message-id is new.
> >
> > It might be sensible to move _notmuch_message_index_file into the other
> > branch of the if, but even if that works fine, something more
> > sophisticated is needed for the call to
> > __notmuch_message_set_header_values; the invariant that each message has
> > a single subject seems reasonable.
> >
> > Offhand I'm not sure of a good method of automatically deciding what is
> > the same message (with e.g. headers and footer text added by a mailing
> > list).
> 
> Assuming there was a good method, what would you do with two different
> messages that have the same message id? That is the unique id we use to
> identify messages (which should be fine per RFC 5322 and its
> predecessors; we're talking about messages from broken systems here).
> 
> It might be helpful to have a configuration option similar to new.tags
> that would define the tags to be assigned to messages with duplicate
> message ids. (This could be done in the
> NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID case near line 516 of
> notmuch-new.c). This could be used to assign a "dupe" tag, for example,
> so the user could do whatever they want in the post-new hook or the user
> interface. A sufficiently clever post-new hook could compare the files
> of a message, and drop the tag or add another, as the case may
> be. Surely not a perfect solution, but keeps the implementation simple.

This would also trigger on message flag changes and folder moves
performed outside of notmuch, since notmuch sees those as a duplicate
message ID followed by a deletion.  The only way to do something for
every received message even if it has the same message ID as an
existing message is to do it in whatever delivers mail.  Currently, we
don't have a good story for integrating on-delivery operations with
notmuch.
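
To illustrate the point about flag changes, here is a minimal sketch of
why a rename looks like a duplicate. It is a toy model, not notmuch
code: ToyIndex, scan, the Status enum, and the file names are all
hypothetical, and it assumes (as described above) that a scan handles
newly seen files before it handles files that have disappeared.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = "success"
    DUPLICATE_MESSAGE_ID = "duplicate-message-id"


class ToyIndex:
    """Hypothetical stand-in for the database: message ID -> set of file paths."""

    def __init__(self):
        self.files_by_id = {}

    def add_file(self, message_id, path):
        # A message ID that is already indexed is reported as a duplicate,
        # regardless of why a second file carrying it showed up.
        known = message_id in self.files_by_id
        self.files_by_id.setdefault(message_id, set()).add(path)
        return Status.DUPLICATE_MESSAGE_ID if known else Status.SUCCESS

    def remove_file(self, message_id, path):
        paths = self.files_by_id[message_id]
        paths.discard(path)
        if not paths:
            del self.files_by_id[message_id]


def scan(index, on_disk, previously_seen):
    """Assumed ordering: additions first, then deletions.

    Both arguments map file path -> message ID. Returns (path, status)
    for each newly seen file.
    """
    statuses = []
    for path in sorted(set(on_disk) - set(previously_seen)):
        statuses.append((path, index.add_file(on_disk[path], path)))
    for path in sorted(set(previously_seen) - set(on_disk)):
        index.remove_file(previously_seen[path], path)
    return statuses


index = ToyIndex()

# First scan: one freshly delivered message.
before = {"new/1352089692.host": "<msg-1@example.org>"}
first = scan(index, before, {})

# Another mail client marks it as seen, which renames the file.
after = {"cur/1352089692.host:2,S": "<msg-1@example.org>"}
second = scan(index, after, before)

print(first)
print(second)
```

In this model the second scan reports a duplicate message ID for the
renamed file even though no new mail arrived, so a tag assigned in that
case could not distinguish a real duplicate from a flag change.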

